2023-04-12 06:21:55

by Arseniy Krasnov

[permalink] [raw]
Subject: [PATCH v1 0/5] refactoring and fix for Meson NAND

Hello,

this patchset adds one fix and several refactoring patches. First patch
is the most important - it fixes unstable behaviour of Meson driver, for
example random ECC errors during reads. I've tested this with mount/
unmount/read/write cases for JFFS2 and nanddump/nandwrite utlities on
AXG family (A113X SoC). Source of this update is old vendor's driver.

Other patches are refactoring and extra checks, not critical for this
driver's reliability.

Arseniy Krasnov (5):
mtd: rawnand: meson: fix NAND access for read/write
mtd: rawnand: meson: replace GENMASK() macro with define
mtd: rawnand: meson: check buffer length
mtd: rawnand: meson: clear OOB buffer before read
mtd: rawnand: meson: remove unneeded bitwise OR with zeroes

drivers/mtd/nand/raw/meson_nand.c | 147 ++++++++++++++++++++++++++----
1 file changed, 127 insertions(+), 20 deletions(-)

--
2.35.0


2023-04-12 06:22:01

by Arseniy Krasnov

[permalink] [raw]
Subject: [PATCH v1 2/5] mtd: rawnand: meson: replace GENMASK() macro with define

Signed-off-by: Arseniy Krasnov <[email protected]>
---
drivers/mtd/nand/raw/meson_nand.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
index 256c37c76526..45b53d420aed 100644
--- a/drivers/mtd/nand/raw/meson_nand.c
+++ b/drivers/mtd/nand/raw/meson_nand.c
@@ -110,6 +110,8 @@

#define PER_INFO_BYTE 8

+#define NFC_CMD_RAW_LEN (GENMASK(13, 0))
+
struct meson_nfc_nand_chip {
struct list_head node;
struct nand_chip nand;
@@ -300,7 +302,7 @@ static void meson_nfc_cmd_access(struct nand_chip *nand, int raw, bool dir,

if (raw) {
len = mtd->writesize + mtd->oobsize;
- cmd = (len & GENMASK(13, 0)) | scrambler | DMA_DIR(dir);
+ cmd = (len & NFC_CMD_RAW_LEN) | scrambler | DMA_DIR(dir);
writel(cmd, nfc->reg_base + NFC_REG_CMD);
return;
}
@@ -550,7 +552,7 @@ static int meson_nfc_read_buf(struct nand_chip *nand, u8 *buf, int len)
if (ret)
goto out;

- cmd = NFC_CMD_N2M | (len & GENMASK(13, 0));
+ cmd = NFC_CMD_N2M | (len & NFC_CMD_RAW_LEN);
writel(cmd, nfc->reg_base + NFC_REG_CMD);

meson_nfc_drain_cmd(nfc);
@@ -574,7 +576,7 @@ static int meson_nfc_write_buf(struct nand_chip *nand, u8 *buf, int len)
if (ret)
return ret;

- cmd = NFC_CMD_M2N | (len & GENMASK(13, 0));
+ cmd = NFC_CMD_M2N | (len & NFC_CMD_RAW_LEN);
writel(cmd, nfc->reg_base + NFC_REG_CMD);

meson_nfc_drain_cmd(nfc);
--
2.35.0

2023-04-12 06:25:51

by Arseniy Krasnov

[permalink] [raw]
Subject: [PATCH v1 5/5] mtd: rawnand: meson: remove unneeded bitwise OR with zeroes

Both operations have no effect.

Signed-off-by: Arseniy Krasnov <[email protected]>
---
drivers/mtd/nand/raw/meson_nand.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
index f2f2472cb511..f0486aba5f41 100644
--- a/drivers/mtd/nand/raw/meson_nand.c
+++ b/drivers/mtd/nand/raw/meson_nand.c
@@ -609,12 +609,12 @@ static int meson_nfc_rw_cmd_prepare_and_execute(struct nand_chip *nand,
cmd0 = in ? NAND_CMD_READ0 : NAND_CMD_SEQIN;
nfc->cmdfifo.rw.cmd0 = cs | NFC_CMD_CLE | cmd0;

- addrs[0] = cs | NFC_CMD_ALE | 0;
+ addrs[0] = cs | NFC_CMD_ALE;
if (mtd->writesize <= 512) {
cmd_num--;
row_start = 1;
} else {
- addrs[1] = cs | NFC_CMD_ALE | 0;
+ addrs[1] = cs | NFC_CMD_ALE;
row_start = 2;
}

--
2.35.0

2023-04-12 06:28:33

by Arseniy Krasnov

[permalink] [raw]
Subject: [PATCH v1 3/5] mtd: rawnand: meson: check buffer length

With this chip, buffer length is limited by hardware, so check it before
command execution to avoid length trim.

Signed-off-by: Arseniy Krasnov <[email protected]>
---
drivers/mtd/nand/raw/meson_nand.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
index 45b53d420aed..f84a10238e4d 100644
--- a/drivers/mtd/nand/raw/meson_nand.c
+++ b/drivers/mtd/nand/raw/meson_nand.c
@@ -543,6 +543,9 @@ static int meson_nfc_read_buf(struct nand_chip *nand, u8 *buf, int len)
u32 cmd;
u8 *info;

+ if (len > NFC_CMD_RAW_LEN)
+ return -EINVAL;
+
info = kzalloc(PER_INFO_BYTE, GFP_KERNEL);
if (!info)
return -ENOMEM;
@@ -571,6 +574,9 @@ static int meson_nfc_write_buf(struct nand_chip *nand, u8 *buf, int len)
int ret = 0;
u32 cmd;

+ if (len > NFC_CMD_RAW_LEN)
+ return -EINVAL;
+
ret = meson_nfc_dma_buffer_setup(nand, buf, len, NULL,
0, DMA_TO_DEVICE);
if (ret)
@@ -1269,6 +1275,7 @@ static int meson_nand_attach_chip(struct nand_chip *nand)
struct meson_nfc_nand_chip *meson_chip = to_meson_nand(nand);
struct mtd_info *mtd = nand_to_mtd(nand);
int nsectors = mtd->writesize / 1024;
+ int raw_writesize;
int ret;

if (!mtd->name) {
@@ -1280,6 +1287,13 @@ static int meson_nand_attach_chip(struct nand_chip *nand)
return -ENOMEM;
}

+ raw_writesize = mtd->writesize + mtd->oobsize;
+ if (raw_writesize > NFC_CMD_RAW_LEN) {
+ dev_err(nfc->dev, "too big write size in raw mode: %d > %ld\n",
+ raw_writesize, NFC_CMD_RAW_LEN);
+ return -EINVAL;
+ }
+
if (nand->bbt_options & NAND_BBT_USE_FLASH)
nand->bbt_options |= NAND_BBT_NO_OOB;

--
2.35.0

2023-04-12 06:31:50

by Arseniy Krasnov

[permalink] [raw]
Subject: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

This NAND reads only few user's bytes in ECC mode (not full OOB), so
fill OOB buffer with zeroes to not return garbage from previous reads
to user.
Otherwise 'nanddump' utility prints something like this for just erased
page:

...
0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e

Signed-off-by: Arseniy Krasnov <[email protected]>
---
drivers/mtd/nand/raw/meson_nand.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
index f84a10238e4d..f2f2472cb511 100644
--- a/drivers/mtd/nand/raw/meson_nand.c
+++ b/drivers/mtd/nand/raw/meson_nand.c
@@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
int oob_required, int page)
{
+ struct mtd_info *mtd = nand_to_mtd(nand);
u8 *oob_buf = nand->oob_poi;
int ret;

+ memset(oob_buf, 0, mtd->oobsize);
+
ret = meson_nfc_read_page_sub(nand, page, 1);
if (ret)
return ret;
@@ -881,6 +884,8 @@ static int meson_nfc_read_page_hwecc(struct nand_chip *nand, u8 *buf,
u8 *oob_buf = nand->oob_poi;
int ret, i;

+ memset(oob_buf, 0, mtd->oobsize);
+
ret = meson_nfc_read_page_sub(nand, page, 0);
if (ret)
return ret;
--
2.35.0

2023-04-12 07:43:29

by Neil Armstrong

[permalink] [raw]
Subject: Re: [PATCH v1 2/5] mtd: rawnand: meson: replace GENMASK() macro with define

Hi,

On 12/04/2023 08:16, Arseniy Krasnov wrote:

Please add a commit message

> Signed-off-by: Arseniy Krasnov <[email protected]>
> ---
> drivers/mtd/nand/raw/meson_nand.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> index 256c37c76526..45b53d420aed 100644
> --- a/drivers/mtd/nand/raw/meson_nand.c
> +++ b/drivers/mtd/nand/raw/meson_nand.c
> @@ -110,6 +110,8 @@
>
> #define PER_INFO_BYTE 8
>
> +#define NFC_CMD_RAW_LEN (GENMASK(13, 0))

Drop the () around GENMASK(13, 0)

> +
> struct meson_nfc_nand_chip {
> struct list_head node;
> struct nand_chip nand;
> @@ -300,7 +302,7 @@ static void meson_nfc_cmd_access(struct nand_chip *nand, int raw, bool dir,
>
> if (raw) {
> len = mtd->writesize + mtd->oobsize;
> - cmd = (len & GENMASK(13, 0)) | scrambler | DMA_DIR(dir);
> + cmd = (len & NFC_CMD_RAW_LEN) | scrambler | DMA_DIR(dir);
> writel(cmd, nfc->reg_base + NFC_REG_CMD);
> return;
> }
> @@ -550,7 +552,7 @@ static int meson_nfc_read_buf(struct nand_chip *nand, u8 *buf, int len)
> if (ret)
> goto out;
>
> - cmd = NFC_CMD_N2M | (len & GENMASK(13, 0));
> + cmd = NFC_CMD_N2M | (len & NFC_CMD_RAW_LEN);
> writel(cmd, nfc->reg_base + NFC_REG_CMD);
>
> meson_nfc_drain_cmd(nfc);
> @@ -574,7 +576,7 @@ static int meson_nfc_write_buf(struct nand_chip *nand, u8 *buf, int len)
> if (ret)
> return ret;
>
> - cmd = NFC_CMD_M2N | (len & GENMASK(13, 0));
> + cmd = NFC_CMD_M2N | (len & NFC_CMD_RAW_LEN);
> writel(cmd, nfc->reg_base + NFC_REG_CMD);
>
> meson_nfc_drain_cmd(nfc);

Thanks,
Neil

2023-04-12 07:44:05

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 3/5] mtd: rawnand: meson: check buffer length

Hi Arseniy,

[email protected] wrote on Wed, 12 Apr 2023 09:16:57 +0300:

> With this chip, buffer length is limited by hardware, so check it before
> command execution to avoid length trim.

What is "this chip"? Are you talking about the controller itself? Is
there any specific SoC version targeted?

>
> Signed-off-by: Arseniy Krasnov <[email protected]>
> ---
> drivers/mtd/nand/raw/meson_nand.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> index 45b53d420aed..f84a10238e4d 100644
> --- a/drivers/mtd/nand/raw/meson_nand.c
> +++ b/drivers/mtd/nand/raw/meson_nand.c
> @@ -543,6 +543,9 @@ static int meson_nfc_read_buf(struct nand_chip *nand, u8 *buf, int len)
> u32 cmd;
> u8 *info;
>
> + if (len > NFC_CMD_RAW_LEN)
> + return -EINVAL;
> +
> info = kzalloc(PER_INFO_BYTE, GFP_KERNEL);
> if (!info)
> return -ENOMEM;
> @@ -571,6 +574,9 @@ static int meson_nfc_write_buf(struct nand_chip *nand, u8 *buf, int len)
> int ret = 0;
> u32 cmd;
>
> + if (len > NFC_CMD_RAW_LEN)
> + return -EINVAL;
> +
> ret = meson_nfc_dma_buffer_setup(nand, buf, len, NULL,
> 0, DMA_TO_DEVICE);
> if (ret)
> @@ -1269,6 +1275,7 @@ static int meson_nand_attach_chip(struct nand_chip *nand)
> struct meson_nfc_nand_chip *meson_chip = to_meson_nand(nand);
> struct mtd_info *mtd = nand_to_mtd(nand);
> int nsectors = mtd->writesize / 1024;
> + int raw_writesize;
> int ret;
>
> if (!mtd->name) {
> @@ -1280,6 +1287,13 @@ static int meson_nand_attach_chip(struct nand_chip *nand)
> return -ENOMEM;
> }
>
> + raw_writesize = mtd->writesize + mtd->oobsize;
> + if (raw_writesize > NFC_CMD_RAW_LEN) {
> + dev_err(nfc->dev, "too big write size in raw mode: %d > %ld\n",
> + raw_writesize, NFC_CMD_RAW_LEN);
> + return -EINVAL;
> + }
> +
> if (nand->bbt_options & NAND_BBT_USE_FLASH)
> nand->bbt_options |= NAND_BBT_NO_OOB;
>


Thanks,
Miquèl

2023-04-12 07:47:02

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

[email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:

> This NAND reads only few user's bytes in ECC mode (not full OOB), so

"This NAND reads" does not look right, do you mean "Subpage reads do
not retrieve all the OOB bytes,"?

> fill OOB buffer with zeroes to not return garbage from previous reads
> to user.
> Otherwise 'nanddump' utility prints something like this for just erased
> page:
>
> ...
> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
>
> Signed-off-by: Arseniy Krasnov <[email protected]>
> ---
> drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> index f84a10238e4d..f2f2472cb511 100644
> --- a/drivers/mtd/nand/raw/meson_nand.c
> +++ b/drivers/mtd/nand/raw/meson_nand.c
> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> int oob_required, int page)
> {
> + struct mtd_info *mtd = nand_to_mtd(nand);
> u8 *oob_buf = nand->oob_poi;
> int ret;
>
> + memset(oob_buf, 0, mtd->oobsize);

I'm surprised raw reads do not read the entire OOB?

> +
> ret = meson_nfc_read_page_sub(nand, page, 1);
> if (ret)
> return ret;
> @@ -881,6 +884,8 @@ static int meson_nfc_read_page_hwecc(struct nand_chip *nand, u8 *buf,
> u8 *oob_buf = nand->oob_poi;
> int ret, i;
>
> + memset(oob_buf, 0, mtd->oobsize);
> +
> ret = meson_nfc_read_page_sub(nand, page, 0);
> if (ret)
> return ret;


Thanks,
Miquèl

2023-04-12 07:47:36

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 0/5] refactoring and fix for Meson NAND

Hi Arseniy,

[email protected] wrote on Wed, 12 Apr 2023 09:16:54 +0300:

> Hello,
>
> this patchset adds one fix and several refactoring patches. First patch
> is the most important - it fixes unstable behaviour of Meson driver, for
> example random ECC errors during reads. I've tested this with mount/
> unmount/read/write cases for JFFS2 and nanddump/nandwrite utlities on
> AXG family (A113X SoC). Source of this update is old vendor's driver.
>
> Other patches are refactoring and extra checks, not critical for this
> driver's reliability.

Thanks for the series, minor comments inside, if you respin before the
end of the week I'll take this for the next merge window.

>
> Arseniy Krasnov (5):
> mtd: rawnand: meson: fix NAND access for read/write
> mtd: rawnand: meson: replace GENMASK() macro with define
> mtd: rawnand: meson: check buffer length
> mtd: rawnand: meson: clear OOB buffer before read
> mtd: rawnand: meson: remove unneeded bitwise OR with zeroes
>
> drivers/mtd/nand/raw/meson_nand.c | 147 ++++++++++++++++++++++++++----
> 1 file changed, 127 insertions(+), 20 deletions(-)
>


Thanks,
Miquèl

2023-04-12 07:48:14

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi again,

[email protected] wrote on Wed, 12 Apr 2023 09:44:00 +0200:

> Hi Arseniy,
>
> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
>
> > This NAND reads only few user's bytes in ECC mode (not full OOB), so
>
> "This NAND reads" does not look right, do you mean "Subpage reads do
> not retrieve all the OOB bytes,"?
>
> > fill OOB buffer with zeroes to not return garbage from previous reads
> > to user.
> > Otherwise 'nanddump' utility prints something like this for just erased
> > page:
> >
> > ...
> > 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> > OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> > OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> > OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
> >
> > Signed-off-by: Arseniy Krasnov <[email protected]>
> > ---
> > drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> > index f84a10238e4d..f2f2472cb511 100644
> > --- a/drivers/mtd/nand/raw/meson_nand.c
> > +++ b/drivers/mtd/nand/raw/meson_nand.c
> > @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> > static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> > int oob_required, int page)
> > {
> > + struct mtd_info *mtd = nand_to_mtd(nand);
> > u8 *oob_buf = nand->oob_poi;
> > int ret;
> >
> > + memset(oob_buf, 0, mtd->oobsize);

Should use 0xff instead of 0x00, that's the default state of a NAND
cell.

And also this memset should be conditioned to 'oob_required' I
guess?

>
> I'm surprised raw reads do not read the entire OOB?
>
> > +
> > ret = meson_nfc_read_page_sub(nand, page, 1);
> > if (ret)
> > return ret;
> > @@ -881,6 +884,8 @@ static int meson_nfc_read_page_hwecc(struct nand_chip *nand, u8 *buf,
> > u8 *oob_buf = nand->oob_poi;
> > int ret, i;
> >
> > + memset(oob_buf, 0, mtd->oobsize);
> > +
> > ret = meson_nfc_read_page_sub(nand, page, 0);
> > if (ret)
> > return ret;
>
>
> Thanks,
> Miquèl


Thanks,
Miquèl

2023-04-12 09:27:15

by Arseniy Krasnov

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read



On 12.04.2023 10:44, Miquel Raynal wrote:
> Hi Arseniy,
>
> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
>
>> This NAND reads only few user's bytes in ECC mode (not full OOB), so
>
> "This NAND reads" does not look right, do you mean "Subpage reads do
> not retrieve all the OOB bytes,"?
>
>> fill OOB buffer with zeroes to not return garbage from previous reads
>> to user.
>> Otherwise 'nanddump' utility prints something like this for just erased
>> page:
>>
>> ...
>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>> OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
>> OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
>> OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
>> OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
>>
>> Signed-off-by: Arseniy Krasnov <[email protected]>
>> ---
>> drivers/mtd/nand/raw/meson_nand.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
>> index f84a10238e4d..f2f2472cb511 100644
>> --- a/drivers/mtd/nand/raw/meson_nand.c
>> +++ b/drivers/mtd/nand/raw/meson_nand.c
>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
>> static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
>> int oob_required, int page)
>> {
>> + struct mtd_info *mtd = nand_to_mtd(nand);
>> u8 *oob_buf = nand->oob_poi;
>> int ret;
>>
>> + memset(oob_buf, 0, mtd->oobsize);
>
> I'm surprised raw reads do not read the entire OOB?

Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
still depends on ECC parameters: for each portion of data covered with ECC code we can
read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.

Thanks, Arseniy

>
>> +
>> ret = meson_nfc_read_page_sub(nand, page, 1);
>> if (ret)
>> return ret;
>> @@ -881,6 +884,8 @@ static int meson_nfc_read_page_hwecc(struct nand_chip *nand, u8 *buf,
>> u8 *oob_buf = nand->oob_poi;
>> int ret, i;
>>
>> + memset(oob_buf, 0, mtd->oobsize);
>> +
>> ret = meson_nfc_read_page_sub(nand, page, 0);
>> if (ret)
>> return ret;
>
>
> Thanks,
> Miquèl

2023-04-12 09:37:49

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

[email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:

> On 12.04.2023 10:44, Miquel Raynal wrote:
> > Hi Arseniy,
> >
> > [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
> >
> >> This NAND reads only few user's bytes in ECC mode (not full OOB), so
> >
> > "This NAND reads" does not look right, do you mean "Subpage reads do
> > not retrieve all the OOB bytes,"?
> >
> >> fill OOB buffer with zeroes to not return garbage from previous reads
> >> to user.
> >> Otherwise 'nanddump' utility prints something like this for just erased
> >> page:
> >>
> >> ...
> >> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> >> OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> >> OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> >> OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> >> OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
> >>
> >> Signed-off-by: Arseniy Krasnov <[email protected]>
> >> ---
> >> drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> >> 1 file changed, 5 insertions(+)
> >>
> >> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> >> index f84a10238e4d..f2f2472cb511 100644
> >> --- a/drivers/mtd/nand/raw/meson_nand.c
> >> +++ b/drivers/mtd/nand/raw/meson_nand.c
> >> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> >> static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> >> int oob_required, int page)
> >> {
> >> + struct mtd_info *mtd = nand_to_mtd(nand);
> >> u8 *oob_buf = nand->oob_poi;
> >> int ret;
> >>
> >> + memset(oob_buf, 0, mtd->oobsize);
> >
> > I'm surprised raw reads do not read the entire OOB?
>
> Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
> still depends on ECC parameters: for each portion of data covered with ECC code we can
> read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
> printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
> For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
> contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.

In all modes, when you read OOB, you should get the full OOB. The fact
that ECC correction is enabled or disabled does not matter. If the NAND
features OOB sections of 64 bytes, you should get the 64 bytes.

What happens sometimes, is that some of the bytes are not protected
against bitflips, but the policy is to return the full buffer.

>
> Thanks, Arseniy
>
> >
> >> +
> >> ret = meson_nfc_read_page_sub(nand, page, 1);
> >> if (ret)
> >> return ret;
> >> @@ -881,6 +884,8 @@ static int meson_nfc_read_page_hwecc(struct nand_chip *nand, u8 *buf,
> >> u8 *oob_buf = nand->oob_poi;
> >> int ret, i;
> >>
> >> + memset(oob_buf, 0, mtd->oobsize);
> >> +
> >> ret = meson_nfc_read_page_sub(nand, page, 0);
> >> if (ret)
> >> return ret;
> >
> >
> > Thanks,
> > Miquèl


Thanks,
Miquèl

2023-04-12 10:10:33

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v1 2/5] mtd: rawnand: meson: replace GENMASK() macro with define

From: Arseniy Krasnov
> Sent: 12 April 2023 07:17
> Subject: [PATCH v1 2/5] mtd: rawnand: meson: replace GENMASK() macro with define
>
...
> len = mtd->writesize + mtd->oobsize;
> - cmd = (len & GENMASK(13, 0)) | scrambler | DMA_DIR(dir);
> + cmd = (len & NFC_CMD_RAW_LEN) | scrambler | DMA_DIR(dir);
> writel(cmd, nfc->reg_base + NFC_REG_CMD);

What is the point of the mask?
If the length is too big it just generates a different
'hard to debug' error.
If the length can't be too bit there is no point masking it.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-04-12 10:16:51

by Arseniy Krasnov

[permalink] [raw]
Subject: Re: [PATCH v1 2/5] mtd: rawnand: meson: replace GENMASK() macro with define



On 12.04.2023 13:06, David Laight wrote:
> From: Arseniy Krasnov
>> Sent: 12 April 2023 07:17
>> Subject: [PATCH v1 2/5] mtd: rawnand: meson: replace GENMASK() macro with define
>>
> ...
>> len = mtd->writesize + mtd->oobsize;
>> - cmd = (len & GENMASK(13, 0)) | scrambler | DMA_DIR(dir);
>> + cmd = (len & NFC_CMD_RAW_LEN) | scrambler | DMA_DIR(dir);
>> writel(cmd, nfc->reg_base + NFC_REG_CMD);
>
> What is the point of the mask?
> If the length is too big it just generates a different
> 'hard to debug' error.
> If the length can't be too bit there is no point masking it.

Exactly, thanks for pointing!

Thanks, Arseniy

>
> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>

2023-04-12 10:30:49

by Arseniy Krasnov

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read



On 12.04.2023 12:36, Miquel Raynal wrote:
> Hi Arseniy,
>
> [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
>
>> On 12.04.2023 10:44, Miquel Raynal wrote:
>>> Hi Arseniy,
>>>
>>> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
>>>
>>>> This NAND reads only few user's bytes in ECC mode (not full OOB), so
>>>
>>> "This NAND reads" does not look right, do you mean "Subpage reads do
>>> not retrieve all the OOB bytes,"?
>>>
>>>> fill OOB buffer with zeroes to not return garbage from previous reads
>>>> to user.
>>>> Otherwise 'nanddump' utility prints something like this for just erased
>>>> page:
>>>>
>>>> ...
>>>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>>> OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
>>>> OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
>>>> OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
>>>> OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
>>>>
>>>> Signed-off-by: Arseniy Krasnov <[email protected]>
>>>> ---
>>>> drivers/mtd/nand/raw/meson_nand.c | 5 +++++
>>>> 1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
>>>> index f84a10238e4d..f2f2472cb511 100644
>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
>>>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
>>>> static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
>>>> int oob_required, int page)
>>>> {
>>>> + struct mtd_info *mtd = nand_to_mtd(nand);
>>>> u8 *oob_buf = nand->oob_poi;
>>>> int ret;
>>>>
>>>> + memset(oob_buf, 0, mtd->oobsize);
>>>
>>> I'm surprised raw reads do not read the entire OOB?
>>
>> Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
>> still depends on ECC parameters: for each portion of data covered with ECC code we can
>> read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
>> printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
>> For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
>> contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
>
> In all modes, when you read OOB, you should get the full OOB. The fact
> that ECC correction is enabled or disabled does not matter. If the NAND
> features OOB sections of 64 bytes, you should get the 64 bytes.
>
> What happens sometimes, is that some of the bytes are not protected
> against bitflips, but the policy is to return the full buffer.

Ok, so to clarify case for this NAND controller:
1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
+ ECC codes)?
2) If I have access to only 32 bytes of OOB (in case above), I must report that size
of OOB is only 32 bytes during initialization?

Thanks, Arseniy

>
>>
>> Thanks, Arseniy
>>
>>>
>>>> +
>>>> ret = meson_nfc_read_page_sub(nand, page, 1);
>>>> if (ret)
>>>> return ret;
>>>> @@ -881,6 +884,8 @@ static int meson_nfc_read_page_hwecc(struct nand_chip *nand, u8 *buf,
>>>> u8 *oob_buf = nand->oob_poi;
>>>> int ret, i;
>>>>
>>>> + memset(oob_buf, 0, mtd->oobsize);
>>>> +
>>>> ret = meson_nfc_read_page_sub(nand, page, 0);
>>>> if (ret)
>>>> return ret;
>>>
>>>
>>> Thanks,
>>> Miquèl
>
>
> Thanks,
> Miquèl

2023-04-12 11:02:29

by Liang Yang

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy and Miquel,

On 2023/4/12 18:14, Arseniy Krasnov wrote:
> [ EXTERNAL EMAIL ]
>
>
>
> On 12.04.2023 12:36, Miquel Raynal wrote:
>> Hi Arseniy,
>>
>> [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
>>
>>> On 12.04.2023 10:44, Miquel Raynal wrote:
>>>> Hi Arseniy,
>>>>
>>>> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
>>>>
>>>>> This NAND reads only few user's bytes in ECC mode (not full OOB), so
>>>>
>>>> "This NAND reads" does not look right, do you mean "Subpage reads do
>>>> not retrieve all the OOB bytes,"?
>>>>
>>>>> fill OOB buffer with zeroes to not return garbage from previous reads
>>>>> to user.
>>>>> Otherwise 'nanddump' utility prints something like this for just erased
>>>>> page:
>>>>>
>>>>> ...
>>>>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>>>> OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
>>>>> OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
>>>>> OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
>>>>> OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
>>>>>
>>>>> Signed-off-by: Arseniy Krasnov <[email protected]>
>>>>> ---
>>>>> drivers/mtd/nand/raw/meson_nand.c | 5 +++++
>>>>> 1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
>>>>> index f84a10238e4d..f2f2472cb511 100644
>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
>>>>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
>>>>> static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
>>>>> int oob_required, int page)
>>>>> {
>>>>> + struct mtd_info *mtd = nand_to_mtd(nand);
>>>>> u8 *oob_buf = nand->oob_poi;
>>>>> int ret;
>>>>>
>>>>> + memset(oob_buf, 0, mtd->oobsize);
>>>>
>>>> I'm surprised raw reads do not read the entire OOB?
>>>
>>> Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
>>> still depends on ECC parameters: for each portion of data covered with ECC code we can
>>> read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
>>> printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
>>> For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
>>> contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
>>
>> In all modes, when you read OOB, you should get the full OOB. The fact
>> that ECC correction is enabled or disabled does not matter. If the NAND
>> features OOB sections of 64 bytes, you should get the 64 bytes.
>>
>> What happens sometimes, is that some of the bytes are not protected
>> against bitflips, but the policy is to return the full buffer.
>
> Ok, so to clarify case for this NAND controller:
> 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
> + ECC codes)?
> 2) If I have access to only 32 bytes of OOB (in case above), I must report that size
> of OOB is only 32 bytes during initialization?
>
> Thanks, Arseniy

Yes. it should return all the OOB data. i make a mistake on raw read and
there is wrong code in meson_nfc_read_page_raw().
meson_nfc_get_data_oob(nand, buf, oob_buf);
changed to:
if (oob_required)
memcpy(oob_buf, buf + mtd->writesize, mtd->oobsize)

for the ECC mode, i define the meson_ooblayout_ops in host driver.

>
>>
>>>
>>> Thanks, Arseniy
>>>
>>>>
>>>>> +
>>>>> ret = meson_nfc_read_page_sub(nand, page, 1);
>>>>> if (ret)
>>>>> return ret;
>>>>> @@ -881,6 +884,8 @@ static int meson_nfc_read_page_hwecc(struct nand_chip *nand, u8 *buf,
>>>>> u8 *oob_buf = nand->oob_poi;
>>>>> int ret, i;
>>>>>
>>>>> + memset(oob_buf, 0, mtd->oobsize);
>>>>> +
>>>>> ret = meson_nfc_read_page_sub(nand, page, 0);
>>>>> if (ret)
>>>>> return ret;
>>>>
>>>>
>>>> Thanks,
>>>> Miquèl
>>
>>
>> Thanks,
>> Miquèl
>

--
Thanks,
Liang

2023-04-12 11:50:21

by Liang Yang

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi,

On 2023/4/12 18:51, Liang Yang wrote:
>>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c
>>>>>> b/drivers/mtd/nand/raw/meson_nand.c
>>>>>> index f84a10238e4d..f2f2472cb511 100644
>>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
>>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
>>>>>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct
>>>>>> nand_chip *nand,
>>>>>>   static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
>>>>>>                      int oob_required, int page)
>>>>>>   {
>>>>>> +    struct mtd_info *mtd = nand_to_mtd(nand);
>>>>>>       u8 *oob_buf = nand->oob_poi;
>>>>>>       int ret;
>>>>>> +    memset(oob_buf, 0, mtd->oobsize);
>>>>>
>>>>> I'm surprised raw reads do not read the entire OOB?
>>>>
>>>> Yes! Seems in case of raw access (what i see in this driver) number
>>>> of OOB bytes read
>>>> still depends on ECC parameters: for each portion of data covered
>>>> with ECC code we can
>>>> read it's ECC code and "user bytes" from OOB - it is what i see by
>>>> dumping DMA buffer by
>>>> printk(). For example I'm working with 2K NAND pages, each page has
>>>> 2 x 1K ECC blocks.
>>>> For each ECC block I have 16 OOB bytes which I can access by
>>>> read/write. Each 16 bytes
>>>> contains 2 bytes of user's data and 14 bytes ECC codes. So when I
>>>> read page in raw mode
>>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is
>>>> reported as 64 bytes.
>>>
>>> In all modes, when you read OOB, you should get the full OOB. The fact
>>> that ECC correction is enabled or disabled does not matter. If the NAND
>>> features OOB sections of 64 bytes, you should get the 64 bytes.
>>>
>>> What happens sometimes, is that some of the bytes are not protected
>>> against bitflips, but the policy is to return the full buffer.
>>
>> Ok, so to clarify case for this NAND controller:
>> 1) In both ECC and raw modes i need to return the same raw OOB data
>> (e.g. user bytes
>>     + ECC codes)?
>> 2) If I have access to only 32 bytes of OOB (in case above), I must
>> report that size
>>     of OOB is only 32 bytes during initialization?
>>
>> Thanks, Arseniy
>
> Yes. it should return all the OOB data. i make a mistake on raw read and
> there is wrong code in meson_nfc_read_page_raw().
>     meson_nfc_get_data_oob(nand, buf, oob_buf);
> changed to:
>     if (oob_required)
>         memcpy(oob_buf, buf + mtd->writesize, mtd->oobsize)

Sorry, please ignore this. the previous code is right.

the controller changes the layout of one page; the physical layout is
2048 main data + 64 oob data. after writing into NAND page, it is stored
like this: 1024 main data + 2 user bytes + 14 ECC parity bytes + 1024
main data + 2 user bytes + 14 ECC parity bytes. so that is right we only
get 4 user bytes and 28 ECC parity bytes, total 32 bytes. that is the
behavior of the controller that transferring one ECC page(1KB) brings
back only 2 user bytes.

because layout is changed by controller, so go back to the function.
meson_nfc_get_data_oob(nand, buf, oob_buf) try to get the right user and
ecc parity bytes from the right pos. after that, the other oob bytes is
not reading from NAND flash.


--
Thanks,
Liang

2023-04-12 11:51:51

by Dmitry Rokosov

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hello Liang,

On Wed, Apr 12, 2023 at 07:36:30PM +0800, Liang Yang wrote:
> Hi,
>
> On 2023/4/12 18:51, Liang Yang wrote:
> > > > > > > diff --git a/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > b/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > index f84a10238e4d..f2f2472cb511 100644
> > > > > > > --- a/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > +++ b/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > @@ -858,9 +858,12 @@ static int
> > > > > > > meson_nfc_read_page_sub(struct nand_chip *nand,
> > > > > > >   static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> > > > > > >                      int oob_required, int page)
> > > > > > >   {
> > > > > > > +    struct mtd_info *mtd = nand_to_mtd(nand);
> > > > > > >       u8 *oob_buf = nand->oob_poi;
> > > > > > >       int ret;
> > > > > > > +    memset(oob_buf, 0, mtd->oobsize);
> > > > > >
> > > > > > I'm surprised raw reads do not read the entire OOB?
> > > > >
> > > > > Yes! Seems in case of raw access (what i see in this driver)
> > > > > number of OOB bytes read
> > > > > still depends on ECC parameters: for each portion of data
> > > > > covered with ECC code we can
> > > > > read it's ECC code and "user bytes" from OOB - it is what i
> > > > > see by dumping DMA buffer by
> > > > > printk(). For example I'm working with 2K NAND pages, each
> > > > > page has 2 x 1K ECC blocks.
> > > > > For each ECC block I have 16 OOB bytes which I can access by
> > > > > read/write. Each 16 bytes
> > > > > contains 2 bytes of user's data and 14 bytes ECC codes. So
> > > > > when I read page in raw mode
> > > > > controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB
> > > > > is reported as 64 bytes.
> > > >
> > > > In all modes, when you read OOB, you should get the full OOB. The fact
> > > > that ECC correction is enabled or disabled does not matter. If the NAND
> > > > features OOB sections of 64 bytes, you should get the 64 bytes.
> > > >
> > > > What happens sometimes, is that some of the bytes are not protected
> > > > against bitflips, but the policy is to return the full buffer.
> > >
> > > Ok, so to clarify case for this NAND controller:
> > > 1) In both ECC and raw modes i need to return the same raw OOB data
> > > (e.g. user bytes
> > >     + ECC codes)?
> > > 2) If I have access to only 32 bytes of OOB (in case above), I must
> > > report that size
> > >     of OOB is only 32 bytes during initialization?
> > >
> > > Thanks, Arseniy
> >
> > Yes. it should return all the OOB data. i make a mistake on raw read and
> > there is wrong code in meson_nfc_read_page_raw().
> >     meson_nfc_get_data_oob(nand, buf, oob_buf);
> > changed to:
> >     if (oob_required)
> >         memcpy(oob_buf, buf + mtd->writesize, mtd->oobsize)
>
> Sorry, please ignore this. the previous code is right.
>
> the controller changes the layout of one page; the physical layout is 2048
> main data + 64 oob data. after writing into NAND page, it is stored
> like this: 1024 main data + 2 user bytes + 14 ECC parity bytes + 1024 main
> data + 2 user bytes + 14 ECC parity bytes. so that is right we only get 4
> user bytes and 28 ECC parity bytes, total 32 bytes. that is the behavior of
> the controller that transferring one ECC page(1KB) brings back only 2 user
> bytes.
>
> because layout is changed by controller, so go back to the function.
> meson_nfc_get_data_oob(nand, buf, oob_buf) try to get the right user and ecc
> parity bytes from the right pos. after that, the other oob bytes is not
> reading from NAND flash.

I have always been under the impression that NAND OOB layout falls under the
responsibility of the flash driver. Is this specific to the Amlogic NAND,
and does it map the flash layout to the internal controller layout?
For example, different OOB layouts exist between Macronix and ESMT.

Apologies for any confusion, and thank you in advance for any help in
clarifying this matter.

--
Thank you,
Dmitry

2023-04-12 11:54:01

by Arseniy Krasnov

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read



On 12.04.2023 14:43, Dmitry Rokosov wrote:
> Hello Liang,
>
> On Wed, Apr 12, 2023 at 07:36:30PM +0800, Liang Yang wrote:
>> Hi,
>>
>> On 2023/4/12 18:51, Liang Yang wrote:
>>>>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>> b/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>> index f84a10238e4d..f2f2472cb511 100644
>>>>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>> @@ -858,9 +858,12 @@ static int
>>>>>>>> meson_nfc_read_page_sub(struct nand_chip *nand,
>>>>>>>>   static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
>>>>>>>>                      int oob_required, int page)
>>>>>>>>   {
>>>>>>>> +    struct mtd_info *mtd = nand_to_mtd(nand);
>>>>>>>>       u8 *oob_buf = nand->oob_poi;
>>>>>>>>       int ret;
>>>>>>>> +    memset(oob_buf, 0, mtd->oobsize);
>>>>>>>
>>>>>>> I'm surprised raw reads do not read the entire OOB?
>>>>>>
>>>>>> Yes! Seems in case of raw access (what i see in this driver)
>>>>>> number of OOB bytes read
>>>>>> still depends on ECC parameters: for each portion of data
>>>>>> covered with ECC code we can
>>>>>> read it's ECC code and "user bytes" from OOB - it is what i
>>>>>> see by dumping DMA buffer by
>>>>>> printk(). For example I'm working with 2K NAND pages, each
>>>>>> page has 2 x 1K ECC blocks.
>>>>>> For each ECC block I have 16 OOB bytes which I can access by
>>>>>> read/write. Each 16 bytes
>>>>>> contains 2 bytes of user's data and 14 bytes ECC codes. So
>>>>>> when I read page in raw mode
>>>>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB
>>>>>> is reported as 64 bytes.
>>>>>
>>>>> In all modes, when you read OOB, you should get the full OOB. The fact
>>>>> that ECC correction is enabled or disabled does not matter. If the NAND
>>>>> features OOB sections of 64 bytes, you should get the 64 bytes.
>>>>>
>>>>> What happens sometimes, is that some of the bytes are not protected
>>>>> against bitflips, but the policy is to return the full buffer.
>>>>
>>>> Ok, so to clarify case for this NAND controller:
>>>> 1) In both ECC and raw modes i need to return the same raw OOB data
>>>> (e.g. user bytes
>>>>     + ECC codes)?
>>>> 2) If I have access to only 32 bytes of OOB (in case above), I must
>>>> report that size
>>>>     of OOB is only 32 bytes during initialization?
>>>>
>>>> Thanks, Arseniy
>>>
>>> Yes. it should return all the OOB data. i make a mistake on raw read and
>>> there is wrong code in meson_nfc_read_page_raw().
>>>     meson_nfc_get_data_oob(nand, buf, oob_buf);
>>> changed to:
>>>     if (oob_required)
>>>         memcpy(oob_buf, buf + mtd->writesize, mtd->oobsize)
>>
>> Sorry, please ignore this. the previous code is right.
>>
>> the controller changes the layout of one page; the physical layout is 2048
>> main data + 64 oob data. after writing into NAND page, it is stored
>> like this: 1024 main data + 2 user bytes + 14 ECC parity bytes + 1024 main
>> data + 2 user bytes + 14 ECC parity bytes. so that is right we only get 4
>> user bytes and 28 ECC parity bytes, total 32 bytes. that is the behavior of
>> the controller that transferring one ECC page(1KB) brings back only 2 user
>> bytes.

So i perform the following test, using file with the following pattern:
[ 1024 x 0x11 ][ 1024 x 0x22 ][ 16 x 0x33 ][ 16 x 0x44 ][ 16 x 0x55 ][ 16 x 0x66 ]

Total 2112 bytes (2048 data + 64 oob)

I write it using 'nandwrite' tool with flags '--oob' and '--noecc' - e.g. raw mode.
meson_nfc_set_data_oob() places to DMA buffer only parts with 0x11, 0x22, 0x33, 0x44.
After write data bytes will be 0x11 and 0x22, first 32 bytes of OOB will be 0x33 and 0x44.
Parts with 0x55 and 0x66 will be obviously missed.

So it is Amlogic limitation for 32 bytes of OOB (in case of 1K ECC as You mentioned above)?

And there is no way to access to the whole 64 bytes of OOB even in raw mode (and therefore in
ECC mode)?

Thanks, Arseniy

>>
>> because layout is changed by controller, so go back to the function.
>> meson_nfc_get_data_oob(nand, buf, oob_buf) try to get the right user and ecc
>> parity bytes from the right pos. after that, the other oob bytes is not
>> reading from NAND flash.
>
> I have always been under the impression that NAND OOB layout falls under the
> responsibility of the flash driver. Is this specific to the Amlogic NAND,
> and does it map the flash layout to the internal controller layout?
> For example, different OOB layouts exist between Macronix and ESMT.
>
> Apologies for any confusion, and thank you in advance for any help in
> clarifying this matter.
>

2023-04-12 12:29:01

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

[email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:

> On 12.04.2023 12:36, Miquel Raynal wrote:
> > Hi Arseniy,
> >
> > [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
> >
> >> On 12.04.2023 10:44, Miquel Raynal wrote:
> >>> Hi Arseniy,
> >>>
> >>> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
> >>>
> >>>> This NAND reads only few user's bytes in ECC mode (not full OOB), so
> >>>
> >>> "This NAND reads" does not look right, do you mean "Subpage reads do
> >>> not retrieve all the OOB bytes,"?
> >>>
> >>>> fill OOB buffer with zeroes to not return garbage from previous reads
> >>>> to user.
> >>>> Otherwise 'nanddump' utility prints something like this for just erased
> >>>> page:
> >>>>
> >>>> ...
> >>>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> >>>> OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> >>>> OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> >>>> OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> >>>> OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
> >>>>
> >>>> Signed-off-by: Arseniy Krasnov <[email protected]>
> >>>> ---
> >>>> drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> >>>> 1 file changed, 5 insertions(+)
> >>>>
> >>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> >>>> index f84a10238e4d..f2f2472cb511 100644
> >>>> --- a/drivers/mtd/nand/raw/meson_nand.c
> >>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
> >>>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> >>>> static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> >>>> int oob_required, int page)
> >>>> {
> >>>> + struct mtd_info *mtd = nand_to_mtd(nand);
> >>>> u8 *oob_buf = nand->oob_poi;
> >>>> int ret;
> >>>>
> >>>> + memset(oob_buf, 0, mtd->oobsize);
> >>>
> >>> I'm surprised raw reads do not read the entire OOB?
> >>
> >> Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
> >> still depends on ECC parameters: for each portion of data covered with ECC code we can
> >> read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
> >> printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
> >> For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
> >> contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
> >> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
> >
> > In all modes, when you read OOB, you should get the full OOB. The fact
> > that ECC correction is enabled or disabled does not matter. If the NAND
> > features OOB sections of 64 bytes, you should get the 64 bytes.
> >
> > What happens sometimes, is that some of the bytes are not protected
> > against bitflips, but the policy is to return the full buffer.
>
> Ok, so to clarify case for this NAND controller:
> 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
> + ECC codes)?

Well, you need to cover the same amount of data, yes. But in the ECC
case the data won't be raw (at least not all of it).

> 2) If I have access to only 32 bytes of OOB (in case above), I must report that size
> of OOB is only 32 bytes during initialization?

I believe it's just an implementation error in the controller driver.
This controller can be used on NAND chips with other geometries, please
don't play with it.

Thanks,
Miquèl

2023-04-12 12:31:22

by Arseniy Krasnov

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read



On 12.04.2023 15:18, Miquel Raynal wrote:
> Hi Arseniy,
>
> [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
>
>> On 12.04.2023 12:36, Miquel Raynal wrote:
>>> Hi Arseniy,
>>>
>>> [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
>>>
>>>> On 12.04.2023 10:44, Miquel Raynal wrote:
>>>>> Hi Arseniy,
>>>>>
>>>>> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
>>>>>
>>>>>> This NAND reads only few user's bytes in ECC mode (not full OOB), so
>>>>>
>>>>> "This NAND reads" does not look right, do you mean "Subpage reads do
>>>>> not retrieve all the OOB bytes,"?
>>>>>
>>>>>> fill OOB buffer with zeroes to not return garbage from previous reads
>>>>>> to user.
>>>>>> Otherwise 'nanddump' utility prints something like this for just erased
>>>>>> page:
>>>>>>
>>>>>> ...
>>>>>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>>>>> OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
>>>>>> OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
>>>>>> OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
>>>>>> OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
>>>>>>
>>>>>> Signed-off-by: Arseniy Krasnov <[email protected]>
>>>>>> ---
>>>>>> drivers/mtd/nand/raw/meson_nand.c | 5 +++++
>>>>>> 1 file changed, 5 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
>>>>>> index f84a10238e4d..f2f2472cb511 100644
>>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
>>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
>>>>>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
>>>>>> static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
>>>>>> int oob_required, int page)
>>>>>> {
>>>>>> + struct mtd_info *mtd = nand_to_mtd(nand);
>>>>>> u8 *oob_buf = nand->oob_poi;
>>>>>> int ret;
>>>>>>
>>>>>> + memset(oob_buf, 0, mtd->oobsize);
>>>>>
>>>>> I'm surprised raw reads do not read the entire OOB?
>>>>
>>>> Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
>>>> still depends on ECC parameters: for each portion of data covered with ECC code we can
>>>> read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
>>>> printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
>>>> For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
>>>> contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
>>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
>>>
>>> In all modes, when you read OOB, you should get the full OOB. The fact
>>> that ECC correction is enabled or disabled does not matter. If the NAND
>>> features OOB sections of 64 bytes, you should get the 64 bytes.
>>>
>>> What happens sometimes, is that some of the bytes are not protected
>>> against bitflips, but the policy is to return the full buffer.
>>
>> Ok, so to clarify case for this NAND controller:
>> 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
>> + ECC codes)?
>
> Well, you need to cover the same amount of data, yes. But in the ECC
> case the data won't be raw (at least not all of it).

So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?

Thanks, Arseniy

>
>> 2) If I have access to only 32 bytes of OOB (in case above), I must report that size
>> of OOB is only 32 bytes during initialization?
>
> I believe it's just an implementation error in the controller driver.
> This controller can be used on NAND chips with other geometries, please
> don't play with it.
>
> Thanks,
> Miquèl

2023-04-12 12:34:29

by Liang Yang

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Dmitry,

On 2023/4/12 19:43, Dmitry Rokosov wrote:
> [ EXTERNAL EMAIL ]
>
> Hello Liang,
>
> On Wed, Apr 12, 2023 at 07:36:30PM +0800, Liang Yang wrote:
>> Hi,
>>
>> On 2023/4/12 18:51, Liang Yang wrote:
>>>>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>> b/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>> index f84a10238e4d..f2f2472cb511 100644
>>>>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>> @@ -858,9 +858,12 @@ static int
>>>>>>>> meson_nfc_read_page_sub(struct nand_chip *nand,
>>>>>>>>   static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
>>>>>>>>                      int oob_required, int page)
>>>>>>>>   {
>>>>>>>> +    struct mtd_info *mtd = nand_to_mtd(nand);
>>>>>>>>       u8 *oob_buf = nand->oob_poi;
>>>>>>>>       int ret;
>>>>>>>> +    memset(oob_buf, 0, mtd->oobsize);
>>>>>>>
>>>>>>> I'm surprised raw reads do not read the entire OOB?
>>>>>>
>>>>>> Yes! Seems in case of raw access (what i see in this driver)
>>>>>> number of OOB bytes read
>>>>>> still depends on ECC parameters: for each portion of data
>>>>>> covered with ECC code we can
>>>>>> read it's ECC code and "user bytes" from OOB - it is what i
>>>>>> see by dumping DMA buffer by
>>>>>> printk(). For example I'm working with 2K NAND pages, each
>>>>>> page has 2 x 1K ECC blocks.
>>>>>> For each ECC block I have 16 OOB bytes which I can access by
>>>>>> read/write. Each 16 bytes
>>>>>> contains 2 bytes of user's data and 14 bytes ECC codes. So
>>>>>> when I read page in raw mode
>>>>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB
>>>>>> is reported as 64 bytes.
>>>>>
>>>>> In all modes, when you read OOB, you should get the full OOB. The fact
>>>>> that ECC correction is enabled or disabled does not matter. If the NAND
>>>>> features OOB sections of 64 bytes, you should get the 64 bytes.
>>>>>
>>>>> What happens sometimes, is that some of the bytes are not protected
>>>>> against bitflips, but the policy is to return the full buffer.
>>>>
>>>> Ok, so to clarify case for this NAND controller:
>>>> 1) In both ECC and raw modes i need to return the same raw OOB data
>>>> (e.g. user bytes
>>>>     + ECC codes)?
>>>> 2) If I have access to only 32 bytes of OOB (in case above), I must
>>>> report that size
>>>>     of OOB is only 32 bytes during initialization?
>>>>
>>>> Thanks, Arseniy
>>>
>>> Yes. it should return all the OOB data. i make a mistake on raw read and
>>> there is wrong code in meson_nfc_read_page_raw().
>>>     meson_nfc_get_data_oob(nand, buf, oob_buf);
>>> changed to:
>>>     if (oob_required)
>>>         memcpy(oob_buf, buf + mtd->writesize, mtd->oobsize)
>>
>> Sorry, please ignore this. the previous code is right.
>>
>> the controller changes the layout of one page; the physical layout is 2048
>> main data + 64 oob data. after writing into NAND page, it is stored
>> like this: 1024 main data + 2 user bytes + 14 ECC parity bytes + 1024 main
>> data + 2 user bytes + 14 ECC parity bytes. so that is right we only get 4
>> user bytes and 28 ECC parity bytes, total 32 bytes. that is the behavior of
>> the controller that transferring one ECC page(1KB) brings back only 2 user
>> bytes.
>>
>> because layout is changed by controller, so go back to the function.
>> meson_nfc_get_data_oob(nand, buf, oob_buf) try to get the right user and ecc
>> parity bytes from the right pos. after that, the other oob bytes is not
>> reading from NAND flash.
>
> I have always been under the impression that NAND OOB layout falls under the
> responsibility of the flash driver. Is this specific to the Amlogic NAND,
> and does it map the flash layout to the internal controller layout?
> For example, different OOB layouts exist between Macronix and ESMT.
>
> Apologies for any confusion, and thank you in advance for any help in
> clarifying this matter.
>

This is the behavior of Amlogic NAND controller, not related to NAND
device.

this design of Amlogic NAND controller is reducing the fifo(sram) size
inside the controller.

To the NAND device with <2KB(main) + 64B(oob)> page, usually if the
controller place user bytes and ECC parity bytes after 2048(column
address). to finish the ECC calculation and checking, the controller has
to read the data and its ECC parity bytes. if they are not placed
continuously, the controllers have to read the whole page data including
the oob data together or send random read command with column address.
so consider that the controller needs a fifo of at least (2048 + 64)
bytes inside. if the NAND device has bigger size of NAND page, the size
of the fifo should be bigger.
in Amlogic NAND controller, after changing the data layout of the NAND
page, it just needs <1KB + 16B> fifo size. the controller firstly sends
only <1024 + 16> clocks and get 1st ECC page data and its ECC parity
bytes, and then it can start ECC calculation and checking. after that or
at the same time, loop next ECC page data and its ECC parity.

--
Thanks,
Liang

2023-04-12 13:07:23

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

[email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:

> On 12.04.2023 15:18, Miquel Raynal wrote:
> > Hi Arseniy,
> >
> > [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
> >
> >> On 12.04.2023 12:36, Miquel Raynal wrote:
> >>> Hi Arseniy,
> >>>
> >>> [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
> >>>
> >>>> On 12.04.2023 10:44, Miquel Raynal wrote:
> >>>>> Hi Arseniy,
> >>>>>
> >>>>> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
> >>>>>
> >>>>>> This NAND reads only few user's bytes in ECC mode (not full OOB), so
> >>>>>
> >>>>> "This NAND reads" does not look right, do you mean "Subpage reads do
> >>>>> not retrieve all the OOB bytes,"?
> >>>>>
> >>>>>> fill OOB buffer with zeroes to not return garbage from previous reads
> >>>>>> to user.
> >>>>>> Otherwise 'nanddump' utility prints something like this for just erased
> >>>>>> page:
> >>>>>>
> >>>>>> ...
> >>>>>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> >>>>>> OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> >>>>>> OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> >>>>>> OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> >>>>>> OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
> >>>>>>
> >>>>>> Signed-off-by: Arseniy Krasnov <[email protected]>
> >>>>>> ---
> >>>>>> drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> >>>>>> 1 file changed, 5 insertions(+)
> >>>>>>
> >>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> >>>>>> index f84a10238e4d..f2f2472cb511 100644
> >>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
> >>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
> >>>>>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> >>>>>> static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> >>>>>> int oob_required, int page)
> >>>>>> {
> >>>>>> + struct mtd_info *mtd = nand_to_mtd(nand);
> >>>>>> u8 *oob_buf = nand->oob_poi;
> >>>>>> int ret;
> >>>>>>
> >>>>>> + memset(oob_buf, 0, mtd->oobsize);
> >>>>>
> >>>>> I'm surprised raw reads do not read the entire OOB?
> >>>>
> >>>> Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
> >>>> still depends on ECC parameters: for each portion of data covered with ECC code we can
> >>>> read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
> >>>> printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
> >>>> For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
> >>>> contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
> >>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
> >>>
> >>> In all modes, when you read OOB, you should get the full OOB. The fact
> >>> that ECC correction is enabled or disabled does not matter. If the NAND
> >>> features OOB sections of 64 bytes, you should get the 64 bytes.
> >>>
> >>> What happens sometimes, is that some of the bytes are not protected
> >>> against bitflips, but the policy is to return the full buffer.
> >>
> >> Ok, so to clarify case for this NAND controller:
> >> 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
> >> + ECC codes)?
> >
> > Well, you need to cover the same amount of data, yes. But in the ECC
> > case the data won't be raw (at least not all of it).
>
> So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
> OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
> behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?

If the page are 2k+64B you should read 2k+64B when OOB are requested.

If the controller only returns 2k+32B, then perform a random read to
just move the read pointer to mtd->size + mtd->oobsize - 32 and
retrieve the missing 32 bytes?

This applies to the two modes, the only difference is:
- with correction (commonly named "ECC mode"): the user bytes and ECC
bytes should be fixed if there are any bitflips
- without correction (commonly referred as "raw mode"): no correction
applies, if there are bitflips, give them

Please mind the raw mode can be slow, it's meant for debugging and
testing, mainly. Page reads however should be fast, so if just moving
the column pointer works, then do it, otherwise we'll consider
returning FFs.

Thanks,
Miquèl

2023-04-12 14:12:34

by Liang Yang

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Miquel and Arseniy,

On 2023/4/12 20:57, Miquel Raynal wrote:
> [ EXTERNAL EMAIL ]
>
> Hi Arseniy,
>
> [email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:
>
>> On 12.04.2023 15:18, Miquel Raynal wrote:
>>> Hi Arseniy,
>>>
>>> [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
>>>
>>>> On 12.04.2023 12:36, Miquel Raynal wrote:
>>>>> Hi Arseniy,
>>>>>
>>>>> [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
>>>>>
>>>>>> On 12.04.2023 10:44, Miquel Raynal wrote:
>>>>>>> Hi Arseniy,
>>>>>>>
>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
>>>>>>>
>>>>>>>> This NAND reads only few user's bytes in ECC mode (not full OOB), so
>>>>>>>
>>>>>>> "This NAND reads" does not look right, do you mean "Subpage reads do
>>>>>>> not retrieve all the OOB bytes,"?
>>>>>>>
>>>>>>>> fill OOB buffer with zeroes to not return garbage from previous reads
>>>>>>>> to user.
>>>>>>>> Otherwise 'nanddump' utility prints something like this for just erased
>>>>>>>> page:
>>>>>>>>
>>>>>>>> ...
>>>>>>>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>>>>>>> OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
>>>>>>>> OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
>>>>>>>> OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
>>>>>>>> OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
>>>>>>>>
>>>>>>>> Signed-off-by: Arseniy Krasnov <[email protected]>
>>>>>>>> ---
>>>>>>>> drivers/mtd/nand/raw/meson_nand.c | 5 +++++
>>>>>>>> 1 file changed, 5 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>> index f84a10238e4d..f2f2472cb511 100644
>>>>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
>>>>>>>> static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
>>>>>>>> int oob_required, int page)
>>>>>>>> {
>>>>>>>> + struct mtd_info *mtd = nand_to_mtd(nand);
>>>>>>>> u8 *oob_buf = nand->oob_poi;
>>>>>>>> int ret;
>>>>>>>>
>>>>>>>> + memset(oob_buf, 0, mtd->oobsize);
>>>>>>>
>>>>>>> I'm surprised raw reads do not read the entire OOB?
>>>>>>
>>>>>> Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
>>>>>> still depends on ECC parameters: for each portion of data covered with ECC code we can
>>>>>> read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
>>>>>> printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
>>>>>> For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
>>>>>> contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
>>>>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
>>>>>
>>>>> In all modes, when you read OOB, you should get the full OOB. The fact
>>>>> that ECC correction is enabled or disabled does not matter. If the NAND
>>>>> features OOB sections of 64 bytes, you should get the 64 bytes.
>>>>>
>>>>> What happens sometimes, is that some of the bytes are not protected
>>>>> against bitflips, but the policy is to return the full buffer.
>>>>
>>>> Ok, so to clarify case for this NAND controller:
>>>> 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
>>>> + ECC codes)?
>>>
>>> Well, you need to cover the same amount of data, yes. But in the ECC
>>> case the data won't be raw (at least not all of it).
>>
>> So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
>> OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
>> behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?
>
> If the page are 2k+64B you should read 2k+64B when OOB are requested.
>
> If the controller only returns 2k+32B, then perform a random read to
> just move the read pointer to mtd->size + mtd->oobsize - 32 and
> retrieve the missing 32 bytes?

1) raw read can read out the whole page data 2k+64B, decided by the len
in the controller raw read command:
cmd = (len & GENMASK(5, 0)) | scrambler | DMA_DIR(dir);
after that, the missing oob bytes(not used) can be copied from
meson_chip->data_buf. so the implementation of meson_nfc_read_page_raw()
is like this if need.
{
......
meson_nfc_read_page_sub(nand, page, 1);
meson_nfc_get_data_oob(nand, buf, oob_buf);
oob_len = (nand->ecc.bytes + 2) * nand->ecc.steps;
memcpy(oob_buf + oob_len, meson_chip->data_buf + oob_len, mtd->oobsize
- oob_len);

}
2) In ECC mode, the controller can't bring back the missing OOB bytes.
it can read out the user bytes and ecc bytes per meson_ooblayout_ops define.

>
> This applies to the two modes, the only difference is:
> - with correction (commonly named "ECC mode"): the user bytes and ECC
> bytes should be fixed if there are any bitflips
> - without correction (commonly referred as "raw mode"): no correction
> applies, if there are bitflips, give them
>
> Please mind the raw mode can be slow, it's meant for debugging and
> testing, mainly. Page reads however should be fast, so if just moving
> the column pointer works, then do it, otherwise we'll consider
> returning FFs.
>
> Thanks,
> Miquèl
>

--
Thanks,
Liang

2023-04-12 14:34:53

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hello,

[email protected] wrote on Wed, 12 Apr 2023 22:04:28 +0800:

> Hi Miquel and Arseniy,
>
> On 2023/4/12 20:57, Miquel Raynal wrote:
> > [ EXTERNAL EMAIL ]
> >
> > Hi Arseniy,
> >
> > [email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:
> >
> >> On 12.04.2023 15:18, Miquel Raynal wrote:
> >>> Hi Arseniy,
> >>>
> >>> [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
> >>> >>>> On 12.04.2023 12:36, Miquel Raynal wrote:
> >>>>> Hi Arseniy,
> >>>>>
> >>>>> [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
> >>>>> >>>>>> On 12.04.2023 10:44, Miquel Raynal wrote:
> >>>>>>> Hi Arseniy,
> >>>>>>>
> >>>>>>> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
> >>>>>>> >>>>>>>> This NAND reads only few user's bytes in ECC mode (not full OOB), so
> >>>>>>>
> >>>>>>> "This NAND reads" does not look right, do you mean "Subpage reads do
> >>>>>>> not retrieve all the OOB bytes,"?
> >>>>>>> >>>>>>>> fill OOB buffer with zeroes to not return garbage from previous reads
> >>>>>>>> to user.
> >>>>>>>> Otherwise 'nanddump' utility prints something like this for just erased
> >>>>>>>> page:
> >>>>>>>>
> >>>>>>>> ...
> >>>>>>>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> >>>>>>>> OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> >>>>>>>> OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> >>>>>>>> OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> >>>>>>>> OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
> >>>>>>>>
> >>>>>>>> Signed-off-by: Arseniy Krasnov <[email protected]>
> >>>>>>>> ---
> >>>>>>>> drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> >>>>>>>> 1 file changed, 5 insertions(+)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> >>>>>>>> index f84a10238e4d..f2f2472cb511 100644
> >>>>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
> >>>>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
> >>>>>>>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> >>>>>>>> static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> >>>>>>>> int oob_required, int page)
> >>>>>>>> {
> >>>>>>>> + struct mtd_info *mtd = nand_to_mtd(nand);
> >>>>>>>> u8 *oob_buf = nand->oob_poi;
> >>>>>>>> int ret;
> >>>>>>>> >>>>>>>> + memset(oob_buf, 0, mtd->oobsize);
> >>>>>>>
> >>>>>>> I'm surprised raw reads do not read the entire OOB?
> >>>>>>
> >>>>>> Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
> >>>>>> still depends on ECC parameters: for each portion of data covered with ECC code we can
> >>>>>> read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
> >>>>>> printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
> >>>>>> For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
> >>>>>> contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
> >>>>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
> >>>>>
> >>>>> In all modes, when you read OOB, you should get the full OOB. The fact
> >>>>> that ECC correction is enabled or disabled does not matter. If the NAND
> >>>>> features OOB sections of 64 bytes, you should get the 64 bytes.
> >>>>>
> >>>>> What happens sometimes, is that some of the bytes are not protected
> >>>>> against bitflips, but the policy is to return the full buffer.
> >>>>
> >>>> Ok, so to clarify case for this NAND controller:
> >>>> 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
> >>>> + ECC codes)?
> >>>
> >>> Well, you need to cover the same amount of data, yes. But in the ECC
> >>> case the data won't be raw (at least not all of it).
> >>
> >> So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
> >> OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
> >> behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?
> >
> > If the page are 2k+64B you should read 2k+64B when OOB are requested.
> >
> > If the controller only returns 2k+32B, then perform a random read to
> > just move the read pointer to mtd->size + mtd->oobsize - 32 and
> > retrieve the missing 32 bytes?
>
> 1) raw read can read out the whole page data 2k+64B, decided by the len in the controller raw read command:
> cmd = (len & GENMASK(5, 0)) | scrambler | DMA_DIR(dir);
> after that, the missing oob bytes(not used) can be copied from meson_chip->data_buf. so the implementation of meson_nfc_read_page_raw() is like this if need.
> {
> ......
> meson_nfc_read_page_sub(nand, page, 1);
> meson_nfc_get_data_oob(nand, buf, oob_buf);
> oob_len = (nand->ecc.bytes + 2) * nand->ecc.steps;
> memcpy(oob_buf + oob_len, meson_chip->data_buf + oob_len, mtd->oobsize - oob_len);
>
> }
> 2) In ECC mode, the controller can't bring back the missing OOB bytes. it can read out the user bytes and ecc bytes per meson_ooblayout_ops define.

And then (if oob_required) you can bring the missing bytes with
something along:
nand_change_read_column_op(chip, mtd->writesize + oob_len,
oob_buf + oob_len,
mtd->oobsize - oob_len,
false);
Should not be a huge performance hit.

>
> >
> > This applies to the two modes, the only difference is:
> > - with correction (commonly named "ECC mode"): the user bytes and ECC
> > bytes should be fixed if there are any bitflips
> > - without correction (commonly referred as "raw mode"): no correction
> > applies, if there are bitflips, give them
> >
> > Please mind the raw mode can be slow, it's meant for debugging and
> > testing, mainly. Page reads however should be fast, so if just moving
> > the column pointer works, then do it, otherwise we'll consider
> > returning FFs.
> >
> > Thanks,
> > Miquèl
> >
>


Thanks,
Miquèl

2023-04-12 19:19:17

by Dmitry Rokosov

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

On Wed, Apr 12, 2023 at 10:04:28PM +0800, Liang Yang wrote:
> Hi Miquel and Arseniy,
>
> On 2023/4/12 20:57, Miquel Raynal wrote:
> > [ EXTERNAL EMAIL ]
> >
> > Hi Arseniy,
> >
> > [email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:
> >
> > > On 12.04.2023 15:18, Miquel Raynal wrote:
> > > > Hi Arseniy,
> > > >
> > > > [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
> > > > > On 12.04.2023 12:36, Miquel Raynal wrote:
> > > > > > Hi Arseniy,
> > > > > >
> > > > > > [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
> > > > > > > On 12.04.2023 10:44, Miquel Raynal wrote:
> > > > > > > > Hi Arseniy,
> > > > > > > >
> > > > > > > > [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
> > > > > > > > > This NAND reads only few user's bytes in ECC mode (not full OOB), so
> > > > > > > >
> > > > > > > > "This NAND reads" does not look right, do you mean "Subpage reads do
> > > > > > > > not retrieve all the OOB bytes,"?
> > > > > > > > > fill OOB buffer with zeroes to not return garbage from previous reads
> > > > > > > > > to user.
> > > > > > > > > Otherwise 'nanddump' utility prints something like this for just erased
> > > > > > > > > page:
> > > > > > > > >
> > > > > > > > > ...
> > > > > > > > > 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > > > > > > > > OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> > > > > > > > > OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> > > > > > > > > OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> > > > > > > > > OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
> > > > > > > > >
> > > > > > > > > Signed-off-by: Arseniy Krasnov <[email protected]>
> > > > > > > > > ---
> > > > > > > > > drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> > > > > > > > > 1 file changed, 5 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > index f84a10238e4d..f2f2472cb511 100644
> > > > > > > > > --- a/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > +++ b/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> > > > > > > > > static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> > > > > > > > > int oob_required, int page)
> > > > > > > > > {
> > > > > > > > > + struct mtd_info *mtd = nand_to_mtd(nand);
> > > > > > > > > u8 *oob_buf = nand->oob_poi;
> > > > > > > > > int ret;
> > > > > > > > > + memset(oob_buf, 0, mtd->oobsize);
> > > > > > > >
> > > > > > > > I'm surprised raw reads do not read the entire OOB?
> > > > > > >
> > > > > > > Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
> > > > > > > still depends on ECC parameters: for each portion of data covered with ECC code we can
> > > > > > > read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
> > > > > > > printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
> > > > > > > For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
> > > > > > > contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
> > > > > > > controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
> > > > > >
> > > > > > In all modes, when you read OOB, you should get the full OOB. The fact
> > > > > > that ECC correction is enabled or disabled does not matter. If the NAND
> > > > > > features OOB sections of 64 bytes, you should get the 64 bytes.
> > > > > >
> > > > > > What happens sometimes, is that some of the bytes are not protected
> > > > > > against bitflips, but the policy is to return the full buffer.
> > > > >
> > > > > Ok, so to clarify case for this NAND controller:
> > > > > 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
> > > > > + ECC codes)?
> > > >
> > > > Well, you need to cover the same amount of data, yes. But in the ECC
> > > > case the data won't be raw (at least not all of it).
> > >
> > > So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
> > > OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
> > > behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?
> >
> > If the page are 2k+64B you should read 2k+64B when OOB are requested.
> >
> > If the controller only returns 2k+32B, then perform a random read to
> > just move the read pointer to mtd->size + mtd->oobsize - 32 and
> > retrieve the missing 32 bytes?
>
> 1) raw read can read out the whole page data 2k+64B, decided by the len in
> the controller raw read command:
> cmd = (len & GENMASK(5, 0)) | scrambler | DMA_DIR(dir);
> after that, the missing oob bytes(not used) can be copied from
> meson_chip->data_buf. so the implementation of meson_nfc_read_page_raw() is
> like this if need.
> {
> ......
> meson_nfc_read_page_sub(nand, page, 1);
> meson_nfc_get_data_oob(nand, buf, oob_buf);
> oob_len = (nand->ecc.bytes + 2) * nand->ecc.steps;
> memcpy(oob_buf + oob_len, meson_chip->data_buf + oob_len, mtd->oobsize -
> oob_len);
>
> }
> 2) In ECC mode, the controller can't bring back the missing OOB bytes. it
> can read out the user bytes and ecc bytes per meson_ooblayout_ops define.
>

How does the Meson controller know the actual NAND flash layout when the
OOB is split into protected and unprotected areas, such as Free and ECC
areas? If the controller has a static OOB layout, where is the mapping
located?

> >
> > This applies to the two modes, the only difference is:
> > - with correction (commonly named "ECC mode"): the user bytes and ECC
> > bytes should be fixed if there are any bitflips
> > - without correction (commonly referred as "raw mode"): no correction
> > applies, if there are bitflips, give them
> >
> > Please mind the raw mode can be slow, it's meant for debugging and
> > testing, mainly. Page reads however should be fast, so if just moving
> > the column pointer works, then do it, otherwise we'll consider
> > returning FFs.

--
Thank you,
Dmitry

2023-04-12 21:04:59

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Dmitry,

[email protected] wrote on Wed, 12 Apr 2023 22:15:48 +0300:

> On Wed, Apr 12, 2023 at 10:04:28PM +0800, Liang Yang wrote:
> > Hi Miquel and Arseniy,
> >
> > On 2023/4/12 20:57, Miquel Raynal wrote:
> > > [ EXTERNAL EMAIL ]
> > >
> > > Hi Arseniy,
> > >
> > > [email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:
> > >
> > > > On 12.04.2023 15:18, Miquel Raynal wrote:
> > > > > Hi Arseniy,
> > > > >
> > > > > [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
> > > > > > On 12.04.2023 12:36, Miquel Raynal wrote:
> > > > > > > Hi Arseniy,
> > > > > > >
> > > > > > > [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
> > > > > > > > On 12.04.2023 10:44, Miquel Raynal wrote:
> > > > > > > > > Hi Arseniy,
> > > > > > > > >
> > > > > > > > > [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
> > > > > > > > > > This NAND reads only few user's bytes in ECC mode (not full OOB), so
> > > > > > > > >
> > > > > > > > > "This NAND reads" does not look right, do you mean "Subpage reads do
> > > > > > > > > not retrieve all the OOB bytes,"?
> > > > > > > > > > fill OOB buffer with zeroes to not return garbage from previous reads
> > > > > > > > > > to user.
> > > > > > > > > > Otherwise 'nanddump' utility prints something like this for just erased
> > > > > > > > > > page:
> > > > > > > > > >
> > > > > > > > > > ...
> > > > > > > > > > 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > > > > > > > > > OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> > > > > > > > > > OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> > > > > > > > > > OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> > > > > > > > > > OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Arseniy Krasnov <[email protected]>
> > > > > > > > > > ---
> > > > > > > > > > drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> > > > > > > > > > 1 file changed, 5 insertions(+)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > > index f84a10238e4d..f2f2472cb511 100644
> > > > > > > > > > --- a/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > > +++ b/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > > @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> > > > > > > > > > static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> > > > > > > > > > int oob_required, int page)
> > > > > > > > > > {
> > > > > > > > > > + struct mtd_info *mtd = nand_to_mtd(nand);
> > > > > > > > > > u8 *oob_buf = nand->oob_poi;
> > > > > > > > > > int ret;
> > > > > > > > > > + memset(oob_buf, 0, mtd->oobsize);
> > > > > > > > >
> > > > > > > > > I'm surprised raw reads do not read the entire OOB?
> > > > > > > >
> > > > > > > > Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
> > > > > > > > still depends on ECC parameters: for each portion of data covered with ECC code we can
> > > > > > > > read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
> > > > > > > > printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
> > > > > > > > For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
> > > > > > > > contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
> > > > > > > > controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
> > > > > > >
> > > > > > > In all modes, when you read OOB, you should get the full OOB. The fact
> > > > > > > that ECC correction is enabled or disabled does not matter. If the NAND
> > > > > > > features OOB sections of 64 bytes, you should get the 64 bytes.
> > > > > > >
> > > > > > > What happens sometimes, is that some of the bytes are not protected
> > > > > > > against bitflips, but the policy is to return the full buffer.
> > > > > >
> > > > > > Ok, so to clarify case for this NAND controller:
> > > > > > 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
> > > > > > + ECC codes)?
> > > > >
> > > > > Well, you need to cover the same amount of data, yes. But in the ECC
> > > > > case the data won't be raw (at least not all of it).
> > > >
> > > > So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
> > > > OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
> > > > behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?
> > >
> > > If the page are 2k+64B you should read 2k+64B when OOB are requested.
> > >
> > > If the controller only returns 2k+32B, then perform a random read to
> > > just move the read pointer to mtd->size + mtd->oobsize - 32 and
> > > retrieve the missing 32 bytes?
> >
> > 1) raw read can read out the whole page data 2k+64B, decided by the len in
> > the controller raw read command:
> > cmd = (len & GENMASK(5, 0)) | scrambler | DMA_DIR(dir);
> > after that, the missing oob bytes(not used) can be copied from
> > meson_chip->data_buf. so the implementation of meson_nfc_read_page_raw() is
> > like this if need.
> > {
> > ......
> > meson_nfc_read_page_sub(nand, page, 1);
> > meson_nfc_get_data_oob(nand, buf, oob_buf);
> > oob_len = (nand->ecc.bytes + 2) * nand->ecc.steps;
> > memcpy(oob_buf + oob_len, meson_chip->data_buf + oob_len, mtd->oobsize -
> > oob_len);
> >
> > }
> > 2) In ECC mode, the controller can't bring back the missing OOB bytes. it
> > can read out the user bytes and ecc bytes per meson_ooblayout_ops define.
> >
>
> How does the Meson controller know the actual NAND flash layout when the
> OOB is split into protected and unprotected areas, such as Free and ECC
> areas? If the controller has a static OOB layout, where is the mapping
> located?

It's usually a set of values hardcoded in the driver. It's a per
geometry set.

>
> > >
> > > This applies to the two modes, the only difference is:
> > > - with correction (commonly named "ECC mode"): the user bytes and ECC
> > > bytes should be fixed if there are any bitflips
> > > - without correction (commonly referred as "raw mode"): no correction
> > > applies, if there are bitflips, give them
> > >
> > > Please mind the raw mode can be slow, it's meant for debugging and
> > > testing, mainly. Page reads however should be fast, so if just moving
> > > the column pointer works, then do it, otherwise we'll consider
> > > returning FFs.
>


Thanks,
Miquèl

2023-04-13 05:42:30

by Liang Yang

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Miquel,

On 2023/4/12 22:32, Miquel Raynal wrote:
> [ EXTERNAL EMAIL ]
>
> Hello,
>
> [email protected] wrote on Wed, 12 Apr 2023 22:04:28 +0800:
>
>> Hi Miquel and Arseniy,
>>
>> On 2023/4/12 20:57, Miquel Raynal wrote:
>>> [ EXTERNAL EMAIL ]
>>>
>>> Hi Arseniy,
>>>
>>> [email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:
>>>
>>>> On 12.04.2023 15:18, Miquel Raynal wrote:
>>>>> Hi Arseniy,
>>>>>
>>>>> [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
>>>>> >>>> On 12.04.2023 12:36, Miquel Raynal wrote:
>>>>>>> Hi Arseniy,
>>>>>>>
>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
>>>>>>> >>>>>> On 12.04.2023 10:44, Miquel Raynal wrote:
>>>>>>>>> Hi Arseniy,
>>>>>>>>>
>>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
>>>>>>>>> >>>>>>>> This NAND reads only few user's bytes in ECC mode (not full OOB), so
>>>>>>>>>
>>>>>>>>> "This NAND reads" does not look right, do you mean "Subpage reads do
>>>>>>>>> not retrieve all the OOB bytes,"?
>>>>>>>>> >>>>>>>> fill OOB buffer with zeroes to not return garbage from previous reads
>>>>>>>>>> to user.
>>>>>>>>>> Otherwise 'nanddump' utility prints something like this for just erased
>>>>>>>>>> page:
>>>>>>>>>>
>>>>>>>>>> ...
>>>>>>>>>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>>>>>>>>> OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
>>>>>>>>>> OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
>>>>>>>>>> OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
>>>>>>>>>> OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Arseniy Krasnov <[email protected]>
>>>>>>>>>> ---
>>>>>>>>>> drivers/mtd/nand/raw/meson_nand.c | 5 +++++
>>>>>>>>>> 1 file changed, 5 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>>>> index f84a10238e4d..f2f2472cb511 100644
>>>>>>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>>>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
>>>>>>>>>> static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
>>>>>>>>>> int oob_required, int page)
>>>>>>>>>> {
>>>>>>>>>> + struct mtd_info *mtd = nand_to_mtd(nand);
>>>>>>>>>> u8 *oob_buf = nand->oob_poi;
>>>>>>>>>> int ret;
>>>>>>>>>> >>>>>>>> + memset(oob_buf, 0, mtd->oobsize);
>>>>>>>>>
>>>>>>>>> I'm surprised raw reads do not read the entire OOB?
>>>>>>>>
>>>>>>>> Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
>>>>>>>> still depends on ECC parameters: for each portion of data covered with ECC code we can
>>>>>>>> read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
>>>>>>>> printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
>>>>>>>> For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
>>>>>>>> contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
>>>>>>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
>>>>>>>
>>>>>>> In all modes, when you read OOB, you should get the full OOB. The fact
>>>>>>> that ECC correction is enabled or disabled does not matter. If the NAND
>>>>>>> features OOB sections of 64 bytes, you should get the 64 bytes.
>>>>>>>
>>>>>>> What happens sometimes, is that some of the bytes are not protected
>>>>>>> against bitflips, but the policy is to return the full buffer.
>>>>>>
>>>>>> Ok, so to clarify case for this NAND controller:
>>>>>> 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
>>>>>> + ECC codes)?
>>>>>
>>>>> Well, you need to cover the same amount of data, yes. But in the ECC
>>>>> case the data won't be raw (at least not all of it).
>>>>
>>>> So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
>>>> OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
>>>> behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?
>>>
>>> If the page are 2k+64B you should read 2k+64B when OOB are requested.
>>>
>>> If the controller only returns 2k+32B, then perform a random read to
>>> just move the read pointer to mtd->size + mtd->oobsize - 32 and
>>> retrieve the missing 32 bytes?
>>
>> 1) raw read can read out the whole page data 2k+64B, decided by the len in the controller raw read command:
>> cmd = (len & GENMASK(5, 0)) | scrambler | DMA_DIR(dir);
>> after that, the missing oob bytes(not used) can be copied from meson_chip->data_buf. so the implementation of meson_nfc_read_page_raw() is like this if need.
>> {
>> ......
>> meson_nfc_read_page_sub(nand, page, 1);
>> meson_nfc_get_data_oob(nand, buf, oob_buf);
>> oob_len = (nand->ecc.bytes + 2) * nand->ecc.steps;
>> memcpy(oob_buf + oob_len, meson_chip->data_buf + oob_len, mtd->oobsize - oob_len);
>>
>> }
>> 2) In ECC mode, the controller can't bring back the missing OOB bytes. it can read out the user bytes and ecc bytes per meson_ooblayout_ops define.
>
> And then (if oob_required) you can bring the missing bytes with
> something along:
> nand_change_read_column_op(chip, mtd->writesize + oob_len,
> oob_buf + oob_len,
> mtd->oobsize - oob_len,
> false);
> Should not be a huge performance hit.

After finishing ECC mode reading, the column address internal in NAND
device should be the right pos; it doesn't need to change the column
again. so adding controller raw read for the missing bytes after ECC
reading may works.

>
>>
>>>
>>> This applies to the two modes, the only difference is:
>>> - with correction (commonly named "ECC mode"): the user bytes and ECC
>>> bytes should be fixed if there are any bitflips
>>> - without correction (commonly referred as "raw mode"): no correction
>>> applies, if there are bitflips, give them
>>>
>>> Please mind the raw mode can be slow, it's meant for debugging and
>>> testing, mainly. Page reads however should be fast, so if just moving
>>> the column pointer works, then do it, otherwise we'll consider
>>> returning FFs.
>>>
>>> Thanks,
>>> Miquèl
>>>
>>
>
>
> Thanks,
> Miquèl
>

--
Thanks,
Liang

2023-04-13 06:15:41

by Liang Yang

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read


On 2023/4/13 13:32, Liang Yang wrote:
> Hi Miquel,
>
> On 2023/4/12 22:32, Miquel Raynal wrote:
>> [ EXTERNAL EMAIL ]
>>
>> Hello,
>>
>> [email protected] wrote on Wed, 12 Apr 2023 22:04:28 +0800:
>>
>>> Hi Miquel and Arseniy,
>>>
>>> On 2023/4/12 20:57, Miquel Raynal wrote:
>>>> [ EXTERNAL EMAIL ]
>>>>
>>>> Hi Arseniy,
>>>>
>>>> [email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:
>>>>> On 12.04.2023 15:18, Miquel Raynal wrote:
>>>>>> Hi Arseniy,
>>>>>>
>>>>>> [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
>>>>>>     >>>> On 12.04.2023 12:36, Miquel Raynal wrote:
>>>>>>>> Hi Arseniy,
>>>>>>>>
>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
>>>>>>>>       >>>>>> On 12.04.2023 10:44, Miquel Raynal wrote:
>>>>>>>>>> Hi Arseniy,
>>>>>>>>>>
>>>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 09:16:58
>>>>>>>>>> +0300:
>>>>>>>>>>         >>>>>>>> This NAND reads only few user's bytes in ECC
>>>>>>>>>> mode (not full OOB), so
>>>>>>>>>>
>>>>>>>>>> "This NAND reads" does not look right, do you mean "Subpage
>>>>>>>>>> reads do
>>>>>>>>>> not retrieve all the OOB bytes,"?
>>>>>>>>>>         >>>>>>>> fill OOB buffer with zeroes to not return
>>>>>>>>>> garbage from previous reads
>>>>>>>>>>> to user.
>>>>>>>>>>> Otherwise 'nanddump' utility prints something like this for
>>>>>>>>>>> just erased
>>>>>>>>>>> page:
>>>>>>>>>>>
>>>>>>>>>>> ...
>>>>>>>>>>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>>>>>>>>>>     OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
>>>>>>>>>>>     OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
>>>>>>>>>>>     OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
>>>>>>>>>>>     OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Arseniy Krasnov <[email protected]>
>>>>>>>>>>> ---
>>>>>>>>>>>    drivers/mtd/nand/raw/meson_nand.c | 5 +++++
>>>>>>>>>>>    1 file changed, 5 insertions(+)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>>>>> b/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>>>>> index f84a10238e4d..f2f2472cb511 100644
>>>>>>>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>>>>> @@ -858,9 +858,12 @@ static int
>>>>>>>>>>> meson_nfc_read_page_sub(struct nand_chip *nand,
>>>>>>>>>>>    static int meson_nfc_read_page_raw(struct nand_chip *nand,
>>>>>>>>>>> u8 *buf,
>>>>>>>>>>>                       int oob_required, int page)
>>>>>>>>>>>    {
>>>>>>>>>>> +    struct mtd_info *mtd = nand_to_mtd(nand);
>>>>>>>>>>>        u8 *oob_buf = nand->oob_poi;
>>>>>>>>>>>        int ret;
>>>>>>>>>>>    >>>>>>>> +    memset(oob_buf, 0, mtd->oobsize);
>>>>>>>>>>
>>>>>>>>>> I'm surprised raw reads do not read the entire OOB?
>>>>>>>>>
>>>>>>>>> Yes! Seems in case of raw access (what i see in this driver)
>>>>>>>>> number of OOB bytes read
>>>>>>>>> still depends on ECC parameters: for each portion of data
>>>>>>>>> covered with ECC code we can
>>>>>>>>> read it's ECC code and "user bytes" from OOB - it is what i see
>>>>>>>>> by dumping DMA buffer by
>>>>>>>>> printk(). For example I'm working with 2K NAND pages, each page
>>>>>>>>> has 2 x 1K ECC blocks.
>>>>>>>>> For each ECC block I have 16 OOB bytes which I can access by
>>>>>>>>> read/write. Each 16 bytes
>>>>>>>>> contains 2 bytes of user's data and 14 bytes ECC codes. So when
>>>>>>>>> I read page in raw mode
>>>>>>>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is
>>>>>>>>> reported as 64 bytes.
>>>>>>>>
>>>>>>>> In all modes, when you read OOB, you should get the full OOB.
>>>>>>>> The fact
>>>>>>>> that ECC correction is enabled or disabled does not matter. If
>>>>>>>> the NAND
>>>>>>>> features OOB sections of 64 bytes, you should get the 64 bytes.
>>>>>>>>
>>>>>>>> What happens sometimes, is that some of the bytes are not protected
>>>>>>>> against bitflips, but the policy is to return the full buffer.
>>>>>>>
>>>>>>> Ok, so to clarify case for this NAND controller:
>>>>>>> 1) In both ECC and raw modes i need to return the same raw OOB
>>>>>>> data (e.g. user bytes
>>>>>>>      + ECC codes)?
>>>>>>
>>>>>> Well, you need to cover the same amount of data, yes. But in the ECC
>>>>>> case the data won't be raw (at least not all of it).
>>>>>
>>>>> So "same amount of data", in ECC mode current implementation
>>>>> returns only user OOB bytes (e.g.
>>>>> OOB data excluding ECC codes), in raw it returns user bytes + ECC
>>>>> codes. IIUC correct
>>>>> behaviour is to always return user bytes + ECC codes as OOB data
>>>>> even in ECC mode ?
>>>>
>>>> If the page are 2k+64B you should read 2k+64B when OOB are requested.
>>>>
>>>> If the controller only returns 2k+32B, then perform a random read to
>>>> just move the read pointer to mtd->size + mtd->oobsize - 32 and
>>>> retrieve the missing 32 bytes?
>>>
>>> 1) raw read can read out the whole page data 2k+64B, decided by the
>>> len in the controller raw read command:
>>>     cmd = (len & GENMASK(5, 0)) | scrambler | DMA_DIR(dir);
>>> after that, the missing oob bytes(not used) can be copied from
>>> meson_chip->data_buf. so the implementation of
>>> meson_nfc_read_page_raw() is like this if need.
>>>     {
>>>         ......
>>>         meson_nfc_read_page_sub(nand, page, 1);
>>>         meson_nfc_get_data_oob(nand, buf, oob_buf);
>>>         oob_len = (nand->ecc.bytes + 2) * nand->ecc.steps;
>>>         memcpy(oob_buf + oob_len, meson_chip->data_buf + oob_len,
>>> mtd->oobsize - oob_len);
>>>
>>>     }
>>> 2) In ECC mode, the controller can't bring back the missing OOB
>>> bytes. it can read out the user bytes and ecc bytes per
>>> meson_ooblayout_ops define.
>>
>> And then (if oob_required) you can bring the missing bytes with
>> something along:
>> nand_change_read_column_op(chip, mtd->writesize + oob_len,
>>                oob_buf + oob_len,
>>                mtd->oobsize - oob_len,
>>                false);
>> Should not be a huge performance hit.
>
> After finishing ECC mode reading, the column address internal in NAND
> device should be the right pos; it doesn't need to change the column
> again. so adding controller raw read for the missing bytes after ECC
> reading may works.
>
use raw read for the missing bytes, but they are not protected by host
ECC. to the NAND type of storage, is it ok or missing bytes better to be
filled with 0xff?

>>
>>>
>>>>
>>>> This applies to the two modes, the only difference is:
>>>> - with correction (commonly named "ECC mode"): the user bytes and ECC
>>>>     bytes should be fixed if there are any bitflips
>>>> - without correction (commonly referred as "raw mode"): no correction
>>>>     applies, if there are bitflips, give them
>>>>
>>>> Please mind the raw mode can be slow, it's meant for debugging and
>>>> testing, mainly. Page reads however should be fast, so if just moving
>>>> the column pointer works, then do it, otherwise we'll consider
>>>> returning FFs.
>>>>
>>>> Thanks,
>>>> Miquèl
>>>
>>
>>
>> Thanks,
>> Miquèl
>>
>

--
Thanks,
Liang

2023-04-13 08:36:13

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

[email protected] wrote on Thu, 13 Apr 2023 10:00:24 +0300:

> On 13.04.2023 09:11, Liang Yang wrote:
> >
> > On 2023/4/13 13:32, Liang Yang wrote:
> >> Hi Miquel,
> >>
> >> On 2023/4/12 22:32, Miquel Raynal wrote:
> >>> [ EXTERNAL EMAIL ]
> >>>
> >>> Hello,
> >>>
> >>> [email protected] wrote on Wed, 12 Apr 2023 22:04:28 +0800:
> >>>
> >>>> Hi Miquel and Arseniy,
> >>>>
> >>>> On 2023/4/12 20:57, Miquel Raynal wrote:
> >>>>> [ EXTERNAL EMAIL ]
> >>>>>
> >>>>> Hi Arseniy,
> >>>>>
> >>>>> [email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:
> >>>>>> On 12.04.2023 15:18, Miquel Raynal wrote:
> >>>>>>> Hi Arseniy,
> >>>>>>>
> >>>>>>> [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
> >>>>>>>     >>>> On 12.04.2023 12:36, Miquel Raynal wrote:
> >>>>>>>>> Hi Arseniy,
> >>>>>>>>>
> >>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
> >>>>>>>>>       >>>>>> On 12.04.2023 10:44, Miquel Raynal wrote:
> >>>>>>>>>>> Hi Arseniy,
> >>>>>>>>>>>
> >>>>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
> >>>>>>>>>>>         >>>>>>>> This NAND reads only few user's bytes in ECC mode (not full OOB), so
> >>>>>>>>>>>
> >>>>>>>>>>> "This NAND reads" does not look right, do you mean "Subpage reads do
> >>>>>>>>>>> not retrieve all the OOB bytes,"?
> >>>>>>>>>>>         >>>>>>>> fill OOB buffer with zeroes to not return garbage from previous reads
> >>>>>>>>>>>> to user.
> >>>>>>>>>>>> Otherwise 'nanddump' utility prints something like this for just erased
> >>>>>>>>>>>> page:
> >>>>>>>>>>>>
> >>>>>>>>>>>> ...
> >>>>>>>>>>>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> >>>>>>>>>>>>     OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> >>>>>>>>>>>>     OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> >>>>>>>>>>>>     OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> >>>>>>>>>>>>     OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
> >>>>>>>>>>>>
> >>>>>>>>>>>> Signed-off-by: Arseniy Krasnov <[email protected]>
> >>>>>>>>>>>> ---
> >>>>>>>>>>>>    drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> >>>>>>>>>>>>    1 file changed, 5 insertions(+)
> >>>>>>>>>>>>
> >>>>>>>>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> >>>>>>>>>>>> index f84a10238e4d..f2f2472cb511 100644
> >>>>>>>>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
> >>>>>>>>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
> >>>>>>>>>>>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> >>>>>>>>>>>>    static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> >>>>>>>>>>>>                       int oob_required, int page)
> >>>>>>>>>>>>    {
> >>>>>>>>>>>> +    struct mtd_info *mtd = nand_to_mtd(nand);
> >>>>>>>>>>>>        u8 *oob_buf = nand->oob_poi;
> >>>>>>>>>>>>        int ret;
> >>>>>>>>>>>>    >>>>>>>> +    memset(oob_buf, 0, mtd->oobsize);
> >>>>>>>>>>>
> >>>>>>>>>>> I'm surprised raw reads do not read the entire OOB?
> >>>>>>>>>>
> >>>>>>>>>> Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
> >>>>>>>>>> still depends on ECC parameters: for each portion of data covered with ECC code we can
> >>>>>>>>>> read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
> >>>>>>>>>> printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
> >>>>>>>>>> For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
> >>>>>>>>>> contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
> >>>>>>>>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
> >>>>>>>>>
> >>>>>>>>> In all modes, when you read OOB, you should get the full OOB. The fact
> >>>>>>>>> that ECC correction is enabled or disabled does not matter. If the NAND
> >>>>>>>>> features OOB sections of 64 bytes, you should get the 64 bytes.
> >>>>>>>>>
> >>>>>>>>> What happens sometimes, is that some of the bytes are not protected
> >>>>>>>>> against bitflips, but the policy is to return the full buffer.
> >>>>>>>>
> >>>>>>>> Ok, so to clarify case for this NAND controller:
> >>>>>>>> 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
> >>>>>>>>      + ECC codes)?
> >>>>>>>
> >>>>>>> Well, you need to cover the same amount of data, yes. But in the ECC
> >>>>>>> case the data won't be raw (at least not all of it).
> >>>>>>
> >>>>>> So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
> >>>>>> OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
> >>>>>> behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?
> >>>>>
> >>>>> If the page are 2k+64B you should read 2k+64B when OOB are requested.
> >>>>>
> >>>>> If the controller only returns 2k+32B, then perform a random read to
> >>>>> just move the read pointer to mtd->size + mtd->oobsize - 32 and
> >>>>> retrieve the missing 32 bytes?
> >>>>
> >>>> 1) raw read can read out the whole page data 2k+64B, decided by the len in the controller raw read command:
> >>>>     cmd = (len & GENMASK(5, 0)) | scrambler | DMA_DIR(dir);
> >>>> after that, the missing oob bytes(not used) can be copied from meson_chip->data_buf. so the implementation of meson_nfc_read_page_raw() is like this if need.
> >>>>     {
> >>>>         ......
> >>>>         meson_nfc_read_page_sub(nand, page, 1);
> >>>>         meson_nfc_get_data_oob(nand, buf, oob_buf);
> >>>>         oob_len = (nand->ecc.bytes + 2) * nand->ecc.steps;
> >>>>         memcpy(oob_buf + oob_len, meson_chip->data_buf + oob_len, mtd->oobsize - oob_len);
> >>>>
> >>>>     }
> >>>> 2) In ECC mode, the controller can't bring back the missing OOB bytes. it can read out the user bytes and ecc bytes per meson_ooblayout_ops define.
> >>>
> >>> And then (if oob_required) you can bring the missing bytes with
> >>> something along:
> >>> nand_change_read_column_op(chip, mtd->writesize + oob_len,
> >>>                oob_buf + oob_len,
> >>>                mtd->oobsize - oob_len,
> >>>                false);
> >>> Should not be a huge performance hit.
> >>
> >> After finishing ECC mode reading, the column address internal in NAND device should be the right pos; it doesn't need to change the column again. so adding controller raw read for the missing bytes after ECC reading may works.
> >>
> > use raw read for the missing bytes, but they are not protected by host ECC. to the NAND type of storage, is it ok or missing bytes better to be filled with 0xff?
>
> IIUC Miquèl's reply, valid behaviour is to return full OOB data in both modes. So in:
> ECC mode it is user bytes(corrected by ECC, read from info buffer) + ECC + missing bytes. ECC and missing bytes read in RAW mode.

I believe the ECC bytes you'll get will be corrected.
You can check this by using nandflipbits in mtd-utils.

> RAW mode it is user bytes(not corrected by ECC) + ECC + missing bytes
>
>
> Also @Liang, is this valid code (drivers/mtd/nand/raw/meson_nand.c)?
>
> ret = nand_check_erased_ecc_chunk(data, ecc->size,
> oob, ecc->bytes + 2,
> NULL, 0,
> ecc->strength);
>
> It confused me, because 'oob' buffer contains both user bytes and ECC code,
> 'ecc->bytes + 2' is 16. May be it should be:
>
> ret = nand_check_erased_ecc_chunk(data, ecc->size,
> oob + 2, ecc->bytes,
> NULL, 0,
> ecc->strength);

When you check for an erased chunk you should probably check the whole
OOB area.

>
> For example let's look on Tegra's driver (drivers/mtd/nand/raw/tegra_nand.c):
>
> u8 *oob = chip->oob_poi + nand->ecc.offset +
> (chip->ecc.bytes * bit);
>
> ret = nand_check_erased_ecc_chunk(data, chip->ecc.size,
> oob, chip->ecc.bytes,
> NULL, 0,
> chip->ecc.strength);
>
> 'oob' contains 'nand->ecc.offset', and ECC length does not account user bytes length
> (e.g. 2) - it is just 'chip->ecc.bytes'

I haven't looked carefully, but be aware there are two user bytes
reserved at the beginning of the OOB area for marking bad blocks. There
*may* be a confusion somewhere. I am not saying there is one, just a
hint if you can't find an explanation.


Thanks,
Miquèl

2023-04-13 08:36:52

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Liang,

[email protected] wrote on Thu, 13 Apr 2023 13:32:03 +0800:

> Hi Miquel,
>
> On 2023/4/12 22:32, Miquel Raynal wrote:
> > [ EXTERNAL EMAIL ]
> >
> > Hello,
> >
> > [email protected] wrote on Wed, 12 Apr 2023 22:04:28 +0800:
> >
> >> Hi Miquel and Arseniy,
> >>
> >> On 2023/4/12 20:57, Miquel Raynal wrote:
> >>> [ EXTERNAL EMAIL ]
> >>>
> >>> Hi Arseniy,
> >>>
> >>> [email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:
> >>> >>>> On 12.04.2023 15:18, Miquel Raynal wrote:
> >>>>> Hi Arseniy,
> >>>>>
> >>>>> [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
> >>>>> >>>> On 12.04.2023 12:36, Miquel Raynal wrote:
> >>>>>>> Hi Arseniy,
> >>>>>>>
> >>>>>>> [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
> >>>>>>> >>>>>> On 12.04.2023 10:44, Miquel Raynal wrote:
> >>>>>>>>> Hi Arseniy,
> >>>>>>>>>
> >>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
> >>>>>>>>> >>>>>>>> This NAND reads only few user's bytes in ECC mode (not full OOB), so
> >>>>>>>>>
> >>>>>>>>> "This NAND reads" does not look right, do you mean "Subpage reads do
> >>>>>>>>> not retrieve all the OOB bytes,"?
> >>>>>>>>> >>>>>>>> fill OOB buffer with zeroes to not return garbage from previous reads
> >>>>>>>>>> to user.
> >>>>>>>>>> Otherwise 'nanddump' utility prints something like this for just erased
> >>>>>>>>>> page:
> >>>>>>>>>>
> >>>>>>>>>> ...
> >>>>>>>>>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> >>>>>>>>>> OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> >>>>>>>>>> OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> >>>>>>>>>> OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> >>>>>>>>>> OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Arseniy Krasnov <[email protected]>
> >>>>>>>>>> ---
> >>>>>>>>>> drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> >>>>>>>>>> 1 file changed, 5 insertions(+)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> >>>>>>>>>> index f84a10238e4d..f2f2472cb511 100644
> >>>>>>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
> >>>>>>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
> >>>>>>>>>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> >>>>>>>>>> static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> >>>>>>>>>> int oob_required, int page)
> >>>>>>>>>> {
> >>>>>>>>>> + struct mtd_info *mtd = nand_to_mtd(nand);
> >>>>>>>>>> u8 *oob_buf = nand->oob_poi;
> >>>>>>>>>> int ret;
> >>>>>>>>>> >>>>>>>> + memset(oob_buf, 0, mtd->oobsize);
> >>>>>>>>>
> >>>>>>>>> I'm surprised raw reads do not read the entire OOB?
> >>>>>>>>
> >>>>>>>> Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
> >>>>>>>> still depends on ECC parameters: for each portion of data covered with ECC code we can
> >>>>>>>> read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
> >>>>>>>> printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
> >>>>>>>> For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
> >>>>>>>> contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
> >>>>>>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
> >>>>>>>
> >>>>>>> In all modes, when you read OOB, you should get the full OOB. The fact
> >>>>>>> that ECC correction is enabled or disabled does not matter. If the NAND
> >>>>>>> features OOB sections of 64 bytes, you should get the 64 bytes.
> >>>>>>>
> >>>>>>> What happens sometimes, is that some of the bytes are not protected
> >>>>>>> against bitflips, but the policy is to return the full buffer.
> >>>>>>
> >>>>>> Ok, so to clarify case for this NAND controller:
> >>>>>> 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
> >>>>>> + ECC codes)?
> >>>>>
> >>>>> Well, you need to cover the same amount of data, yes. But in the ECC
> >>>>> case the data won't be raw (at least not all of it).
> >>>>
> >>>> So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
> >>>> OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
> >>>> behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?
> >>>
> >>> If the page are 2k+64B you should read 2k+64B when OOB are requested.
> >>>
> >>> If the controller only returns 2k+32B, then perform a random read to
> >>> just move the read pointer to mtd->size + mtd->oobsize - 32 and
> >>> retrieve the missing 32 bytes?
> >>
> >> 1) raw read can read out the whole page data 2k+64B, decided by the len in the controller raw read command:
> >> cmd = (len & GENMASK(5, 0)) | scrambler | DMA_DIR(dir);
> >> after that, the missing oob bytes(not used) can be copied from meson_chip->data_buf. so the implementation of meson_nfc_read_page_raw() is like this if need.
> >> {
> >> ......
> >> meson_nfc_read_page_sub(nand, page, 1);
> >> meson_nfc_get_data_oob(nand, buf, oob_buf);
> >> oob_len = (nand->ecc.bytes + 2) * nand->ecc.steps;
> >> memcpy(oob_buf + oob_len, meson_chip->data_buf + oob_len, mtd->oobsize - oob_len);
> >>
> >> }
> >> 2) In ECC mode, the controller can't bring back the missing OOB bytes. it can read out the user bytes and ecc bytes per meson_ooblayout_ops define.
> >
> > And then (if oob_required) you can bring the missing bytes with
> > something along:
> > nand_change_read_column_op(chip, mtd->writesize + oob_len,
> > oob_buf + oob_len,
> > mtd->oobsize - oob_len,
> > false);
> > Should not be a huge performance hit.
>
> After finishing ECC mode reading, the column address internal in NAND device should be the right pos; it doesn't need to change the column again. so adding controller raw read for the missing bytes after ECC reading may works.

Yes, if the last byte read is the one right before the "missing" bytes,
then that will work.

Thanks,
Miquèl

2023-04-13 09:30:21

by Dmitry Rokosov

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

On Wed, Apr 12, 2023 at 10:56:03PM +0200, Miquel Raynal wrote:
> Hi Dmitry,
>
> [email protected] wrote on Wed, 12 Apr 2023 22:15:48 +0300:
>
> > On Wed, Apr 12, 2023 at 10:04:28PM +0800, Liang Yang wrote:
> > > Hi Miquel and Arseniy,
> > >
> > > On 2023/4/12 20:57, Miquel Raynal wrote:
> > > > [ EXTERNAL EMAIL ]
> > > >
> > > > Hi Arseniy,
> > > >
> > > > [email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:
> > > >
> > > > > On 12.04.2023 15:18, Miquel Raynal wrote:
> > > > > > Hi Arseniy,
> > > > > >
> > > > > > [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
> > > > > > > On 12.04.2023 12:36, Miquel Raynal wrote:
> > > > > > > > Hi Arseniy,
> > > > > > > >
> > > > > > > > [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
> > > > > > > > > On 12.04.2023 10:44, Miquel Raynal wrote:
> > > > > > > > > > Hi Arseniy,
> > > > > > > > > >
> > > > > > > > > > [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
> > > > > > > > > > > This NAND reads only few user's bytes in ECC mode (not full OOB), so
> > > > > > > > > >
> > > > > > > > > > "This NAND reads" does not look right, do you mean "Subpage reads do
> > > > > > > > > > not retrieve all the OOB bytes,"?
> > > > > > > > > > > fill OOB buffer with zeroes to not return garbage from previous reads
> > > > > > > > > > > to user.
> > > > > > > > > > > Otherwise 'nanddump' utility prints something like this for just erased
> > > > > > > > > > > page:
> > > > > > > > > > >
> > > > > > > > > > > ...
> > > > > > > > > > > 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > > > > > > > > > > OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> > > > > > > > > > > OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> > > > > > > > > > > OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> > > > > > > > > > > OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Arseniy Krasnov <[email protected]>
> > > > > > > > > > > ---
> > > > > > > > > > > drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> > > > > > > > > > > 1 file changed, 5 insertions(+)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > > > index f84a10238e4d..f2f2472cb511 100644
> > > > > > > > > > > --- a/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > > > +++ b/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > > > @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> > > > > > > > > > > static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> > > > > > > > > > > int oob_required, int page)
> > > > > > > > > > > {
> > > > > > > > > > > + struct mtd_info *mtd = nand_to_mtd(nand);
> > > > > > > > > > > u8 *oob_buf = nand->oob_poi;
> > > > > > > > > > > int ret;
> > > > > > > > > > > + memset(oob_buf, 0, mtd->oobsize);
> > > > > > > > > >
> > > > > > > > > > I'm surprised raw reads do not read the entire OOB?
> > > > > > > > >
> > > > > > > > > Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
> > > > > > > > > still depends on ECC parameters: for each portion of data covered with ECC code we can
> > > > > > > > > read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
> > > > > > > > > printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
> > > > > > > > > For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
> > > > > > > > > contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
> > > > > > > > > controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
> > > > > > > >
> > > > > > > > In all modes, when you read OOB, you should get the full OOB. The fact
> > > > > > > > that ECC correction is enabled or disabled does not matter. If the NAND
> > > > > > > > features OOB sections of 64 bytes, you should get the 64 bytes.
> > > > > > > >
> > > > > > > > What happens sometimes, is that some of the bytes are not protected
> > > > > > > > against bitflips, but the policy is to return the full buffer.
> > > > > > >
> > > > > > > Ok, so to clarify case for this NAND controller:
> > > > > > > 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
> > > > > > > + ECC codes)?
> > > > > >
> > > > > > Well, you need to cover the same amount of data, yes. But in the ECC
> > > > > > case the data won't be raw (at least not all of it).
> > > > >
> > > > > So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
> > > > > OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
> > > > > behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?
> > > >
> > > > If the page are 2k+64B you should read 2k+64B when OOB are requested.
> > > >
> > > > If the controller only returns 2k+32B, then perform a random read to
> > > > just move the read pointer to mtd->size + mtd->oobsize - 32 and
> > > > retrieve the missing 32 bytes?
> > >
> > > 1) raw read can read out the whole page data 2k+64B, decided by the len in
> > > the controller raw read command:
> > > cmd = (len & GENMASK(5, 0)) | scrambler | DMA_DIR(dir);
> > > after that, the missing oob bytes(not used) can be copied from
> > > meson_chip->data_buf. so the implementation of meson_nfc_read_page_raw() is
> > > like this if need.
> > > {
> > > ......
> > > meson_nfc_read_page_sub(nand, page, 1);
> > > meson_nfc_get_data_oob(nand, buf, oob_buf);
> > > oob_len = (nand->ecc.bytes + 2) * nand->ecc.steps;
> > > memcpy(oob_buf + oob_len, meson_chip->data_buf + oob_len, mtd->oobsize -
> > > oob_len);
> > >
> > > }
> > > 2) In ECC mode, the controller can't bring back the missing OOB bytes. it
> > > can read out the user bytes and ecc bytes per meson_ooblayout_ops define.
> > >
> >
> > How does the Meson controller know the actual NAND flash layout when the
> > OOB is split into protected and unprotected areas, such as Free and ECC
> > areas? If the controller has a static OOB layout, where is the mapping
> > located?
>
> It's usually a set of values hardcoded in the driver. It's a per
> geometry set.
>

Sorry, I'm still confused. Before I developed spinand drivers, the OOB layout
was located on the flash driver side.
Do you mean if the OOB geometry in the rawnand subsystem is under the
responsibility of the controller driver?

> >
> > > >
> > > > This applies to the two modes, the only difference is:
> > > > - with correction (commonly named "ECC mode"): the user bytes and ECC
> > > > bytes should be fixed if there are any bitflips
> > > > - without correction (commonly referred as "raw mode"): no correction
> > > > applies, if there are bitflips, give them
> > > >
> > > > Please mind the raw mode can be slow, it's meant for debugging and
> > > > testing, mainly. Page reads however should be fast, so if just moving
> > > > the column pointer works, then do it, otherwise we'll consider
> > > > returning FFs.
> >

--
Thank you,
Dmitry

2023-04-13 10:29:38

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

[email protected] wrote on Thu, 13 Apr 2023 12:36:24 +0300:

> On 13.04.2023 11:22, Miquel Raynal wrote:
> > Hi Arseniy,
> >
> > [email protected] wrote on Thu, 13 Apr 2023 10:00:24 +0300:
> >
> >> On 13.04.2023 09:11, Liang Yang wrote:
> >>>
> >>> On 2023/4/13 13:32, Liang Yang wrote:
> >>>> Hi Miquel,
> >>>>
> >>>> On 2023/4/12 22:32, Miquel Raynal wrote:
> >>>>> [ EXTERNAL EMAIL ]
> >>>>>
> >>>>> Hello,
> >>>>>
> >>>>> [email protected] wrote on Wed, 12 Apr 2023 22:04:28 +0800:
> >>>>>
> >>>>>> Hi Miquel and Arseniy,
> >>>>>>
> >>>>>> On 2023/4/12 20:57, Miquel Raynal wrote:
> >>>>>>> [ EXTERNAL EMAIL ]
> >>>>>>>
> >>>>>>> Hi Arseniy,
> >>>>>>>
> >>>>>>> [email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:
> >>>>>>>> On 12.04.2023 15:18, Miquel Raynal wrote:
> >>>>>>>>> Hi Arseniy,
> >>>>>>>>>
> >>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
> >>>>>>>>>     >>>> On 12.04.2023 12:36, Miquel Raynal wrote:
> >>>>>>>>>>> Hi Arseniy,
> >>>>>>>>>>>
> >>>>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
> >>>>>>>>>>>       >>>>>> On 12.04.2023 10:44, Miquel Raynal wrote:
> >>>>>>>>>>>>> Hi Arseniy,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
> >>>>>>>>>>>>>         >>>>>>>> This NAND reads only few user's bytes in ECC mode (not full OOB), so
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> "This NAND reads" does not look right, do you mean "Subpage reads do
> >>>>>>>>>>>>> not retrieve all the OOB bytes,"?
> >>>>>>>>>>>>>         >>>>>>>> fill OOB buffer with zeroes to not return garbage from previous reads
> >>>>>>>>>>>>>> to user.
> >>>>>>>>>>>>>> Otherwise 'nanddump' utility prints something like this for just erased
> >>>>>>>>>>>>>> page:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ...
> >>>>>>>>>>>>>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> >>>>>>>>>>>>>>     OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> >>>>>>>>>>>>>>     OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> >>>>>>>>>>>>>>     OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> >>>>>>>>>>>>>>     OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Signed-off-by: Arseniy Krasnov <[email protected]>
> >>>>>>>>>>>>>> ---
> >>>>>>>>>>>>>>    drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> >>>>>>>>>>>>>>    1 file changed, 5 insertions(+)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> >>>>>>>>>>>>>> index f84a10238e4d..f2f2472cb511 100644
> >>>>>>>>>>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
> >>>>>>>>>>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
> >>>>>>>>>>>>>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> >>>>>>>>>>>>>>    static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> >>>>>>>>>>>>>>                       int oob_required, int page)
> >>>>>>>>>>>>>>    {
> >>>>>>>>>>>>>> +    struct mtd_info *mtd = nand_to_mtd(nand);
> >>>>>>>>>>>>>>        u8 *oob_buf = nand->oob_poi;
> >>>>>>>>>>>>>>        int ret;
> >>>>>>>>>>>>>>    >>>>>>>> +    memset(oob_buf, 0, mtd->oobsize);
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'm surprised raw reads do not read the entire OOB?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
> >>>>>>>>>>>> still depends on ECC parameters: for each portion of data covered with ECC code we can
> >>>>>>>>>>>> read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
> >>>>>>>>>>>> printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
> >>>>>>>>>>>> For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
> >>>>>>>>>>>> contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
> >>>>>>>>>>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
> >>>>>>>>>>>
> >>>>>>>>>>> In all modes, when you read OOB, you should get the full OOB. The fact
> >>>>>>>>>>> that ECC correction is enabled or disabled does not matter. If the NAND
> >>>>>>>>>>> features OOB sections of 64 bytes, you should get the 64 bytes.
> >>>>>>>>>>>
> >>>>>>>>>>> What happens sometimes, is that some of the bytes are not protected
> >>>>>>>>>>> against bitflips, but the policy is to return the full buffer.
> >>>>>>>>>>
> >>>>>>>>>> Ok, so to clarify case for this NAND controller:
> >>>>>>>>>> 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
> >>>>>>>>>>      + ECC codes)?
> >>>>>>>>>
> >>>>>>>>> Well, you need to cover the same amount of data, yes. But in the ECC
> >>>>>>>>> case the data won't be raw (at least not all of it).
> >>>>>>>>
> >>>>>>>> So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
> >>>>>>>> OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
> >>>>>>>> behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?
> >>>>>>>
> >>>>>>> If the page are 2k+64B you should read 2k+64B when OOB are requested.
> >>>>>>>
> >>>>>>> If the controller only returns 2k+32B, then perform a random read to
> >>>>>>> just move the read pointer to mtd->size + mtd->oobsize - 32 and
> >>>>>>> retrieve the missing 32 bytes?
> >>>>>>
> >>>>>> 1) raw read can read out the whole page data 2k+64B, decided by the len in the controller raw read command:
> >>>>>>     cmd = (len & GENMASK(5, 0)) | scrambler | DMA_DIR(dir);
> >>>>>> after that, the missing oob bytes(not used) can be copied from meson_chip->data_buf. so the implementation of meson_nfc_read_page_raw() is like this if need.
> >>>>>>     {
> >>>>>>         ......
> >>>>>>         meson_nfc_read_page_sub(nand, page, 1);
> >>>>>>         meson_nfc_get_data_oob(nand, buf, oob_buf);
> >>>>>>         oob_len = (nand->ecc.bytes + 2) * nand->ecc.steps;
> >>>>>>         memcpy(oob_buf + oob_len, meson_chip->data_buf + oob_len, mtd->oobsize - oob_len);
> >>>>>>
> >>>>>>     }
> >>>>>> 2) In ECC mode, the controller can't bring back the missing OOB bytes. it can read out the user bytes and ecc bytes per meson_ooblayout_ops define.
> >>>>>
> >>>>> And then (if oob_required) you can bring the missing bytes with
> >>>>> something along:
> >>>>> nand_change_read_column_op(chip, mtd->writesize + oob_len,
> >>>>>                oob_buf + oob_len,
> >>>>>                mtd->oobsize - oob_len,
> >>>>>                false);
> >>>>> Should not be a huge performance hit.
> >>>>
> >>>> After finishing ECC mode reading, the column address internal in NAND device should be the right pos; it doesn't need to change the column again. so adding controller raw read for the missing bytes after ECC reading may works.
> >>>>
> >>> use raw read for the missing bytes, but they are not protected by host ECC. to the NAND type of storage, is it ok or missing bytes better to be filled with 0xff?
> >>
> >> IIUC Miquèl's reply, valid behaviour is to return full OOB data in both modes. So in:
> >> ECC mode it is user bytes(corrected by ECC, read from info buffer) + ECC + missing bytes. ECC and missing bytes read in RAW mode.
> >
> > I believe the ECC bytes you'll get will be corrected.
> > You can check this by using nandflipbits in mtd-utils.
>
> Sorry, didn't get it, i'm new in NAND area. Bytes of ECC codes are available only in raw mode (at least in this NAND
> driver) also as missing bytes of OOB.

Gasp. Yeah that's a controller limitation, okay.

> So IIUC ECC codes are metadata to correct data bytes, and thus
> couldn't be corrected.

We consider them metadata, but they are fully part of the ECC scheme
and thus their correction is part of the process, bitflips in the ECC
bytes will count as data bitflips actually.

I talked a bit about ECC engines at a previous conference if it can
help:
https://elinux.org/ELC_Europe_2020_Presentations
'Understand ECC Support for NAND Flash Devices in Linux'
And also wrote a blog post with a chapter about ECC engines:
https://bootlin.com/blog/supporting-a-misbehaving-nand-ecc-engine/

Thanks,
Miquèl

2023-04-13 10:35:26

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Dmitry,

[email protected] wrote on Thu, 13 Apr 2023 12:27:06 +0300:

> On Wed, Apr 12, 2023 at 10:56:03PM +0200, Miquel Raynal wrote:
> > Hi Dmitry,
> >
> > [email protected] wrote on Wed, 12 Apr 2023 22:15:48 +0300:
> >
> > > On Wed, Apr 12, 2023 at 10:04:28PM +0800, Liang Yang wrote:
> > > > Hi Miquel and Arseniy,
> > > >
> > > > On 2023/4/12 20:57, Miquel Raynal wrote:
> > > > > [ EXTERNAL EMAIL ]
> > > > >
> > > > > Hi Arseniy,
> > > > >
> > > > > [email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:
> > > > >
> > > > > > On 12.04.2023 15:18, Miquel Raynal wrote:
> > > > > > > Hi Arseniy,
> > > > > > >
> > > > > > > [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
> > > > > > > > On 12.04.2023 12:36, Miquel Raynal wrote:
> > > > > > > > > Hi Arseniy,
> > > > > > > > >
> > > > > > > > > [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
> > > > > > > > > > On 12.04.2023 10:44, Miquel Raynal wrote:
> > > > > > > > > > > Hi Arseniy,
> > > > > > > > > > >
> > > > > > > > > > > [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
> > > > > > > > > > > > This NAND reads only few user's bytes in ECC mode (not full OOB), so
> > > > > > > > > > >
> > > > > > > > > > > "This NAND reads" does not look right, do you mean "Subpage reads do
> > > > > > > > > > > not retrieve all the OOB bytes,"?
> > > > > > > > > > > > fill OOB buffer with zeroes to not return garbage from previous reads
> > > > > > > > > > > > to user.
> > > > > > > > > > > > Otherwise 'nanddump' utility prints something like this for just erased
> > > > > > > > > > > > page:
> > > > > > > > > > > >
> > > > > > > > > > > > ...
> > > > > > > > > > > > 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > > > > > > > > > > > OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> > > > > > > > > > > > OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> > > > > > > > > > > > OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> > > > > > > > > > > > OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
> > > > > > > > > > > >
> > > > > > > > > > > > Signed-off-by: Arseniy Krasnov <[email protected]>
> > > > > > > > > > > > ---
> > > > > > > > > > > > drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> > > > > > > > > > > > 1 file changed, 5 insertions(+)
> > > > > > > > > > > >
> > > > > > > > > > > > diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > > > > index f84a10238e4d..f2f2472cb511 100644
> > > > > > > > > > > > --- a/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > > > > +++ b/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > > > > @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> > > > > > > > > > > > static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> > > > > > > > > > > > int oob_required, int page)
> > > > > > > > > > > > {
> > > > > > > > > > > > + struct mtd_info *mtd = nand_to_mtd(nand);
> > > > > > > > > > > > u8 *oob_buf = nand->oob_poi;
> > > > > > > > > > > > int ret;
> > > > > > > > > > > > + memset(oob_buf, 0, mtd->oobsize);
> > > > > > > > > > >
> > > > > > > > > > > I'm surprised raw reads do not read the entire OOB?
> > > > > > > > > >
> > > > > > > > > > Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
> > > > > > > > > > still depends on ECC parameters: for each portion of data covered with ECC code we can
> > > > > > > > > > read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
> > > > > > > > > > printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
> > > > > > > > > > For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
> > > > > > > > > > contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
> > > > > > > > > > controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
> > > > > > > > >
> > > > > > > > > In all modes, when you read OOB, you should get the full OOB. The fact
> > > > > > > > > that ECC correction is enabled or disabled does not matter. If the NAND
> > > > > > > > > features OOB sections of 64 bytes, you should get the 64 bytes.
> > > > > > > > >
> > > > > > > > > What happens sometimes, is that some of the bytes are not protected
> > > > > > > > > against bitflips, but the policy is to return the full buffer.
> > > > > > > >
> > > > > > > > Ok, so to clarify case for this NAND controller:
> > > > > > > > 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
> > > > > > > > + ECC codes)?
> > > > > > >
> > > > > > > Well, you need to cover the same amount of data, yes. But in the ECC
> > > > > > > case the data won't be raw (at least not all of it).
> > > > > >
> > > > > > So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
> > > > > > OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
> > > > > > behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?
> > > > >
> > > > > If the page are 2k+64B you should read 2k+64B when OOB are requested.
> > > > >
> > > > > If the controller only returns 2k+32B, then perform a random read to
> > > > > just move the read pointer to mtd->size + mtd->oobsize - 32 and
> > > > > retrieve the missing 32 bytes?
> > > >
> > > > 1) raw read can read out the whole page data 2k+64B, decided by the len in
> > > > the controller raw read command:
> > > > cmd = (len & GENMASK(5, 0)) | scrambler | DMA_DIR(dir);
> > > > after that, the missing oob bytes(not used) can be copied from
> > > > meson_chip->data_buf. so the implementation of meson_nfc_read_page_raw() is
> > > > like this if need.
> > > > {
> > > > ......
> > > > meson_nfc_read_page_sub(nand, page, 1);
> > > > meson_nfc_get_data_oob(nand, buf, oob_buf);
> > > > oob_len = (nand->ecc.bytes + 2) * nand->ecc.steps;
> > > > memcpy(oob_buf + oob_len, meson_chip->data_buf + oob_len, mtd->oobsize -
> > > > oob_len);
> > > >
> > > > }
> > > > 2) In ECC mode, the controller can't bring back the missing OOB bytes. it
> > > > can read out the user bytes and ecc bytes per meson_ooblayout_ops define.
> > > >
> > >
> > > How does the Meson controller know the actual NAND flash layout when the
> > > OOB is split into protected and unprotected areas, such as Free and ECC
> > > areas? If the controller has a static OOB layout, where is the mapping
> > > located?
> >
> > It's usually a set of values hardcoded in the driver. It's a per
> > geometry set.
> >
>
> Sorry, I'm still confused. Before I developed spinand drivers, the OOB layout
> was located on the flash driver side.

The spinand subsystem is different, most of the chips have on-die ECC,
which means the ECC layout is most of the time per-chip.

In the raw NAND layer, most of the time the ECC engine is merged with
the NAND controller and thus the list of available layouts depend on
the controller. But these layouts often adapt to the NAND chip
geometry, so you can only now which layout to use after identifying the
chip (there is an ->attach_chip() hook to handle things related to
geometry after controller base initialization.

> Do you mean if the OOB geometry in the rawnand subsystem is under the
> responsibility of the controller driver?

The geometry is NAND chip specific.
The OOB layout depends on the chip geometry, the ECC engine
capabilities and configuration (like the strength).
The discovery of the geometry is performed by the core (using raw NAND
controller driver callbacks of course).
If on-host ECC engine is picked (the default in the raw NAND
subsystem), then the controller driver picks the correct OOB layout and
refuses to probe otherwise.

Hope its clearer now :)

Thanks,
Miquèl

2023-04-13 14:13:50

by Dmitry Rokosov

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

On Thu, Apr 13, 2023 at 12:29:16PM +0200, Miquel Raynal wrote:
> Hi Dmitry,
>
> [email protected] wrote on Thu, 13 Apr 2023 12:27:06 +0300:
>
> > On Wed, Apr 12, 2023 at 10:56:03PM +0200, Miquel Raynal wrote:
> > > Hi Dmitry,
> > >
> > > [email protected] wrote on Wed, 12 Apr 2023 22:15:48 +0300:
> > >
> > > > On Wed, Apr 12, 2023 at 10:04:28PM +0800, Liang Yang wrote:
> > > > > Hi Miquel and Arseniy,
> > > > >
> > > > > On 2023/4/12 20:57, Miquel Raynal wrote:
> > > > > > [ EXTERNAL EMAIL ]
> > > > > >
> > > > > > Hi Arseniy,
> > > > > >
> > > > > > [email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:
> > > > > >
> > > > > > > On 12.04.2023 15:18, Miquel Raynal wrote:
> > > > > > > > Hi Arseniy,
> > > > > > > >
> > > > > > > > [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
> > > > > > > > > On 12.04.2023 12:36, Miquel Raynal wrote:
> > > > > > > > > > Hi Arseniy,
> > > > > > > > > >
> > > > > > > > > > [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
> > > > > > > > > > > On 12.04.2023 10:44, Miquel Raynal wrote:
> > > > > > > > > > > > Hi Arseniy,
> > > > > > > > > > > >
> > > > > > > > > > > > [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
> > > > > > > > > > > > > This NAND reads only few user's bytes in ECC mode (not full OOB), so
> > > > > > > > > > > >
> > > > > > > > > > > > "This NAND reads" does not look right, do you mean "Subpage reads do
> > > > > > > > > > > > not retrieve all the OOB bytes,"?
> > > > > > > > > > > > > fill OOB buffer with zeroes to not return garbage from previous reads
> > > > > > > > > > > > > to user.
> > > > > > > > > > > > > Otherwise 'nanddump' utility prints something like this for just erased
> > > > > > > > > > > > > page:
> > > > > > > > > > > > >
> > > > > > > > > > > > > ...
> > > > > > > > > > > > > 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> > > > > > > > > > > > > OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
> > > > > > > > > > > > > OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
> > > > > > > > > > > > > OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
> > > > > > > > > > > > > OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Arseniy Krasnov <[email protected]>
> > > > > > > > > > > > > ---
> > > > > > > > > > > > > drivers/mtd/nand/raw/meson_nand.c | 5 +++++
> > > > > > > > > > > > > 1 file changed, 5 insertions(+)
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > > > > > index f84a10238e4d..f2f2472cb511 100644
> > > > > > > > > > > > > --- a/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > > > > > +++ b/drivers/mtd/nand/raw/meson_nand.c
> > > > > > > > > > > > > @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
> > > > > > > > > > > > > static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
> > > > > > > > > > > > > int oob_required, int page)
> > > > > > > > > > > > > {
> > > > > > > > > > > > > + struct mtd_info *mtd = nand_to_mtd(nand);
> > > > > > > > > > > > > u8 *oob_buf = nand->oob_poi;
> > > > > > > > > > > > > int ret;
> > > > > > > > > > > > > + memset(oob_buf, 0, mtd->oobsize);
> > > > > > > > > > > >
> > > > > > > > > > > > I'm surprised raw reads do not read the entire OOB?
> > > > > > > > > > >
> > > > > > > > > > > Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
> > > > > > > > > > > still depends on ECC parameters: for each portion of data covered with ECC code we can
> > > > > > > > > > > read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
> > > > > > > > > > > printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
> > > > > > > > > > > For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
> > > > > > > > > > > contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
> > > > > > > > > > > controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
> > > > > > > > > >
> > > > > > > > > > In all modes, when you read OOB, you should get the full OOB. The fact
> > > > > > > > > > that ECC correction is enabled or disabled does not matter. If the NAND
> > > > > > > > > > features OOB sections of 64 bytes, you should get the 64 bytes.
> > > > > > > > > >
> > > > > > > > > > What happens sometimes, is that some of the bytes are not protected
> > > > > > > > > > against bitflips, but the policy is to return the full buffer.
> > > > > > > > >
> > > > > > > > > Ok, so to clarify case for this NAND controller:
> > > > > > > > > 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
> > > > > > > > > + ECC codes)?
> > > > > > > >
> > > > > > > > Well, you need to cover the same amount of data, yes. But in the ECC
> > > > > > > > case the data won't be raw (at least not all of it).
> > > > > > >
> > > > > > > So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
> > > > > > > OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
> > > > > > > behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?
> > > > > >
> > > > > > If the page are 2k+64B you should read 2k+64B when OOB are requested.
> > > > > >
> > > > > > If the controller only returns 2k+32B, then perform a random read to
> > > > > > just move the read pointer to mtd->size + mtd->oobsize - 32 and
> > > > > > retrieve the missing 32 bytes?
> > > > >
> > > > > 1) raw read can read out the whole page data 2k+64B, decided by the len in
> > > > > the controller raw read command:
> > > > > cmd = (len & GENMASK(5, 0)) | scrambler | DMA_DIR(dir);
> > > > > after that, the missing oob bytes(not used) can be copied from
> > > > > meson_chip->data_buf. so the implementation of meson_nfc_read_page_raw() is
> > > > > like this if need.
> > > > > {
> > > > > ......
> > > > > meson_nfc_read_page_sub(nand, page, 1);
> > > > > meson_nfc_get_data_oob(nand, buf, oob_buf);
> > > > > oob_len = (nand->ecc.bytes + 2) * nand->ecc.steps;
> > > > > memcpy(oob_buf + oob_len, meson_chip->data_buf + oob_len, mtd->oobsize -
> > > > > oob_len);
> > > > >
> > > > > }
> > > > > 2) In ECC mode, the controller can't bring back the missing OOB bytes. it
> > > > > can read out the user bytes and ecc bytes per meson_ooblayout_ops define.
> > > > >
> > > >
> > > > How does the Meson controller know the actual NAND flash layout when the
> > > > OOB is split into protected and unprotected areas, such as Free and ECC
> > > > areas? If the controller has a static OOB layout, where is the mapping
> > > > located?
> > >
> > > It's usually a set of values hardcoded in the driver. It's a per
> > > geometry set.
> > >
> >
> > Sorry, I'm still confused. Before I developed spinand drivers, the OOB layout
> > was located on the flash driver side.
>
> The spinand subsystem is different, most of the chips have on-die ECC,
> which means the ECC layout is most of the time per-chip.
>
> In the raw NAND layer, most of the time the ECC engine is merged with
> the NAND controller and thus the list of available layouts depend on
> the controller. But these layouts often adapt to the NAND chip
> geometry, so you can only now which layout to use after identifying the
> chip (there is an ->attach_chip() hook to handle things related to
> geometry after controller base initialization.
>
> > Do you mean if the OOB geometry in the rawnand subsystem is under the
> > responsibility of the controller driver?
>
> The geometry is NAND chip specific.
> The OOB layout depends on the chip geometry, the ECC engine
> capabilities and configuration (like the strength).
> The discovery of the geometry is performed by the core (using raw NAND
> controller driver callbacks of course).
> If on-host ECC engine is picked (the default in the raw NAND
> subsystem), then the controller driver picks the correct OOB layout and
> refuses to probe otherwise.
>
> Hope its clearer now :)

Yeah, pretty clear!

I wanted to express my special thanks to you for sharing such detailed
information about the raw NAND layer! Your insights will be incredibly
helpful for me and will provide valuable context that I would not have
otherwise had.

Thank you so much for taking the time to share your expertise with me!

--
Thank you,
Dmitry

2023-04-18 12:25:22

by Liang Yang

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy and Miquel,

On 2023/4/18 13:12, Arseniy Krasnov wrote:
> [ EXTERNAL EMAIL ]
>
>
>
> On 13.04.2023 13:35, Arseniy Krasnov wrote:
>>
>>
>> On 13.04.2023 13:22, Miquel Raynal wrote:
>>> Hi Arseniy,
>>>
>>> [email protected] wrote on Thu, 13 Apr 2023 12:36:24 +0300:
>>>
>>>> On 13.04.2023 11:22, Miquel Raynal wrote:
>>>>> Hi Arseniy,
>>>>>
>>>>> [email protected] wrote on Thu, 13 Apr 2023 10:00:24 +0300:
>>>>>
>>>>>> On 13.04.2023 09:11, Liang Yang wrote:
>>>>>>>
>>>>>>> On 2023/4/13 13:32, Liang Yang wrote:
>>>>>>>> Hi Miquel,
>>>>>>>>
>>>>>>>> On 2023/4/12 22:32, Miquel Raynal wrote:
>>>>>>>>> [ EXTERNAL EMAIL ]
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 22:04:28 +0800:
>>>>>>>>>
>>>>>>>>>> Hi Miquel and Arseniy,
>>>>>>>>>>
>>>>>>>>>> On 2023/4/12 20:57, Miquel Raynal wrote:
>>>>>>>>>>> [ EXTERNAL EMAIL ]
>>>>>>>>>>>
>>>>>>>>>>> Hi Arseniy,
>>>>>>>>>>>
>>>>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 15:22:26 +0300:
>>>>>>>>>>>> On 12.04.2023 15:18, Miquel Raynal wrote:
>>>>>>>>>>>>> Hi Arseniy,
>>>>>>>>>>>>>
>>>>>>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 13:14:52 +0300:
>>>>>>>>>>>>>     >>>> On 12.04.2023 12:36, Miquel Raynal wrote:
>>>>>>>>>>>>>>> Hi Arseniy,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 12:20:55 +0300:
>>>>>>>>>>>>>>>       >>>>>> On 12.04.2023 10:44, Miquel Raynal wrote:
>>>>>>>>>>>>>>>>> Hi Arseniy,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [email protected] wrote on Wed, 12 Apr 2023 09:16:58 +0300:
>>>>>>>>>>>>>>>>>         >>>>>>>> This NAND reads only few user's bytes in ECC mode (not full OOB), so
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> "This NAND reads" does not look right, do you mean "Subpage reads do
>>>>>>>>>>>>>>>>> not retrieve all the OOB bytes,"?
>>>>>>>>>>>>>>>>>         >>>>>>>> fill OOB buffer with zeroes to not return garbage from previous reads
>>>>>>>>>>>>>>>>>> to user.
>>>>>>>>>>>>>>>>>> Otherwise 'nanddump' utility prints something like this for just erased
>>>>>>>>>>>>>>>>>> page:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>>>> 0x000007f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>>>>>>>>>>>>>>>>>     OOB Data: ff ff ff ff 00 00 ff ff 80 cf 22 99 cb ad d3 be
>>>>>>>>>>>>>>>>>>     OOB Data: 63 27 ae 06 16 0a 2f eb bb dd 46 74 41 8e 88 6e
>>>>>>>>>>>>>>>>>>     OOB Data: 38 a1 2d e6 77 d4 05 06 f2 a5 7e 25 eb 34 7c ff
>>>>>>>>>>>>>>>>>>     OOB Data: 38 ea de 14 10 de 9b 40 33 16 6a cc 9d aa 2f 5e
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Signed-off-by: Arseniy Krasnov <[email protected]>
>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>    drivers/mtd/nand/raw/meson_nand.c | 5 +++++
>>>>>>>>>>>>>>>>>>    1 file changed, 5 insertions(+)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>>>>>>>>>>>> index f84a10238e4d..f2f2472cb511 100644
>>>>>>>>>>>>>>>>>> --- a/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>>>>>>>>>>>> +++ b/drivers/mtd/nand/raw/meson_nand.c
>>>>>>>>>>>>>>>>>> @@ -858,9 +858,12 @@ static int meson_nfc_read_page_sub(struct nand_chip *nand,
>>>>>>>>>>>>>>>>>>    static int meson_nfc_read_page_raw(struct nand_chip *nand, u8 *buf,
>>>>>>>>>>>>>>>>>>                       int oob_required, int page)
>>>>>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>>>>>> +    struct mtd_info *mtd = nand_to_mtd(nand);
>>>>>>>>>>>>>>>>>>        u8 *oob_buf = nand->oob_poi;
>>>>>>>>>>>>>>>>>>        int ret;
>>>>>>>>>>>>>>>>>>    >>>>>>>> +    memset(oob_buf, 0, mtd->oobsize);
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm surprised raw reads do not read the entire OOB?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes! Seems in case of raw access (what i see in this driver) number of OOB bytes read
>>>>>>>>>>>>>>>> still depends on ECC parameters: for each portion of data covered with ECC code we can
>>>>>>>>>>>>>>>> read it's ECC code and "user bytes" from OOB - it is what i see by dumping DMA buffer by
>>>>>>>>>>>>>>>> printk(). For example I'm working with 2K NAND pages, each page has 2 x 1K ECC blocks.
>>>>>>>>>>>>>>>> For each ECC block I have 16 OOB bytes which I can access by read/write. Each 16 bytes
>>>>>>>>>>>>>>>> contains 2 bytes of user's data and 14 bytes ECC codes. So when I read page in raw mode
>>>>>>>>>>>>>>>> controller returns 32 bytes (2 x (2 + 14)) of OOB. While OOB is reported as 64 bytes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In all modes, when you read OOB, you should get the full OOB. The fact
>>>>>>>>>>>>>>> that ECC correction is enabled or disabled does not matter. If the NAND
>>>>>>>>>>>>>>> features OOB sections of 64 bytes, you should get the 64 bytes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What happens sometimes, is that some of the bytes are not protected
>>>>>>>>>>>>>>> against bitflips, but the policy is to return the full buffer.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ok, so to clarify case for this NAND controller:
>>>>>>>>>>>>>> 1) In both ECC and raw modes i need to return the same raw OOB data (e.g. user bytes
>>>>>>>>>>>>>>      + ECC codes)?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Well, you need to cover the same amount of data, yes. But in the ECC
>>>>>>>>>>>>> case the data won't be raw (at least not all of it).
>>>>>>>>>>>>
>>>>>>>>>>>> So "same amount of data", in ECC mode current implementation returns only user OOB bytes (e.g.
>>>>>>>>>>>> OOB data excluding ECC codes), in raw it returns user bytes + ECC codes. IIUC correct
>>>>>>>>>>>> behaviour is to always return user bytes + ECC codes as OOB data even in ECC mode ?
>>>>>>>>>>>
>>>>>>>>>>> If the page are 2k+64B you should read 2k+64B when OOB are requested.
>>>>>>>>>>>
>>>>>>>>>>> If the controller only returns 2k+32B, then perform a random read to
>>>>>>>>>>> just move the read pointer to mtd->size + mtd->oobsize - 32 and
>>>>>>>>>>> retrieve the missing 32 bytes?
>>>>>>>>>>
>>>>>>>>>> 1) raw read can read out the whole page data 2k+64B, decided by the len in the controller raw read command:
>>>>>>>>>>     cmd = (len & GENMASK(5, 0)) | scrambler | DMA_DIR(dir);
>>>>>>>>>> after that, the missing oob bytes(not used) can be copied from meson_chip->data_buf. so the implementation of meson_nfc_read_page_raw() is like this if need.
>>>>>>>>>>     {
>>>>>>>>>>         ......
>>>>>>>>>>         meson_nfc_read_page_sub(nand, page, 1);
>>>>>>>>>>         meson_nfc_get_data_oob(nand, buf, oob_buf);
>>>>>>>>>>         oob_len = (nand->ecc.bytes + 2) * nand->ecc.steps;
>>>>>>>>>>         memcpy(oob_buf + oob_len, meson_chip->data_buf + oob_len, mtd->oobsize - oob_len);
>>>>>>>>>>
>>>>>>>>>>     }
>>>>>>>>>> 2) In ECC mode, the controller can't bring back the missing OOB bytes. it can read out the user bytes and ecc bytes per meson_ooblayout_ops define.
>>>>>>>>>
>>>>>>>>> And then (if oob_required) you can bring the missing bytes with
>>>>>>>>> something along:
>>>>>>>>> nand_change_read_column_op(chip, mtd->writesize + oob_len,
>>>>>>>>>                oob_buf + oob_len,
>>>>>>>>>                mtd->oobsize - oob_len,
>>>>>>>>>                false);
>>>>>>>>> Should not be a huge performance hit.
>>>>>>>>
>>>>>>>> After finishing ECC mode reading, the column address internal in NAND device should be the right pos; it doesn't need to change the column again. so adding controller raw read for the missing bytes after ECC reading may works.
>>>>>>>>
>>>>>>> use raw read for the missing bytes, but they are not protected by host ECC. to the NAND type of storage, is it ok or missing bytes better to be filled with 0xff?
>>>>>>
>>>>>> IIUC Miquèl's reply, valid behaviour is to return full OOB data in both modes. So in:
>>>>>> ECC mode it is user bytes(corrected by ECC, read from info buffer) + ECC + missing bytes. ECC and missing bytes read in RAW mode.
>>>>>
>>>>> I believe the ECC bytes you'll get will be corrected.
>>>>> You can check this by using nandflipbits in mtd-utils.
>>>>
>>>> Sorry, didn't get it, i'm new in NAND area. Bytes of ECC codes are available only in raw mode (at least in this NAND
>>>> driver) also as missing bytes of OOB.
>>>
>>> Gasp. Yeah that's a controller limitation, okay.
>>>
>>>> So IIUC ECC codes are metadata to correct data bytes, and thus
>>>> couldn't be corrected.
>>>
>>> We consider them metadata, but they are fully part of the ECC scheme
>>> and thus their correction is part of the process, bitflips in the ECC
>>> bytes will count as data bitflips actually.
>>>
>>> I talked a bit about ECC engines at a previous conference if it can
>>> help:
>>> https://elinux.org/ELC_Europe_2020_Presentations
>>> 'Understand ECC Support for NAND Flash Devices in Linux'
>>> And also wrote a blog post with a chapter about ECC engines:
>>> https://bootlin.com/blog/supporting-a-misbehaving-nand-ecc-engine/
>>
>>
>> Thanks for this!
>>
>> Thanks, Arseniy
>
> Hello again @Liang @Miquel!
>
> One more question about OOB access, as I can see current driver uses the following
> callbacks:
>
> nand->ecc.write_oob_raw = nand_write_oob_std;
> nand->ecc.write_oob = nand_write_oob_std;
>
>
> Function 'nand_write_oob_std()' writes data to the end of the page. But as I
> can see by dumping 'data_buf' during read, physical layout of each page is the
> following (1KB ECC):
>
> 0x000: [ 1 KB of data ]
> 0x400: [ 2B user data] [ 14B ECC code]
> 0x410: [ 1 KB of data ] (A)
> 0x810: [ 2B user data] [ 14B ECC code]
> 0x820: [ 32B unused ]
>
>
>
> So, after 'nand_write_oob_std()' (let data be sequence from [0x0 ... 0x3f]),
> page will look like this:
>
> 0x000: [ 0xFF ]
> 0x400: [ ........ ]
> 0x7f0: [ 0xFF ]
> 0x800: [ 00 ....................... ]
> 0x830: [ ........................ 3f ]
>
> Here we have two problems:
> 1) Attempt to display raw data by 'nanddump' utility produces a little bit
> invalid output, as driver relies on layout (A) from above. E.g. OOB data
> is at 0x400 and 0x810. Here is an example (attempt to write 0x11 0x22 0x33 0x44):
>
> 0x000007f0: 11 22 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |."..............|
> OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
> OOB Data: 33 44 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |3D..............|
> OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
> OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
>
Hi Arseniy,

I realized the write_oob_raw() and write_oob() are wrong in
meson_nand.c. I suggest both of them should be reworked and follow the
format of meson nand controller. i.e. firstly format the data in Layout
(A) and then write. reading is firstly reading the data of layout (A)
and then compost the layout (B).


>
> 2) Attempt to read data in ECC mode will fail, because IIUC page is in dirty
> state (I mean was written at least once) and NAND controller tries to use
> ECC codes at 0x400 and 0x810, which are obviously broken in this case. Thus

As i said above, write_oob_raw() and write_oob() should be reworked.
i don't know what do you mean page was written at least once. anyway the
page should be written once, even just write_oob_raw().

> we have strange situation: OOB seems written without any errors, but we can't
> read this page. First idea is to write OOB data to 0x400 and 0x810 in raw mode,
> but this does not work - if there is some data, NAND controller will try to
> use ECC engine to check these user bytes on next attempt to read this page. But
> as these 4 bytes were written in raw mode, ECC codes are missed.
>
> We suggest the following thing: use only area at 0x820 for OOB (see A) - it is not covered
> by ECC engine, so write to this zone won't conflict with ECC in future. In this case
> we change 'meson_ooblayout_free()' function which instead of describing 2 user bytes
> for each ECC block, will return 16 tail bytes for each ECC block.
>
> What to You think?

the key point is that the data 0x820-0x840 is not protected by Host
ECC.so i don't think we have to change it.

Thanks,
Liang
>
>
>>
>>>
>>> Thanks,
>>> Miquèl
>

2023-04-18 13:37:10

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

> >> Hello again @Liang @Miquel!
> >>
> >> One more question about OOB access, as I can see current driver uses the following
> >> callbacks:
> >>
> >>     nand->ecc.write_oob_raw = nand_write_oob_std;
> >>     nand->ecc.write_oob = nand_write_oob_std;
> >>
> >>
> >> Function 'nand_write_oob_std()' writes data to the end of the page. But as I
> >> can see by dumping 'data_buf' during read, physical layout of each page is the
> >> following (1KB ECC):
> >>
> >> 0x000: [         1 KB of data        ]
> >> 0x400: [ 2B user data] [ 14B ECC code]
> >> 0x410: [         1 KB of data        ]    (A)
> >> 0x810: [ 2B user data] [ 14B ECC code]
> >> 0x820: [        32B unused           ]
> >>
> >>
> >>
> >> So, after 'nand_write_oob_std()' (let data be sequence from [0x0 ... 0x3f]),
> >> page will look like this:
> >>
> >> 0x000: [             0xFF            ]
> >> 0x400: [           ........          ]
> >> 0x7f0: [             0xFF            ]
> >> 0x800: [ 00 .......................  ]
> >> 0x830: [ ........................ 3f ]
> >>
> >> Here we have two problems:
> >> 1) Attempt to display raw data by 'nanddump' utility produces a little bit
> >>     invalid output, as driver relies on layout (A) from above. E.g. OOB data
> >>     is at 0x400 and 0x810. Here is an example (attempt to write 0x11 0x22 0x33 0x44):
> >>
> >> 0x000007f0: 11 22 ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |."..............|
> >>    OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |................|
> >>    OOB Data: 33 44 ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |3D..............|
> >>    OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |................|
> >>    OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |................|
> >>
> > Hi Arseniy,
> >
> > I realized the write_oob_raw() and write_oob() are wrong in meson_nand.c. I suggest both of them should be reworked and follow the format of meson nand controller. i.e. firstly format the data in Layout (A) and then write. reading is firstly reading the data of layout (A) and then compost the layout (B).
>
> IIUC after such writing only OOB (e.g. user bytes) according layout (A), hw will also write ECC codes, so
> it will be impossible to write data to this page later, because we cannot update ECC codes properly for the newly
> written data (we can't update bits from 0 to 1).
>
> >
> >
> >>
> >> 2) Attempt to read data in ECC mode will fail, because IIUC page is in dirty
> >>     state (I mean was written at least once) and NAND controller tries to use
> >>     ECC codes at 0x400 and 0x810, which are obviously broken in this case. Thus
> >
> > As i said above, write_oob_raw() and write_oob() should be reworked.
> > i don't know what do you mean page was written at least once. anyway the page should be written once, even just write_oob_raw().
>
> Sorry, You mean that after OOB write, we cannot write to the data area (e.g. 0x0 .. 0x810) until page will be erased? For example
> JFFS2 writes to OOB own markers, then it tries to write to the data area of such page.

A page is written after two steps:
- loading the data into the NAND chip cache (that's when you use the
bus)
- programming the NAND array with the data loaded in cache (that's when
you wait)

In theory it does not matter where you write in the cache, it's regular
DRAM, you can make random writes there with the appropriate NAND
commands. Of course when using embedded hardware ECC engines, the
controllers usually expect to be fed in a certain way in order to
produce the ECC bytes and put them at the right location in cache.

And then, when you actually send the "program" command, the NAND cells
actually get programmed based on what has been loaded in cache.

Thanks,
Miquèl

2023-04-19 03:15:48

by Liang Yang

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

On 2023/4/18 22:57, Arseniy Krasnov wrote:
> [ EXTERNAL EMAIL ]
>
>
>
> On 18.04.2023 16:25, Miquel Raynal wrote:
>> Hi Arseniy,
>>
>>>>> Hello again @Liang @Miquel!
>>>>>
>>>>> One more question about OOB access, as I can see current driver uses the following
>>>>> callbacks:
>>>>>
>>>>>     nand->ecc.write_oob_raw = nand_write_oob_std;
>>>>>     nand->ecc.write_oob = nand_write_oob_std;
>>>>>
>>>>>
>>>>> Function 'nand_write_oob_std()' writes data to the end of the page. But as I
>>>>> can see by dumping 'data_buf' during read, physical layout of each page is the
>>>>> following (1KB ECC):
>>>>>
>>>>> 0x000: [         1 KB of data        ]
>>>>> 0x400: [ 2B user data] [ 14B ECC code]
>>>>> 0x410: [         1 KB of data        ]    (A)
>>>>> 0x810: [ 2B user data] [ 14B ECC code]
>>>>> 0x820: [        32B unused           ]
>>>>>
>>>>>
>>>>>
>>>>> So, after 'nand_write_oob_std()' (let data be sequence from [0x0 ... 0x3f]),
>>>>> page will look like this:
>>>>>
>>>>> 0x000: [             0xFF            ]
>>>>> 0x400: [           ........          ]
>>>>> 0x7f0: [             0xFF            ]
>>>>> 0x800: [ 00 .......................  ]
>>>>> 0x830: [ ........................ 3f ]
>>>>>
>>>>> Here we have two problems:
>>>>> 1) Attempt to display raw data by 'nanddump' utility produces a little bit
>>>>>     invalid output, as driver relies on layout (A) from above. E.g. OOB data
>>>>>     is at 0x400 and 0x810. Here is an example (attempt to write 0x11 0x22 0x33 0x44):
>>>>>
>>>>> 0x000007f0: 11 22 ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |."..............|
>>>>>    OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |................|
>>>>>    OOB Data: 33 44 ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |3D..............|
>>>>>    OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |................|
>>>>>    OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |................|
>>>>>
>>>> Hi Arseniy,
>>>>
>>>> I realized the write_oob_raw() and write_oob() are wrong in meson_nand.c. I suggest both of them should be reworked and follow the format of meson nand controller. i.e. firstly format the data in Layout (A) and then write. reading is firstly reading the data of layout (A) and then compost the layout (B).
>>>
>>> IIUC after such writing only OOB (e.g. user bytes) according layout (A), hw will also write ECC codes, so
>>> it will be impossible to write data to this page later, because we cannot update ECC codes properly for the newly
>>> written data (we can't update bits from 0 to 1).
>>>
>>>>
>>>>
>>>>>
>>>>> 2) Attempt to read data in ECC mode will fail, because IIUC page is in dirty
>>>>>     state (I mean was written at least once) and NAND controller tries to use
>>>>>     ECC codes at 0x400 and 0x810, which are obviously broken in this case. Thus
>>>>
>>>> As i said above, write_oob_raw() and write_oob() should be reworked.
>>>> i don't know what do you mean page was written at least once. anyway the page should be written once, even just write_oob_raw().
>>>
>>> Sorry, You mean that after OOB write, we cannot write to the data area (e.g. 0x0 .. 0x810) until page will be erased? For example
>>> JFFS2 writes to OOB own markers, then it tries to write to the data area of such page.
>
> @Liang, I'll describe current test case in details:
> 1) I have erased page, I can read it in both raw and ecc modes - no problem (it is full of 0xFF).
> 2) I (JFFS2 for example) want to write only OOB - let it be clean markers.
> 3) I use raw write to the needed page (please correct me if i'm wrong). Four bytes
> at 0x400 and 0x810 are updated. All other bytes still 0xff.
> 4) Now, when i'm trying to read this page in ECC mode, I get ECC errors: IIUC this
> happens because from controller point of view ECC codes are invalid for current
> data (all ECCs are 0xff). Is this behaviour is ok?

Yes, it is exactly reported ECC errors.

> 5) Ok, don't care on these ECC errors, let's go further.
> 6) I'm going to write same page in ECC mode - how to do it correctly? There is already
> 4 OOB bytes, considered to be covered by ECC (but in fact now - ECC area is FFed).

If step 4 has excuted "program" command at the page
(nand_write_oob_std() does), it can't be written again before erasing
the page(block). so we have to read the whole page in the ddr and change
the content, erase block, write it again.

I don't think Jffs2 has the same steps (1-6) as you said above. are you
sure that happes on Jffs2 or just an example?

>
> That's why I asked Your opinion about moving OOB data to nonprotected by ECC area (and
> leave user's bytes untouched). In this case OOB access is free and not linked with ECC
> codes which also covers data.
>
> Thanks, Arseniy
>
>
>>
>> A page is written after two steps:
>> - loading the data into the NAND chip cache (that's when you use the
>> bus)
>> - programming the NAND array with the data loaded in cache (that's when
>> you wait)
>>
>> In theory it does not matter where you write in the cache, it's regular
>> DRAM, you can make random writes there with the appropriate NAND
>> commands. Of course when using embedded hardware ECC engines, the
>> controllers usually expect to be fed in a certain way in order to
>> produce the ECC bytes and put them at the right location in cache.
>>
>> And then, when you actually send the "program" command, the NAND cells
>> actually get programmed based on what has been loaded in cache.
>
> Thanks for this details! Very interesting!
>
>
>>
>> Thanks,
>> Miquèl
>

2023-04-26 14:11:00

by Liang Yang

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

On 2023/4/20 17:37, Arseniy Krasnov wrote:
> [ EXTERNAL EMAIL ]
>
> On 19.04.2023 09:41, Arseniy Krasnov wrote:
>>
>>
>> On 19.04.2023 06:05, Liang Yang wrote:
>>> Hi Arseniy,
>>>
>>> On 2023/4/18 22:57, Arseniy Krasnov wrote:
>>>> [ EXTERNAL EMAIL ]
>>>>
>>>>
>>>>
>>>> On 18.04.2023 16:25, Miquel Raynal wrote:
>>>>> Hi Arseniy,
>>>>>
>>>>>>>> Hello again @Liang @Miquel!
>>>>>>>>
>>>>>>>> One more question about OOB access, as I can see current driver uses the following
>>>>>>>> callbacks:
>>>>>>>>
>>>>>>>> nand->ecc.write_oob_raw = nand_write_oob_std;
>>>>>>>> nand->ecc.write_oob = nand_write_oob_std;
>>>>>>>>
>>>>>>>>
>>>>>>>> Function 'nand_write_oob_std()' writes data to the end of the page. But as I
>>>>>>>> can see by dumping 'data_buf' during read, physical layout of each page is the
>>>>>>>> following (1KB ECC):
>>>>>>>>
>>>>>>>> 0x000: [         1 KB of data ]
>>>>>>>> 0x400: [ 2B user data] [ 14B ECC code]
>>>>>>>> 0x410: [         1 KB of data ]    (A)
>>>>>>>> 0x810: [ 2B user data] [ 14B ECC code]
>>>>>>>> 0x820: [        32B unused ]
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> So, after 'nand_write_oob_std()' (let data be sequence from [0x0 ... 0x3f]),
>>>>>>>> page will look like this:
>>>>>>>>
>>>>>>>> 0x000: [             0xFF ]
>>>>>>>> 0x400: [           ........ ]
>>>>>>>> 0x7f0: [             0xFF ]
>>>>>>>> 0x800: [ 00 ....................... ]
>>>>>>>> 0x830: [ ........................ 3f ]
>>>>>>>>
>>>>>>>> Here we have two problems:
>>>>>>>> 1) Attempt to display raw data by 'nanddump' utility produces a little bit
>>>>>>>> invalid output, as driver relies on layout (A) from above. E.g. OOB data
>>>>>>>> is at 0x400 and 0x810. Here is an example (attempt to write 0x11 0x22 0x33 0x44):
>>>>>>>>
>>>>>>>> 0x000007f0: 11 22 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |."..............|
>>>>>>>> OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
>>>>>>>> OOB Data: 33 44 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |3D..............|
>>>>>>>> OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
>>>>>>>> OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
>>>>>>>>
>>>>>>> Hi Arseniy,
>>>>>>>
>>>>>>> I realized the write_oob_raw() and write_oob() are wrong in meson_nand.c. I suggest both of them should be reworked and follow the format of meson nand controller. i.e. firstly format the data in Layout (A) and then write. reading is firstly reading the data of layout (A) and then compost the layout (B).
>>>>>>
>>>>>> IIUC after such writing only OOB (e.g. user bytes) according layout (A), hw will also write ECC codes, so
>>>>>> it will be impossible to write data to this page later, because we cannot update ECC codes properly for the newly
>>>>>> written data (we can't update bits from 0 to 1).
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> 2) Attempt to read data in ECC mode will fail, because IIUC page is in dirty
>>>>>>>> state (I mean was written at least once) and NAND controller tries to use
>>>>>>>> ECC codes at 0x400 and 0x810, which are obviously broken in this case. Thus
>>>>>>>
>>>>>>> As i said above, write_oob_raw() and write_oob() should be reworked.
>>>>>>> i don't know what do you mean page was written at least once. anyway the page should be written once, even just write_oob_raw().
>>>>>>
>>>>>> Sorry, You mean that after OOB write, we cannot write to the data area (e.g. 0x0 .. 0x810) until page will be erased? For example
>>>>>> JFFS2 writes to OOB own markers, then it tries to write to the data area of such page.
>>>>
>>>> @Liang, I'll describe current test case in details:
>>>> 1) I have erased page, I can read it in both raw and ecc modes - no problem (it is full of 0xFF).
>>>> 2) I (JFFS2 for example) want to write only OOB - let it be clean markers.
>>>> 3) I use raw write to the needed page (please correct me if i'm wrong). Four bytes
>>>> at 0x400 and 0x810 are updated. All other bytes still 0xff.
>>>> 4) Now, when i'm trying to read this page in ECC mode, I get ECC errors: IIUC this
>>>> happens because from controller point of view ECC codes are invalid for current
>>>> data (all ECCs are 0xff). Is this behaviour is ok?
>>>
>>> Yes, it is exactly reported ECC errors.
>>
>> I see, so if we write OOB (e.g. using raw mode), there is no way to read this page in ECC mode later? And the

Of course, there is no ECC parity bytes in it; or raw write the data
with the ECC parity bytes per the layout (A) you describe above.

>> only way to make it readable is to write it in ECC mode, but before this write, we need to read it's
>> user's byte (from previous OOB write) in raw mode, put it to info buf (as user's bytes) and write this page. In this
>> case NAND controller will generate ECC codes including user's byte and page become readable in ECC mode
>> again.

yes, you are right.

>>
>>>
>>>> 5) Ok, don't care on these ECC errors, let's go further.
>>>> 6) I'm going to write same page in ECC mode - how to do it correctly? There is already
>>>> 4 OOB bytes, considered to be covered by ECC (but in fact now - ECC area is FFed).
>>>
>>> If step 4 has excuted "program" command at the page (nand_write_oob_std() does), it can't be written again before erasing the page(block). so we have to read the whole page in the ddr and change the content, erase block, write it again.
>>>
>>> I don't think Jffs2 has the same steps (1-6) as you said above. are you sure that happes on Jffs2 or just an example?
>
>
>>
>> I just checked JFFS2 mount/umount again, here is what i see:
>> 0) First attempt to mount JFFS2.
>> 1) It writes OOB to page N (i'm using raw write). It is cleanmarker value 0x85 0x19 0x03 0x20. Mount is done.
>> 2) Umount JFFS2. Done.
>> 3) Second attempt to mount JFFS2.
>> 4) It reads OOB from page N (i'm using raw read). Value is 0x85 0x19 0x03 0x20. Done.
>> 5) It reads page N in ECC mode, and i get:
>> jffs2: mtd->read(0x100 bytes from N) returned ECC error
>> 6) Mount failed.
>>
>> We already had problem which looks like this on another device. Solution was to use OOB area which is
>> not covered by ECC for JFFS2 cleanmarkers.

ok, so there is not ECC parity bytes and mtd->read() returns ECC error.
does it have to use raw write/read on step 1) and 4)?

>>
>> Thanks, Arseniy
>>
>
> @Liang,
>
> Small addition, if i'm trying to implement OOB read/write in ECC mode, then step 5) will success,
> but later, JFFS2 tries to write this page (in ECC mode of course), and in this case ECC codes will
> be broken, because we can't update them properly without erasing whole page.
>
> Please take a look at this patch from my colleagues:
> https://lore.kernel.org/all/[email protected]
> It is related with "We already had problem which looks like this on another device" from above:
> in 'f50l1g41lb_ooblayout_free()' we reserve 2 bytes in non-ECC area for bad block markers.

It is about the SPI NAND and use another controller called spifc.
but in meson nfc, it is heavily depended on the pre-created BBT in NAND
device.


>
> Thanks, Arseniy
>
>
>>>
>>>>
>>>> That's why I asked Your opinion about moving OOB data to nonprotected by ECC area (and
>>>> leave user's bytes untouched). In this case OOB access is free and not linked with ECC
>>>> codes which also covers data.
>>>>
>>>> Thanks, Arseniy
>>>>
>>>>>
>>>>> A page is written after two steps:
>>>>> - loading the data into the NAND chip cache (that's when you use the
>>>>> bus)
>>>>> - programming the NAND array with the data loaded in cache (that's when
>>>>> you wait)
>>>>>
>>>>> In theory it does not matter where you write in the cache, it's regular
>>>>> DRAM, you can make random writes there with the appropriate NAND
>>>>> commands. Of course when using embedded hardware ECC engines, the
>>>>> controllers usually expect to be fed in a certain way in order to
>>>>> produce the ECC bytes and put them at the right location in cache.
>>>>>
>>>>> And then, when you actually send the "program" command, the NAND cells
>>>>> actually get programmed based on what has been loaded in cache.
>>>>
>>>> Thanks for this details! Very interesting!
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Miquèl
>>>>

2023-05-02 10:07:21

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

[email protected] wrote on Wed, 26 Apr 2023 17:46:19 +0300:

> On 26.04.2023 16:51, Liang Yang wrote:
> > Hi Arseniy,
> >
> > On 2023/4/20 17:37, Arseniy Krasnov wrote:
> >> [ EXTERNAL EMAIL ]
> >>
> >> On 19.04.2023 09:41, Arseniy Krasnov wrote:
> >>>
> >>>
> >>> On 19.04.2023 06:05, Liang Yang wrote:
> >>>> Hi Arseniy,
> >>>>
> >>>> On 2023/4/18 22:57, Arseniy Krasnov wrote:
> >>>>> [ EXTERNAL EMAIL ]
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 18.04.2023 16:25, Miquel Raynal wrote:
> >>>>>> Hi Arseniy,
> >>>>>>
> >>>>>>>>> Hello again @Liang @Miquel!
> >>>>>>>>>
> >>>>>>>>> One more question about OOB access, as I can see current driver uses the following
> >>>>>>>>> callbacks:
> >>>>>>>>>
> >>>>>>>>>       nand->ecc.write_oob_raw = nand_write_oob_std;
> >>>>>>>>>       nand->ecc.write_oob = nand_write_oob_std;
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Function 'nand_write_oob_std()' writes data to the end of the page. But as I
> >>>>>>>>> can see by dumping 'data_buf' during read, physical layout of each page is the
> >>>>>>>>> following (1KB ECC):
> >>>>>>>>>
> >>>>>>>>> 0x000: [         1 KB of data        ]
> >>>>>>>>> 0x400: [ 2B user data] [ 14B ECC code]
> >>>>>>>>> 0x410: [         1 KB of data        ]    (A)
> >>>>>>>>> 0x810: [ 2B user data] [ 14B ECC code]
> >>>>>>>>> 0x820: [        32B unused           ]
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> So, after 'nand_write_oob_std()' (let data be sequence from [0x0 ... 0x3f]),
> >>>>>>>>> page will look like this:
> >>>>>>>>>
> >>>>>>>>> 0x000: [             0xFF            ]
> >>>>>>>>> 0x400: [           ........          ]
> >>>>>>>>> 0x7f0: [             0xFF            ]
> >>>>>>>>> 0x800: [ 00 .......................  ]
> >>>>>>>>> 0x830: [ ........................ 3f ]
> >>>>>>>>>
> >>>>>>>>> Here we have two problems:
> >>>>>>>>> 1) Attempt to display raw data by 'nanddump' utility produces a little bit
> >>>>>>>>>       invalid output, as driver relies on layout (A) from above. E.g. OOB data
> >>>>>>>>>       is at 0x400 and 0x810. Here is an example (attempt to write 0x11 0x22 0x33 0x44):
> >>>>>>>>>
> >>>>>>>>> 0x000007f0: 11 22 ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |."..............|
> >>>>>>>>>      OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |................|
> >>>>>>>>>      OOB Data: 33 44 ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |3D..............|
> >>>>>>>>>      OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |................|
> >>>>>>>>>      OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |................|
> >>>>>>>>>
> >>>>>>>> Hi Arseniy,
> >>>>>>>>
> >>>>>>>> I realized the write_oob_raw() and write_oob() are wrong in meson_nand.c. I suggest both of them should be reworked and follow the format of meson nand controller. i.e. firstly format the data in Layout (A) and then write. reading is firstly reading the data of layout (A) and then compost the layout (B).
> >>>>>>>
> >>>>>>> IIUC after such writing only OOB (e.g. user bytes) according layout (A), hw will also write ECC codes, so
> >>>>>>> it will be impossible to write data to this page later, because we cannot update ECC codes properly for the newly
> >>>>>>> written data (we can't update bits from 0 to 1).
> >>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> 2) Attempt to read data in ECC mode will fail, because IIUC page is in dirty
> >>>>>>>>>       state (I mean was written at least once) and NAND controller tries to use
> >>>>>>>>>       ECC codes at 0x400 and 0x810, which are obviously broken in this case. Thus
> >>>>>>>>
> >>>>>>>> As i said above, write_oob_raw() and write_oob() should be reworked.
> >>>>>>>> i don't know what do you mean page was written at least once. anyway the page should be written once, even just write_oob_raw().
> >>>>>>>
> >>>>>>> Sorry, You mean that after OOB write, we cannot write to the data area (e.g. 0x0 .. 0x810) until page will be erased? For example
> >>>>>>> JFFS2 writes to OOB own markers, then it tries to write to the data area of such page.
> >>>>>
> >>>>> @Liang, I'll describe current test case in details:
> >>>>> 1) I have erased page, I can read it in  both raw and ecc modes - no problem (it is full of 0xFF).
> >>>>> 2) I (JFFS2 for example) want to write only OOB - let it be clean markers.
> >>>>> 3) I use raw write to the needed page (please correct me if i'm wrong). Four bytes
> >>>>>      at 0x400 and 0x810 are updated. All other bytes still 0xff.
> >>>>> 4) Now, when i'm trying to read this page in ECC mode, I get ECC errors: IIUC this
> >>>>>      happens because from controller point of view ECC codes are invalid for current
> >>>>>      data (all ECCs are 0xff). Is this behaviour is ok?
> >>>>
> >>>> Yes, it is exactly reported ECC errors.
> >>>
> >>> I see, so if we write OOB (e.g. using raw mode), there is no way to read this page in ECC mode later? And the
> >
> > Of course, there is no ECC parity bytes in it; or raw write the data with the ECC parity bytes per the layout (A) you describe above.
> >
>
> But don't it looks like strange? Just writing OOB makes page unreadable? May be it is better to move OOB data
> out of ECC area as I suggested in v2?
>
> >>> only way to make it readable is to write it in ECC mode, but before this write, we need to read it's
> >>> user's byte (from previous OOB write) in raw mode, put it to info buf (as user's bytes) and write this page. In this
> >>> case NAND controller will generate ECC codes including user's byte and page become readable in ECC mode
> >>> again.
> >
> > yes, you are right.
> >
> >>>
> >>>>
> >>>>> 5) Ok, don't care on these ECC errors, let's go further.
> >>>>> 6) I'm going to write same page in ECC mode - how to do it correctly? There is already
> >>>>>      4 OOB bytes, considered to be covered by ECC (but in fact now - ECC area is FFed).
> >>>>
> >>>> If step 4 has excuted "program" command at the page (nand_write_oob_std() does), it can't be written again before erasing the page(block). so we have to read the whole page in the ddr and change the content, erase block, write it again.
> >>>>
> >>>> I don't think Jffs2 has the same steps (1-6) as you said above. are you sure that happes on Jffs2 or just an example?
> >>
> >>
> >>>
> >>> I just checked JFFS2 mount/umount again, here is what i see:
> >>> 0) First attempt to mount JFFS2.
> >>> 1) It writes OOB to page N (i'm using raw write). It is cleanmarker value 0x85 0x19 0x03 0x20. Mount is done.
> >>> 2) Umount JFFS2. Done.
> >>> 3) Second attempt to mount JFFS2.
> >>> 4) It reads OOB from page N (i'm using raw read). Value is 0x85 0x19 0x03 0x20. Done.
> >>> 5) It reads page N in ECC mode, and i get:
> >>>      jffs2: mtd->read(0x100 bytes from N) returned ECC error
> >>> 6) Mount failed.
> >>>
> >>> We already had problem which looks like this on another device. Solution was to use OOB area which is
> >>> not covered by ECC for JFFS2 cleanmarkers.
> >
> > ok, so there is not ECC parity bytes and mtd->read() returns ECC error.
> > does it have to use raw write/read on step 1) and 4)?
> >
>
> If i'm using non raw access to OOB, for example write OOB (user bytes) in ECC mode, then
> steps 1) and 4) and 5) passes ok, but write to this page will be impossible (for example JFFS2
> writes to such pages later) - we can't update ECC codes properly without erasing whole page.
> Write operation will be done without problem, but read will trigger ECC errors due to broken
> ECC codes.
>
> In general problem that we discuss is that in current implementation data and OOB conflicts
> with each other by sharing same ECC codes, these ECC codes could be written only once (without
> erasing), while data and OOB has different callbacks to access and thus supposed to work
> separately.

The fact that there might be helpers just for writing OOB areas or just
in-band areas are optimizations. NAND pages are meant to be written a
single time, no matter what portion you write. In some cases, it is
possible to perform subpage writes if the chip supports it. Pages may
be split into several areas which cover a partial in-band area *and* a
partial OOB area. If you write into the in-band *or* out-of-band areas
of a given subpage, you *cannot* write the other part later without
erasing.

Thanks,
Miquèl

2023-05-02 11:29:35

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

[email protected] wrote on Tue, 2 May 2023 13:11:38 +0300:

> On 02.05.2023 12:59, Miquel Raynal wrote:
> > Hi Arseniy,
> >
> > [email protected] wrote on Wed, 26 Apr 2023 17:46:19 +0300:
> >
> >> On 26.04.2023 16:51, Liang Yang wrote:
> >>> Hi Arseniy,
> >>>
> >>> On 2023/4/20 17:37, Arseniy Krasnov wrote:
> >>>> [ EXTERNAL EMAIL ]
> >>>>
> >>>> On 19.04.2023 09:41, Arseniy Krasnov wrote:
> >>>>>
> >>>>>
> >>>>> On 19.04.2023 06:05, Liang Yang wrote:
> >>>>>> Hi Arseniy,
> >>>>>>
> >>>>>> On 2023/4/18 22:57, Arseniy Krasnov wrote:
> >>>>>>> [ EXTERNAL EMAIL ]
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 18.04.2023 16:25, Miquel Raynal wrote:
> >>>>>>>> Hi Arseniy,
> >>>>>>>>
> >>>>>>>>>>> Hello again @Liang @Miquel!
> >>>>>>>>>>>
> >>>>>>>>>>> One more question about OOB access, as I can see current driver uses the following
> >>>>>>>>>>> callbacks:
> >>>>>>>>>>>
> >>>>>>>>>>>       nand->ecc.write_oob_raw = nand_write_oob_std;
> >>>>>>>>>>>       nand->ecc.write_oob = nand_write_oob_std;
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Function 'nand_write_oob_std()' writes data to the end of the page. But as I
> >>>>>>>>>>> can see by dumping 'data_buf' during read, physical layout of each page is the
> >>>>>>>>>>> following (1KB ECC):
> >>>>>>>>>>>
> >>>>>>>>>>> 0x000: [         1 KB of data        ]
> >>>>>>>>>>> 0x400: [ 2B user data] [ 14B ECC code]
> >>>>>>>>>>> 0x410: [         1 KB of data        ]    (A)
> >>>>>>>>>>> 0x810: [ 2B user data] [ 14B ECC code]
> >>>>>>>>>>> 0x820: [        32B unused           ]
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> So, after 'nand_write_oob_std()' (let data be sequence from [0x0 ... 0x3f]),
> >>>>>>>>>>> page will look like this:
> >>>>>>>>>>>
> >>>>>>>>>>> 0x000: [             0xFF            ]
> >>>>>>>>>>> 0x400: [           ........          ]
> >>>>>>>>>>> 0x7f0: [             0xFF            ]
> >>>>>>>>>>> 0x800: [ 00 .......................  ]
> >>>>>>>>>>> 0x830: [ ........................ 3f ]
> >>>>>>>>>>>
> >>>>>>>>>>> Here we have two problems:
> >>>>>>>>>>> 1) Attempt to display raw data by 'nanddump' utility produces a little bit
> >>>>>>>>>>>       invalid output, as driver relies on layout (A) from above. E.g. OOB data
> >>>>>>>>>>>       is at 0x400 and 0x810. Here is an example (attempt to write 0x11 0x22 0x33 0x44):
> >>>>>>>>>>>
> >>>>>>>>>>> 0x000007f0: 11 22 ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |."..............|
> >>>>>>>>>>>      OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |................|
> >>>>>>>>>>>      OOB Data: 33 44 ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |3D..............|
> >>>>>>>>>>>      OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |................|
> >>>>>>>>>>>      OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  |................|
> >>>>>>>>>>>
> >>>>>>>>>> Hi Arseniy,
> >>>>>>>>>>
> >>>>>>>>>> I realized the write_oob_raw() and write_oob() are wrong in meson_nand.c. I suggest both of them should be reworked and follow the format of meson nand controller. i.e. firstly format the data in Layout (A) and then write. reading is firstly reading the data of layout (A) and then compost the layout (B).
> >>>>>>>>>
> >>>>>>>>> IIUC after such writing only OOB (e.g. user bytes) according layout (A), hw will also write ECC codes, so
> >>>>>>>>> it will be impossible to write data to this page later, because we cannot update ECC codes properly for the newly
> >>>>>>>>> written data (we can't update bits from 0 to 1).
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> 2) Attempt to read data in ECC mode will fail, because IIUC page is in dirty
> >>>>>>>>>>>       state (I mean was written at least once) and NAND controller tries to use
> >>>>>>>>>>>       ECC codes at 0x400 and 0x810, which are obviously broken in this case. Thus
> >>>>>>>>>>
> >>>>>>>>>> As i said above, write_oob_raw() and write_oob() should be reworked.
> >>>>>>>>>> i don't know what do you mean page was written at least once. anyway the page should be written once, even just write_oob_raw().
> >>>>>>>>>
> >>>>>>>>> Sorry, You mean that after OOB write, we cannot write to the data area (e.g. 0x0 .. 0x810) until page will be erased? For example
> >>>>>>>>> JFFS2 writes to OOB own markers, then it tries to write to the data area of such page.
> >>>>>>>
> >>>>>>> @Liang, I'll describe current test case in details:
> >>>>>>> 1) I have erased page, I can read it in  both raw and ecc modes - no problem (it is full of 0xFF).
> >>>>>>> 2) I (JFFS2 for example) want to write only OOB - let it be clean markers.
> >>>>>>> 3) I use raw write to the needed page (please correct me if i'm wrong). Four bytes
> >>>>>>>      at 0x400 and 0x810 are updated. All other bytes still 0xff.
> >>>>>>> 4) Now, when i'm trying to read this page in ECC mode, I get ECC errors: IIUC this
> >>>>>>>      happens because from controller point of view ECC codes are invalid for current
> >>>>>>>      data (all ECCs are 0xff). Is this behaviour is ok?
> >>>>>>
> >>>>>> Yes, it is exactly reported ECC errors.
> >>>>>
> >>>>> I see, so if we write OOB (e.g. using raw mode), there is no way to read this page in ECC mode later? And the
> >>>
> >>> Of course, there is no ECC parity bytes in it; or raw write the data with the ECC parity bytes per the layout (A) you describe above.
> >>>
> >>
> >> But don't it looks like strange? Just writing OOB makes page unreadable? May be it is better to move OOB data
> >> out of ECC area as I suggested in v2?
> >>
> >>>>> only way to make it readable is to write it in ECC mode, but before this write, we need to read it's
> >>>>> user's byte (from previous OOB write) in raw mode, put it to info buf (as user's bytes) and write this page. In this
> >>>>> case NAND controller will generate ECC codes including user's byte and page become readable in ECC mode
> >>>>> again.
> >>>
> >>> yes, you are right.
> >>>
> >>>>>
> >>>>>>
> >>>>>>> 5) Ok, don't care on these ECC errors, let's go further.
> >>>>>>> 6) I'm going to write same page in ECC mode - how to do it correctly? There is already
> >>>>>>>      4 OOB bytes, considered to be covered by ECC (but in fact now - ECC area is FFed).
> >>>>>>
> >>>>>> If step 4 has excuted "program" command at the page (nand_write_oob_std() does), it can't be written again before erasing the page(block). so we have to read the whole page in the ddr and change the content, erase block, write it again.
> >>>>>>
> >>>>>> I don't think Jffs2 has the same steps (1-6) as you said above. are you sure that happes on Jffs2 or just an example?
> >>>>
> >>>>
> >>>>>
> >>>>> I just checked JFFS2 mount/umount again, here is what i see:
> >>>>> 0) First attempt to mount JFFS2.
> >>>>> 1) It writes OOB to page N (i'm using raw write). It is cleanmarker value 0x85 0x19 0x03 0x20. Mount is done.
> >>>>> 2) Umount JFFS2. Done.
> >>>>> 3) Second attempt to mount JFFS2.
> >>>>> 4) It reads OOB from page N (i'm using raw read). Value is 0x85 0x19 0x03 0x20. Done.
> >>>>> 5) It reads page N in ECC mode, and i get:
> >>>>>      jffs2: mtd->read(0x100 bytes from N) returned ECC error
> >>>>> 6) Mount failed.
> >>>>>
> >>>>> We already had problem which looks like this on another device. Solution was to use OOB area which is
> >>>>> not covered by ECC for JFFS2 cleanmarkers.
> >>>
> >>> ok, so there is not ECC parity bytes and mtd->read() returns ECC error.
> >>> does it have to use raw write/read on step 1) and 4)?
> >>>
> >>
> >> If i'm using non raw access to OOB, for example write OOB (user bytes) in ECC mode, then
> >> steps 1) and 4) and 5) passes ok, but write to this page will be impossible (for example JFFS2
> >> writes to such pages later) - we can't update ECC codes properly without erasing whole page.
> >> Write operation will be done without problem, but read will trigger ECC errors due to broken
> >> ECC codes.
> >>
> >> In general problem that we discuss is that in current implementation data and OOB conflicts
> >> with each other by sharing same ECC codes, these ECC codes could be written only once (without
> >> erasing), while data and OOB has different callbacks to access and thus supposed to work
> >> separately.
> >
> > The fact that there might be helpers just for writing OOB areas or just
> > in-band areas are optimizations. NAND pages are meant to be written a
> > single time, no matter what portion you write. In some cases, it is
> > possible to perform subpage writes if the chip supports it. Pages may
> > be split into several areas which cover a partial in-band area *and* a
> > partial OOB area. If you write into the in-band *or* out-of-band areas
> > of a given subpage, you *cannot* write the other part later without
>
> Thanks for details! So in case of JFFS2 it looks like strange, that it tries
> to write page after writing clean markers to it before? In the old vendor's
> driver OOB write callback is suppressed by return 0 always and JFFS2 works
> correctly.

Can you point the code you're mentioning? (both what JFFS2 which looks
strange to you and the old vendor hack)


Thanks,
Miquèl

2023-05-02 12:26:48

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

Richard, your input is welcome below :-)

> >>>>>>> I just checked JFFS2 mount/umount again, here is what i see:
> >>>>>>> 0) First attempt to mount JFFS2.
> >>>>>>> 1) It writes OOB to page N (i'm using raw write). It is cleanmarker value 0x85 0x19 0x03 0x20. Mount is done.
> >>>>>>> 2) Umount JFFS2. Done.
> >>>>>>> 3) Second attempt to mount JFFS2.
> >>>>>>> 4) It reads OOB from page N (i'm using raw read). Value is 0x85 0x19 0x03 0x20. Done.
> >>>>>>> 5) It reads page N in ECC mode, and i get:
> >>>>>>>      jffs2: mtd->read(0x100 bytes from N) returned ECC error
> >>>>>>> 6) Mount failed.
> >>>>>>>
> >>>>>>> We already had problem which looks like this on another device. Solution was to use OOB area which is
> >>>>>>> not covered by ECC for JFFS2 cleanmarkers.
> >>>>>
> >>>>> ok, so there is not ECC parity bytes and mtd->read() returns ECC error.
> >>>>> does it have to use raw write/read on step 1) and 4)?
> >>>>>
> >>>>
> >>>> If i'm using non raw access to OOB, for example write OOB (user bytes) in ECC mode, then
> >>>> steps 1) and 4) and 5) passes ok, but write to this page will be impossible (for example JFFS2
> >>>> writes to such pages later) - we can't update ECC codes properly without erasing whole page.
> >>>> Write operation will be done without problem, but read will trigger ECC errors due to broken
> >>>> ECC codes.
> >>>>
> >>>> In general problem that we discuss is that in current implementation data and OOB conflicts
> >>>> with each other by sharing same ECC codes, these ECC codes could be written only once (without
> >>>> erasing), while data and OOB has different callbacks to access and thus supposed to work
> >>>> separately.
> >>>
> >>> The fact that there might be helpers just for writing OOB areas or just
> >>> in-band areas are optimizations. NAND pages are meant to be written a
> >>> single time, no matter what portion you write. In some cases, it is
> >>> possible to perform subpage writes if the chip supports it. Pages may
> >>> be split into several areas which cover a partial in-band area *and* a
> >>> partial OOB area. If you write into the in-band *or* out-of-band areas
> >>> of a given subpage, you *cannot* write the other part later without
> >>
> >> Thanks for details! So in case of JFFS2 it looks like strange, that it tries
> >> to write page after writing clean markers to it before? In the old vendor's
> >> driver OOB write callback is suppressed by return 0 always and JFFS2 works
> >> correctly.
> >
> > Can you point the code you're mentioning? (both what JFFS2 which looks
> > strange to you and the old vendor hack)
>
> Here is version of the old vendor's driver:
>
> https://github.com/kszaq/linux-amlogic/blob/master_new_amports/drivers/amlogic/nand/nand/aml_nand.c#L3260
>
> In my version there is no BUG() there, but it is same driver for the same chip.
>
> About JFFS2 - i didn't check its source code, but what I can see using printk(), is that it first
> tries to write cleanmarker using OOB write callback. Then later it tries to write to this page, so
> may be it is unexpected behaviour of JFFS2?

TBH I am not knowledgeable about JFFS2, maybe Richard can help here.

Are you sure you flash is recognized by JFFS2 as being a NAND device?
Did you enable CONFIG_JFFS2_FS_WRITEBUFFER correctly? Because
cleanmarker seem to be discarded when using a NAND device, and
recognizing the device as a NAND device requires the above option to be
set apparently.

Thanks,
Miquèl

2023-05-02 13:07:09

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

[email protected] wrote on Tue, 2 May 2023 15:24:09 +0300:

> On 02.05.2023 15:17, Miquel Raynal wrote:
> > Hi Arseniy,
> >
> > Richard, your input is welcome below :-)
> >
> >>>>>>>>> I just checked JFFS2 mount/umount again, here is what i see:
> >>>>>>>>> 0) First attempt to mount JFFS2.
> >>>>>>>>> 1) It writes OOB to page N (i'm using raw write). It is cleanmarker value 0x85 0x19 0x03 0x20. Mount is done.
> >>>>>>>>> 2) Umount JFFS2. Done.
> >>>>>>>>> 3) Second attempt to mount JFFS2.
> >>>>>>>>> 4) It reads OOB from page N (i'm using raw read). Value is 0x85 0x19 0x03 0x20. Done.
> >>>>>>>>> 5) It reads page N in ECC mode, and i get:
> >>>>>>>>>      jffs2: mtd->read(0x100 bytes from N) returned ECC error
> >>>>>>>>> 6) Mount failed.
> >>>>>>>>>
> >>>>>>>>> We already had problem which looks like this on another device. Solution was to use OOB area which is
> >>>>>>>>> not covered by ECC for JFFS2 cleanmarkers.
> >>>>>>>
> >>>>>>> ok, so there is not ECC parity bytes and mtd->read() returns ECC error.
> >>>>>>> does it have to use raw write/read on step 1) and 4)?
> >>>>>>>
> >>>>>>
> >>>>>> If i'm using non raw access to OOB, for example write OOB (user bytes) in ECC mode, then
> >>>>>> steps 1) and 4) and 5) passes ok, but write to this page will be impossible (for example JFFS2
> >>>>>> writes to such pages later) - we can't update ECC codes properly without erasing whole page.
> >>>>>> Write operation will be done without problem, but read will trigger ECC errors due to broken
> >>>>>> ECC codes.
> >>>>>>
> >>>>>> In general problem that we discuss is that in current implementation data and OOB conflicts
> >>>>>> with each other by sharing same ECC codes, these ECC codes could be written only once (without
> >>>>>> erasing), while data and OOB has different callbacks to access and thus supposed to work
> >>>>>> separately.
> >>>>>
> >>>>> The fact that there might be helpers just for writing OOB areas or just
> >>>>> in-band areas are optimizations. NAND pages are meant to be written a
> >>>>> single time, no matter what portion you write. In some cases, it is
> >>>>> possible to perform subpage writes if the chip supports it. Pages may
> >>>>> be split into several areas which cover a partial in-band area *and* a
> >>>>> partial OOB area. If you write into the in-band *or* out-of-band areas
> >>>>> of a given subpage, you *cannot* write the other part later without
> >>>>
> >>>> Thanks for details! So in case of JFFS2 it looks like strange, that it tries
> >>>> to write page after writing clean markers to it before? In the old vendor's
> >>>> driver OOB write callback is suppressed by return 0 always and JFFS2 works
> >>>> correctly.
> >>>
> >>> Can you point the code you're mentioning? (both what JFFS2 which looks
> >>> strange to you and the old vendor hack)
> >>
> >> Here is version of the old vendor's driver:
> >>
> >> https://github.com/kszaq/linux-amlogic/blob/master_new_amports/drivers/amlogic/nand/nand/aml_nand.c#L3260
> >>
> >> In my version there is no BUG() there, but it is same driver for the same chip.
> >>
> >> About JFFS2 - i didn't check its source code, but what I can see using printk(), is that it first
> >> tries to write cleanmarker using OOB write callback. Then later it tries to write to this page, so
> >> may be it is unexpected behaviour of JFFS2?
> >
> > TBH I am not knowledgeable about JFFS2, maybe Richard can help here.
> >
> > Are you sure you flash is recognized by JFFS2 as being a NAND device?
> > Did you enable CONFIG_JFFS2_FS_WRITEBUFFER correctly? Because
> > cleanmarker seem to be discarded when using a NAND device, and
> > recognizing the device as a NAND device requires the above option to be
> > set apparently.
>
> Yes, I have
>
> CONFIG_JFFS2_FS_WRITEBUFFER=y
>
> And i see, that jffs2_mark_erased_block() calls jffs2_cleanmarker_oob() which checks that we have MTD_NANDFLASH. This
> check is true, so then jffs2_write_nand_cleanmarker() is called and there is OOB write in it. So I see opposite thing:
> cleanmarkers are not discarded with NAND device.

Excellent. So when cleanmarker_size == 0, it means there is no
cleanmarker. But if it is a NAND device, we write the marker anyway.

Well I guess it used to work on old controllers using a Hamming ECC
engine not protecting any user OOB bytes, so writing the clean markers
would simply not lead to ECC bytes being produced/written. Or it might
have worked as well on controller drivers not enabling the ECC engine
when performing OOB-only writes. It also requires the chip to be old
enough to support multiple writes on the same (sub)page as long as the
written bits do not overlap?

Perhaps that's what the hack in the old driver is for. But that's
IMHO broken in case of unexpected reboot :-)

Miquèl

2023-05-03 08:07:06

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

[email protected] wrote on Tue, 2 May 2023 19:13:38 +0300:

> On 02.05.2023 16:05, Miquel Raynal wrote:
> > Hi Arseniy,
> >
> > [email protected] wrote on Tue, 2 May 2023 15:24:09 +0300:
> >
> >> On 02.05.2023 15:17, Miquel Raynal wrote:
> >>> Hi Arseniy,
> >>>
> >>> Richard, your input is welcome below :-)
> >>>
> >>>>>>>>>>> I just checked JFFS2 mount/umount again, here is what i see:
> >>>>>>>>>>> 0) First attempt to mount JFFS2.
> >>>>>>>>>>> 1) It writes OOB to page N (i'm using raw write). It is cleanmarker value 0x85 0x19 0x03 0x20. Mount is done.
> >>>>>>>>>>> 2) Umount JFFS2. Done.
> >>>>>>>>>>> 3) Second attempt to mount JFFS2.
> >>>>>>>>>>> 4) It reads OOB from page N (i'm using raw read). Value is 0x85 0x19 0x03 0x20. Done.
> >>>>>>>>>>> 5) It reads page N in ECC mode, and i get:
> >>>>>>>>>>>      jffs2: mtd->read(0x100 bytes from N) returned ECC error
> >>>>>>>>>>> 6) Mount failed.
> >>>>>>>>>>>
> >>>>>>>>>>> We already had problem which looks like this on another device. Solution was to use OOB area which is
> >>>>>>>>>>> not covered by ECC for JFFS2 cleanmarkers.
> >>>>>>>>>
> >>>>>>>>> ok, so there is not ECC parity bytes and mtd->read() returns ECC error.
> >>>>>>>>> does it have to use raw write/read on step 1) and 4)?
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> If i'm using non raw access to OOB, for example write OOB (user bytes) in ECC mode, then
> >>>>>>>> steps 1) and 4) and 5) passes ok, but write to this page will be impossible (for example JFFS2
> >>>>>>>> writes to such pages later) - we can't update ECC codes properly without erasing whole page.
> >>>>>>>> Write operation will be done without problem, but read will trigger ECC errors due to broken
> >>>>>>>> ECC codes.
> >>>>>>>>
> >>>>>>>> In general problem that we discuss is that in current implementation data and OOB conflicts
> >>>>>>>> with each other by sharing same ECC codes, these ECC codes could be written only once (without
> >>>>>>>> erasing), while data and OOB has different callbacks to access and thus supposed to work
> >>>>>>>> separately.
> >>>>>>>
> >>>>>>> The fact that there might be helpers just for writing OOB areas or just
> >>>>>>> in-band areas are optimizations. NAND pages are meant to be written a
> >>>>>>> single time, no matter what portion you write. In some cases, it is
> >>>>>>> possible to perform subpage writes if the chip supports it. Pages may
> >>>>>>> be split into several areas which cover a partial in-band area *and* a
> >>>>>>> partial OOB area. If you write into the in-band *or* out-of-band areas
> >>>>>>> of a given subpage, you *cannot* write the other part later without
> >>>>>>
> >>>>>> Thanks for details! So in case of JFFS2 it looks like strange, that it tries
> >>>>>> to write page after writing clean markers to it before? In the old vendor's
> >>>>>> driver OOB write callback is suppressed by return 0 always and JFFS2 works
> >>>>>> correctly.
> >>>>>
> >>>>> Can you point the code you're mentioning? (both what JFFS2 which looks
> >>>>> strange to you and the old vendor hack)
> >>>>
> >>>> Here is version of the old vendor's driver:
> >>>>
> >>>> https://github.com/kszaq/linux-amlogic/blob/master_new_amports/drivers/amlogic/nand/nand/aml_nand.c#L3260
> >>>>
> >>>> In my version there is no BUG() there, but it is same driver for the same chip.
> >>>>
> >>>> About JFFS2 - i didn't check its source code, but what I can see using printk(), is that it first
> >>>> tries to write cleanmarker using OOB write callback. Then later it tries to write to this page, so
> >>>> may be it is unexpected behaviour of JFFS2?
> >>>
> >>> TBH I am not knowledgeable about JFFS2, maybe Richard can help here.
> >>>
> >>> Are you sure you flash is recognized by JFFS2 as being a NAND device?
> >>> Did you enable CONFIG_JFFS2_FS_WRITEBUFFER correctly? Because
> >>> cleanmarker seem to be discarded when using a NAND device, and
> >>> recognizing the device as a NAND device requires the above option to be
> >>> set apparently.
> >>
> >> Yes, I have
> >>
> >> CONFIG_JFFS2_FS_WRITEBUFFER=y
> >>
> >> And i see, that jffs2_mark_erased_block() calls jffs2_cleanmarker_oob() which checks that we have MTD_NANDFLASH. This
> >> check is true, so then jffs2_write_nand_cleanmarker() is called and there is OOB write in it. So I see opposite thing:
> >> cleanmarkers are not discarded with NAND device.
> >
> > Excellent. So when cleanmarker_size == 0, it means there is no
> > cleanmarker. But if it is a NAND device, we write the marker anyway.
> >
> > Well I guess it used to work on old controllers using a Hamming ECC
> > engine not protecting any user OOB bytes, so writing the clean markers
> > would simply not lead to ECC bytes being produced/written. Or it might
> > have worked as well on controller drivers not enabling the ECC engine
> > when performing OOB-only writes. It also requires the chip to be old
> > enough to support multiple writes on the same (sub)page as long as the
> > written bits do not overlap?
>
> Yes, with controller which supports such modes there will be no problem here!
> What i see, is that this controller doesn't support multiple writes to the
> same page in ECC mode(e.g. it can't update ECC correctly).

I don't think this is a controller limitation. The NAND chip cannot
write ECC bytes a first time and then overwrite other ECC bytes, that
cannot work. The fact that we write ECC bytes in the first place is
because the ECC engine covers the free OOB bytes used by JFFS2 to write
its cleanmarkers.

> So in v2 i've added
> patch which moves OOB out of ECC area, thus JFFS2 driver will work correctly.

I am sorry but the above sentence is not clear to me. I believe you
meant the free OOB bytes are moved outside of the area protected by the
ECC engine. In this case I guess it should be fine.

> So for me main question here is:
>
> How JFFS2 should work with controllers where we can't update data and OOB
> independently? Driver of this filesystem knows nothing about this features of
> the controller.
>
> Or JFFS2 works incorrectly in my case when it tries to call write page callback
> after calling write OOB callback (IIUC it is better to ask Richard as You mentioned above).
>
> Or may be it is better to suppress OOB write callback (or set it to NULL) in this
> driver as in vendor's driver?

I would assume using the unprotected free OOB bytes to store the
cleanmarkers should work. But that's a bit fragile and very filesystem
oriented. I don't like this much. But on the other side JFFS2 is
legacy, you should use UBI (which does not play with OOB areas) :-)

Thanks,
Miquèl

>
> Thanks, Arseniy
>
> >
> > Perhaps that's what the hack in the old driver is for. But that's
> > IMHO broken in case of unexpected reboot :-)
> >
> > Miquèl

2023-05-03 20:01:47

by Richard Weinberger

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Sorry for joining the party late.

On Wed, May 3, 2023 at 10:07 AM Miquel Raynal <[email protected]> wrote:
> > So for me main question here is:
> >
> > How JFFS2 should work with controllers where we can't update data and OOB
> > independently? Driver of this filesystem knows nothing about this features of
> > the controller.
> >
> > Or JFFS2 works incorrectly in my case when it tries to call write page callback
> > after calling write OOB callback (IIUC it is better to ask Richard as You mentioned above).
> >
> > Or may be it is better to suppress OOB write callback (or set it to NULL) in this
> > driver as in vendor's driver?
>
> I would assume using the unprotected free OOB bytes to store the
> cleanmarkers should work. But that's a bit fragile and very filesystem
> oriented. I don't like this much. But on the other side JFFS2 is
> legacy, you should use UBI (which does not play with OOB areas) :-)

Please note that JFFS2's main use case is NOR flash. Support for NAND
flash was never
considered as fully complete/stable. That's why UBIFS (JFFS3 in the
very beginning)
was invented.

Thanks,
//richard

2023-05-04 12:21:26

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

[email protected] wrote on Thu, 4 May 2023 14:37:45 +0300:

> On 03.05.2023 13:23, Arseniy Krasnov wrote:
> >
> >
> > On 03.05.2023 11:03, Miquel Raynal wrote:
> >> Hi Arseniy,
> >>
> >> [email protected] wrote on Tue, 2 May 2023 19:13:38 +0300:
> >>
> >>> On 02.05.2023 16:05, Miquel Raynal wrote:
> >>>> Hi Arseniy,
> >>>>
> >>>> [email protected] wrote on Tue, 2 May 2023 15:24:09 +0300:
> >>>>
> >>>>> On 02.05.2023 15:17, Miquel Raynal wrote:
> >>>>>> Hi Arseniy,
> >>>>>>
> >>>>>> Richard, your input is welcome below :-)
> >>>>>>
> >>>>>>>>>>>>>> I just checked JFFS2 mount/umount again, here is what i see:
> >>>>>>>>>>>>>> 0) First attempt to mount JFFS2.
> >>>>>>>>>>>>>> 1) It writes OOB to page N (i'm using raw write). It is cleanmarker value 0x85 0x19 0x03 0x20. Mount is done.
> >>>>>>>>>>>>>> 2) Umount JFFS2. Done.
> >>>>>>>>>>>>>> 3) Second attempt to mount JFFS2.
> >>>>>>>>>>>>>> 4) It reads OOB from page N (i'm using raw read). Value is 0x85 0x19 0x03 0x20. Done.
> >>>>>>>>>>>>>> 5) It reads page N in ECC mode, and i get:
> >>>>>>>>>>>>>>      jffs2: mtd->read(0x100 bytes from N) returned ECC error
> >>>>>>>>>>>>>> 6) Mount failed.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> We already had problem which looks like this on another device. Solution was to use OOB area which is
> >>>>>>>>>>>>>> not covered by ECC for JFFS2 cleanmarkers.
> >>>>>>>>>>>>
> >>>>>>>>>>>> ok, so there is not ECC parity bytes and mtd->read() returns ECC error.
> >>>>>>>>>>>> does it have to use raw write/read on step 1) and 4)?
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> If i'm using non raw access to OOB, for example write OOB (user bytes) in ECC mode, then
> >>>>>>>>>>> steps 1) and 4) and 5) passes ok, but write to this page will be impossible (for example JFFS2
> >>>>>>>>>>> writes to such pages later) - we can't update ECC codes properly without erasing whole page.
> >>>>>>>>>>> Write operation will be done without problem, but read will trigger ECC errors due to broken
> >>>>>>>>>>> ECC codes.
> >>>>>>>>>>>
> >>>>>>>>>>> In general problem that we discuss is that in current implementation data and OOB conflicts
> >>>>>>>>>>> with each other by sharing same ECC codes, these ECC codes could be written only once (without
> >>>>>>>>>>> erasing), while data and OOB has different callbacks to access and thus supposed to work
> >>>>>>>>>>> separately.
> >>>>>>>>>>
> >>>>>>>>>> The fact that there might be helpers just for writing OOB areas or just
> >>>>>>>>>> in-band areas are optimizations. NAND pages are meant to be written a
> >>>>>>>>>> single time, no matter what portion you write. In some cases, it is
> >>>>>>>>>> possible to perform subpage writes if the chip supports it. Pages may
> >>>>>>>>>> be split into several areas which cover a partial in-band area *and* a
> >>>>>>>>>> partial OOB area. If you write into the in-band *or* out-of-band areas
> >>>>>>>>>> of a given subpage, you *cannot* write the other part later without
> >>>>>>>>>
> >>>>>>>>> Thanks for details! So in case of JFFS2 it looks like strange, that it tries
> >>>>>>>>> to write page after writing clean markers to it before? In the old vendor's
> >>>>>>>>> driver OOB write callback is suppressed by return 0 always and JFFS2 works
> >>>>>>>>> correctly.
> >>>>>>>>
> >>>>>>>> Can you point the code you're mentioning? (both what JFFS2 which looks
> >>>>>>>> strange to you and the old vendor hack)
> >>>>>>>
> >>>>>>> Here is version of the old vendor's driver:
> >>>>>>>
> >>>>>>> https://github.com/kszaq/linux-amlogic/blob/master_new_amports/drivers/amlogic/nand/nand/aml_nand.c#L3260
> >>>>>>>
> >>>>>>> In my version there is no BUG() there, but it is same driver for the same chip.
> >>>>>>>
> >>>>>>> About JFFS2 - i didn't check its source code, but what I can see using printk(), is that it first
> >>>>>>> tries to write cleanmarker using OOB write callback. Then later it tries to write to this page, so
> >>>>>>> may be it is unexpected behaviour of JFFS2?
> >>>>>>
> >>>>>> TBH I am not knowledgeable about JFFS2, maybe Richard can help here.
> >>>>>>
> >>>>>> Are you sure you flash is recognized by JFFS2 as being a NAND device?
> >>>>>> Did you enable CONFIG_JFFS2_FS_WRITEBUFFER correctly? Because
> >>>>>> cleanmarker seem to be discarded when using a NAND device, and
> >>>>>> recognizing the device as a NAND device requires the above option to be
> >>>>>> set apparently.
> >>>>>
> >>>>> Yes, I have
> >>>>>
> >>>>> CONFIG_JFFS2_FS_WRITEBUFFER=y
> >>>>>
> >>>>> And i see, that jffs2_mark_erased_block() calls jffs2_cleanmarker_oob() which checks that we have MTD_NANDFLASH. This
> >>>>> check is true, so then jffs2_write_nand_cleanmarker() is called and there is OOB write in it. So I see opposite thing:
> >>>>> cleanmarkers are not discarded with NAND device.
> >>>>
> >>>> Excellent. So when cleanmarker_size == 0, it means there is no
> >>>> cleanmarker. But if it is a NAND device, we write the marker anyway.
> >>>>
> >>>> Well I guess it used to work on old controllers using a Hamming ECC
> >>>> engine not protecting any user OOB bytes, so writing the clean markers
> >>>> would simply not lead to ECC bytes being produced/written. Or it might
> >>>> have worked as well on controller drivers not enabling the ECC engine
> >>>> when performing OOB-only writes. It also requires the chip to be old
> >>>> enough to support multiple writes on the same (sub)page as long as the
> >>>> written bits do not overlap?
> >>>
> >>> Yes, with controller which supports such modes there will be no problem here!
> >>> What i see, is that this controller doesn't support multiple writes to the
> >>> same page in ECC mode(e.g. it can't update ECC correctly).
> >>
> >> I don't think this is a controller limitation. The NAND chip cannot
> >> write ECC bytes a first time and then overwrite other ECC bytes, that
> >> cannot work. The fact that we write ECC bytes in the first place is
> >> because the ECC engine covers the free OOB bytes used by JFFS2 to write
> >> its cleanmarkers.
> >>
> >>> So in v2 i've added
> >>> patch which moves OOB out of ECC area, thus JFFS2 driver will work correctly.
> >>
> >> I am sorry but the above sentence is not clear to me. I believe you
> >> meant the free OOB bytes are moved outside of the area protected by the
> >> ECC engine. In this case I guess it should be fine.
> >
> > Exactly, free bytes which are reported by OOB layout callbacks were moved out of
> > ECC area.
> >
> >>
> >>> So for me main question here is:
> >>>
> >>> How JFFS2 should work with controllers where we can't update data and OOB
> >>> independently? Driver of this filesystem knows nothing about this features of
> >>> the controller.
> >>>
> >>> Or JFFS2 works incorrectly in my case when it tries to call write page callback
> >>> after calling write OOB callback (IIUC it is better to ask Richard as You mentioned above).
> >>>
> >>> Or may be it is better to suppress OOB write callback (or set it to NULL) in this
> >>> driver as in vendor's driver?
> >>
> >> I would assume using the unprotected free OOB bytes to store the
> >> cleanmarkers should work. But that's a bit fragile and very filesystem
> >> oriented. I don't like this much. But on the other side JFFS2 is
> >> legacy, you should use UBI (which does not play with OOB areas) :-)
> >
> > Problem here is that we can't use UBI in this case, because it does not support
> > small fs images. So the only way to make JFFS2 work is to move free OOB bytes to
> > non protected area. Otherwise i think we have strange situation that JFFS2 can't
> > work correctly on specific type on NAND controller. We already had same problem
> > on another NAND controller, and solution was to move OOB free bytes no non-protected
> > area:
> >
> > https://lore.kernel.org/all/[email protected]/
> >
> > Thanks, Arseniy
>
> Upd: may be i can add option for this driver, which makes JFFS2 work correctly on this chip.
> This feature suppresses OOB writes as in old driver. By default it is disabled and OOB is
> ECC protected(current behaviour), if enabled - it prints WARN_ONCE() and always returns 0.
> What do You think?
>
> Or may be add an option, which moves free bytes of OOB to ECC non-protected area and it is disabled
> by default.

I prefer having a single ooblayout where we expose unprotected user OOB
bytes only. As of today, the only upstream user of user OOB bytes is
JFFS2 anyway.

Thanks,
Miquèl