LinuxLists.cc - [PATCH v4 0/5] mtd: nand: properly handle bitflips in erased pages

2015-12-30 19:32:20

Subject: [PATCH v4 0/5] mtd: nand: properly handle bitflips in erased pages

Hi,

This patch series aims at providing a common logic to check for bitflips
in erased pages.

Currently each driver is implementing its own logic to check for bitflips
in erased pages. Not only this create code duplication, but most of these
implementations are incorrect.
Here are a few aspects that are often left aside in those implementations:
1/ they do not check OOB bytes when checking for the ff pattern, which
means they can consider a page as empty while the MTD user actually
wanted to write almost ff with a few bits to zero
2/ they check for the ff pattern on the whole page, while ECC actually
works on smaller chunks (usually 512 or 1024 bytes chunks)
3/ they use random bitflip thresholds to decide whether a page/chunk is
erased or not. IMO this threshold should be set to ECC strength (or
at least something correlated to this parameter)

The approach taken in this series is to provide two helper functions to
check for bitflips in erased pages. Each driver that needs to check for
such cases can then call the nand_check_erased_ecc_chunk() function, and
rely on the common logic to decide whether a page is erased or not.

While Brian suggested a few times to make this detection automatic for
all drivers that set a specific flag (NAND_CHECK_ERASED_BITFLIPS?), here
is a few reasons I think this is not such a good idea:
1/ some (a lot of) drivers do not properly implement the raw access
functions, and since we need to check for raw data and OOB bytes this
makes the automatic detection unusable for most drivers unless they
decide to correctly implement those methods (which would be a good
thing BTW).
2/ as a I said earlier, this check should be made at the ECC chunk level
and not at the page level. This spots two problems: some (a lot of)
drivers do not properly specify the ecc layout information, and even
if the ecc layout is correctly defined, there is no way to attach ECC
bytes to a specific ECC chunk.
3/ the last aspect is the perf penalty incured by this test. Automatically
doing that at the NAND core level implies reading the whole page again
in raw mode, while with the helper function approach, drivers supporting
access at the ECC chunk level can read only the faulty chunk in raw
mode.

Regarding the bitflips threshold at which an erased pages is considered as
faulty, I have assigned it to ECC strength. As mentioned by Andrea, using
ECC strength might cause some trouble, because if you already have some
bitflips in an erased page, programming it might generate even more of
them.
In the other hand, shouldn't that be checked after (or before) programming
a page. I mean, UBI is already capable of detecting pages which are over
the configured bitflips_threshold and move data around when it detects
such pages.
If we check data after writing a page we wouldn't have to bother about
setting a weaker value for the "bitflips in erased page" case.
Another thing in favor of the ECC strength value for this "bitflips in
erased page" threshold value: if the ECC engine is generating 0xff ECC
bytes when the page is empty, then it will be able to fix ECC strength
bitflips without complaining, so why should we use different value when
we detect bitflips using the pattern match approach?

Best Regards,

Boris

Changes since v3:
- drop already applied patches
- make the generic "bitflips in erased pages" check as an opt-in flag
- split driver changes to ease review
- addressed Brian's comments

Changes since v2:
- improve nand_check_erased_buf() implementation
- keep nand_check_erased_buf() private to nand_base.c
- patch existing ecc.correct() implementations to return consistent error
codes
- make the 'erased check' optional
- remove some custom implementations of the 'erased check'

Changes since v1:
- fix the nand_check_erased_buf() function
- mark the bitflips > bitflips_threshold condition as unlikely
- add missing memsets in nand_check_erased_ecc_chunk()

Boris Brezillon (5):
mtd: nand: return consistent error codes in ecc.correct()
implementations
mtd: nand: use nand_check_erased_ecc_chunk in default ECC read
functions
mtd: nand: davinci: remove custom 'erased check' implementation
mtd: nand: diskonchip: remove custom 'erased check' implementation
mtd: nand: jz4740: remove custom 'erased check' implementation

drivers/mtd/nand/atmel_nand.c | 2 +-
drivers/mtd/nand/bf5xx_nand.c | 20 +++++++++++-----
drivers/mtd/nand/davinci_nand.c | 15 ++++--------
drivers/mtd/nand/diskonchip.c | 37 ++--------------------------
drivers/mtd/nand/jz4740_nand.c | 22 ++---------------
drivers/mtd/nand/mxc_nand.c | 4 ++--
drivers/mtd/nand/nand_base.c | 53 +++++++++++++++++++++++++++++++++++------
drivers/mtd/nand/nand_bch.c | 2 +-
drivers/mtd/nand/nand_ecc.c | 2 +-
drivers/mtd/nand/omap2.c | 6 ++---
drivers/mtd/nand/r852.c | 4 ++--
include/linux/mtd/nand.h | 18 +++++++++++++-
include/linux/mtd/nand_bch.h | 2 +-
13 files changed, 96 insertions(+), 91 deletions(-)

--
2.1.4

2015-12-30 19:32:17

by Boris Brezillon

[permalink] [raw]

Subject: [PATCH v4 1/5] mtd: nand: return consistent error codes in ecc.correct() implementations

The error code returned by the ecc.correct() are not consistent over the
all implementations.

Document the expected behavior in include/linux/mtd/nand.h and fix
offending implementations.

Signed-off-by: Boris Brezillon <[email protected]>
---
drivers/mtd/nand/atmel_nand.c | 2 +-
drivers/mtd/nand/bf5xx_nand.c | 20 ++++++++++++++------
drivers/mtd/nand/davinci_nand.c | 6 +++---
drivers/mtd/nand/jz4740_nand.c | 4 ++--
drivers/mtd/nand/mxc_nand.c | 4 ++--
drivers/mtd/nand/nand_bch.c | 2 +-
drivers/mtd/nand/nand_ecc.c | 2 +-
drivers/mtd/nand/omap2.c | 6 +++---
drivers/mtd/nand/r852.c | 4 ++--
include/linux/mtd/nand.h | 8 +++++++-
include/linux/mtd/nand_bch.h | 2 +-
11 files changed, 37 insertions(+), 23 deletions(-)

diff --git a/drivers/mtd/nand/atmel_nand.c b/drivers/mtd/nand/atmel_nand.c
index 18c4e14..b216bf5 100644
--- a/drivers/mtd/nand/atmel_nand.c
+++ b/drivers/mtd/nand/atmel_nand.c
@@ -1445,7 +1445,7 @@ static int atmel_nand_correct(struct mtd_info *mtd, u_char *dat,
* We can't correct so many errors */
dev_dbg(host->dev, "atmel_nand : multiple errors detected."
" Unable to correct.\n");
- return -EIO;
+ return -EBADMSG;
}

/* if there's a single bit error : we can correct it */
diff --git a/drivers/mtd/nand/bf5xx_nand.c b/drivers/mtd/nand/bf5xx_nand.c
index 9514e13..89d9414 100644
--- a/drivers/mtd/nand/bf5xx_nand.c
+++ b/drivers/mtd/nand/bf5xx_nand.c
@@ -252,7 +252,7 @@ static int bf5xx_nand_correct_data_256(struct mtd_info *mtd, u_char *dat,
*/
if (hweight32(syndrome[0]) == 1) {
dev_err(info->device, "ECC data was incorrect!\n");
- return 1;
+ return -EBADMSG;
}

syndrome[1] = (calced & 0x7FF) ^ (stored & 0x7FF);
@@ -285,7 +285,7 @@ static int bf5xx_nand_correct_data_256(struct mtd_info *mtd, u_char *dat,
data = data ^ (0x1 << failing_bit);
*(dat + failing_byte) = data;

- return 0;
+ return 1;
}

/*
@@ -298,26 +298,34 @@ static int bf5xx_nand_correct_data_256(struct mtd_info *mtd, u_char *dat,
dev_err(info->device,
"Please discard data, mark bad block\n");

- return 1;
+ return -EBADMSG;
}

static int bf5xx_nand_correct_data(struct mtd_info *mtd, u_char *dat,
u_char *read_ecc, u_char *calc_ecc)
{
struct nand_chip *chip = mtd_to_nand(mtd);
- int ret;
+ int ret, bitflips = 0;

ret = bf5xx_nand_correct_data_256(mtd, dat, read_ecc, calc_ecc);
+ if (ret < 0)
+ return ret;
+
+ bitflips = ret;

/* If ecc size is 512, correct second 256 bytes */
if (chip->ecc.size == 512) {
dat += 256;
read_ecc += 3;
calc_ecc += 3;
- ret |= bf5xx_nand_correct_data_256(mtd, dat, read_ecc, calc_ecc);
+ ret = bf5xx_nand_correct_data_256(mtd, dat, read_ecc, calc_ecc);
+ if (ret < 0)
+ return ret;
+
+ bitflips += ret;
}

- return ret;
+ return bitflips;
}

static void bf5xx_nand_enable_hwecc(struct mtd_info *mtd, int mode)
diff --git a/drivers/mtd/nand/davinci_nand.c b/drivers/mtd/nand/davinci_nand.c
index 3b49fe8..ddb73c3 100644
--- a/drivers/mtd/nand/davinci_nand.c
+++ b/drivers/mtd/nand/davinci_nand.c
@@ -207,7 +207,7 @@ static int nand_davinci_correct_1bit(struct mtd_info *mtd, u_char *dat,
dat[diff >> (12 + 3)] ^= BIT((diff >> 12) & 7);
return 1;
} else {
- return -1;
+ return -EBADMSG;
}
} else if (!(diff & (diff - 1))) {
/* Single bit ECC error in the ECC itself,
@@ -215,7 +215,7 @@ static int nand_davinci_correct_1bit(struct mtd_info *mtd, u_char *dat,
return 1;
} else {
/* Uncorrectable error */
- return -1;
+ return -EBADMSG;
}

}
@@ -391,7 +391,7 @@ compare:
return 0;
case 1: /* five or more errors detected */
davinci_nand_readl(info, NAND_ERR_ERRVAL1_OFFSET);
- return -EIO;
+ return -EBADMSG;
case 2: /* error addresses computed */
case 3:
num_errors = 1 + ((fsr >> 16) & 0x03);
diff --git a/drivers/mtd/nand/jz4740_nand.c b/drivers/mtd/nand/jz4740_nand.c
index a2363d3..adccae3 100644
--- a/drivers/mtd/nand/jz4740_nand.c
+++ b/drivers/mtd/nand/jz4740_nand.c
@@ -254,7 +254,7 @@ static int jz_nand_correct_ecc_rs(struct mtd_info *mtd, uint8_t *dat,
} while (!(status & JZ_NAND_STATUS_DEC_FINISH) && --timeout);

if (timeout == 0)
- return -1;
+ return -ETIMEDOUT;

reg = readl(nand->base + JZ_REG_NAND_ECC_CTRL);
reg &= ~JZ_NAND_ECC_CTRL_ENABLE;
@@ -262,7 +262,7 @@ static int jz_nand_correct_ecc_rs(struct mtd_info *mtd, uint8_t *dat,

if (status & JZ_NAND_STATUS_ERROR) {
if (status & JZ_NAND_STATUS_UNCOR_ERROR)
- return -1;
+ return -EBADMSG;

error_count = (status & JZ_NAND_STATUS_ERR_COUNT) >> 29;

diff --git a/drivers/mtd/nand/mxc_nand.c b/drivers/mtd/nand/mxc_nand.c
index 9540099..66e56bb 100644
--- a/drivers/mtd/nand/mxc_nand.c
+++ b/drivers/mtd/nand/mxc_nand.c
@@ -674,7 +674,7 @@ static int mxc_nand_correct_data_v1(struct mtd_info *mtd, u_char *dat,

if (((ecc_status & 0x3) == 2) || ((ecc_status >> 2) == 2)) {
pr_debug("MXC_NAND: HWECC uncorrectable 2-bit ECC error\n");
- return -1;
+ return -EBADMSG;
}

return 0;
@@ -701,7 +701,7 @@ static int mxc_nand_correct_data_v2_v3(struct mtd_info *mtd, u_char *dat,
err = ecc_stat & ecc_bit_mask;
if (err > err_limit) {
printk(KERN_WARNING "UnCorrectable RS-ECC Error\n");
- return -1;
+ return -EBADMSG;
} else {
ret += err;
}
diff --git a/drivers/mtd/nand/nand_bch.c b/drivers/mtd/nand/nand_bch.c
index e5758d8..a87c1b6 100644
--- a/drivers/mtd/nand/nand_bch.c
+++ b/drivers/mtd/nand/nand_bch.c
@@ -98,7 +98,7 @@ int nand_bch_correct_data(struct mtd_info *mtd, unsigned char *buf,
}
} else if (count < 0) {
printk(KERN_ERR "ecc unrecoverable error\n");
- count = -1;
+ count = -EBADMSG;
}
return count;
}
diff --git a/drivers/mtd/nand/nand_ecc.c b/drivers/mtd/nand/nand_ecc.c
index e612985..d1770b0 100644
--- a/drivers/mtd/nand/nand_ecc.c
+++ b/drivers/mtd/nand/nand_ecc.c
@@ -507,7 +507,7 @@ int __nand_correct_data(unsigned char *buf,
return 1; /* error in ECC data; no action needed */

pr_err("%s: uncorrectable ECC error\n", __func__);
- return -1;
+ return -EBADMSG;
}
EXPORT_SYMBOL(__nand_correct_data);

diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c
index e9cbbc6..c553f78 100644
--- a/drivers/mtd/nand/omap2.c
+++ b/drivers/mtd/nand/omap2.c
@@ -826,12 +826,12 @@ static int omap_compare_ecc(u8 *ecc_data1, /* read from NAND memory */
case 1:
/* Uncorrectable error */
pr_debug("ECC UNCORRECTED_ERROR 1\n");
- return -1;
+ return -EBADMSG;

case 11:
/* UN-Correctable error */
pr_debug("ECC UNCORRECTED_ERROR B\n");
- return -1;
+ return -EBADMSG;

case 12:
/* Correctable error */
@@ -861,7 +861,7 @@ static int omap_compare_ecc(u8 *ecc_data1, /* read from NAND memory */
return 0;
}
pr_debug("UNCORRECTED_ERROR default\n");
- return -1;
+ return -EBADMSG;
}
}

diff --git a/drivers/mtd/nand/r852.c b/drivers/mtd/nand/r852.c
index cb0bf09..5b15f2f 100644
--- a/drivers/mtd/nand/r852.c
+++ b/drivers/mtd/nand/r852.c
@@ -477,7 +477,7 @@ static int r852_ecc_correct(struct mtd_info *mtd, uint8_t *dat,

if (dev->dma_error) {
dev->dma_error = 0;
- return -1;
+ return -EIO;
}

r852_write_reg(dev, R852_CTL, dev->ctlreg | R852_CTL_ECC_ACCESS);
@@ -491,7 +491,7 @@ static int r852_ecc_correct(struct mtd_info *mtd, uint8_t *dat,
/* ecc uncorrectable error */
if (ecc_status & R852_ECC_FAIL) {
dbg("ecc: unrecoverable error, in half %d", i);
- error = -1;
+ error = -EBADMSG;
goto exit;
}

diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
index 3e92be1..5189581 100644
--- a/include/linux/mtd/nand.h
+++ b/include/linux/mtd/nand.h
@@ -456,7 +456,13 @@ struct nand_hw_control {
* @hwctl: function to control hardware ECC generator. Must only
* be provided if an hardware ECC is available
* @calculate: function for ECC calculation or readback from ECC hardware
- * @correct: function for ECC correction, matching to ECC generator (sw/hw)
+ * @correct: function for ECC correction, matching to ECC generator (sw/hw).
+ * Should return a positive number representing the number of
+ * corrected bitflips, -EBADMSG if the number of bitflips exceed
+ * ECC strength, or any other error code if the error is not
+ * directly related to correction.
+ * If -EBADMSG is returned the input buffers should be left
+ * untouched.
* @read_page_raw: function to read a raw page without ECC. This function
* should hide the specific layout used by the ECC
* controller and always return contiguous in-band and
diff --git a/include/linux/mtd/nand_bch.h b/include/linux/mtd/nand_bch.h
index 74acf53..fb0bc34 100644
--- a/include/linux/mtd/nand_bch.h
+++ b/include/linux/mtd/nand_bch.h
@@ -55,7 +55,7 @@ static inline int
nand_bch_correct_data(struct mtd_info *mtd, unsigned char *buf,
unsigned char *read_ecc, unsigned char *calc_ecc)
{
- return -1;
+ return -ENOTSUPP;
}

static inline struct nand_bch_control *
--
2.1.4

2015-12-30 19:32:24

by Boris Brezillon

[permalink] [raw]

Subject: [PATCH v4 2/5] mtd: nand: use nand_check_erased_ecc_chunk in default ECC read functions

The default NAND read functions are relying on the underlying controller
driver to correct bitflips, but some of those controllers cannot properly
fix bitflips in erased pages.
Check for bitflips in erased pages in default core functions if the driver
delegated the this check by setting the NAND_ECC_GENERIC_ERASED_CHECK flag.

Signed-off-by: Boris Brezillon <[email protected]>
---
drivers/mtd/nand/nand_base.c | 53 ++++++++++++++++++++++++++++++++++++++------
include/linux/mtd/nand.h | 10 +++++++++
2 files changed, 56 insertions(+), 7 deletions(-)

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 928081b..909a1d4 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -1426,6 +1426,16 @@ static int nand_read_subpage(struct mtd_info *mtd, struct nand_chip *chip,

stat = chip->ecc.correct(mtd, p,
&chip->buffers->ecccode[i], &chip->buffers->ecccalc[i]);
+ if (stat == -EBADMSG &&
+ (chip->ecc.options & NAND_ECC_GENERIC_ERASED_CHECK)) {
+ /* check for empty pages with bitflips */
+ stat = nand_check_erased_ecc_chunk(p, chip->ecc.size,
+ &chip->buffers->ecccode[i],
+ chip->ecc.bytes,
+ NULL, 0,
+ chip->ecc.strength);
+ }
+
if (stat < 0) {
mtd->ecc_stats.failed++;
} else {
@@ -1475,6 +1485,15 @@ static int nand_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
int stat;

stat = chip->ecc.correct(mtd, p, &ecc_code[i], &ecc_calc[i]);
+ if (stat == -EBADMSG &&
+ (chip->ecc.options & NAND_ECC_GENERIC_ERASED_CHECK)) {
+ /* check for empty pages with bitflips */
+ stat = nand_check_erased_ecc_chunk(p, eccsize,
+ &ecc_code[i], eccbytes,
+ NULL, 0,
+ chip->ecc.strength);
+ }
+
if (stat < 0) {
mtd->ecc_stats.failed++;
} else {
@@ -1527,6 +1546,15 @@ static int nand_read_page_hwecc_oob_first(struct mtd_info *mtd,
chip->ecc.calculate(mtd, p, &ecc_calc[i]);

stat = chip->ecc.correct(mtd, p, &ecc_code[i], NULL);
+ if (stat == -EBADMSG &&
+ (chip->ecc.options & NAND_ECC_GENERIC_ERASED_CHECK)) {
+ /* check for empty pages with bitflips */
+ stat = nand_check_erased_ecc_chunk(p, eccsize,
+ &ecc_code[i], eccbytes,
+ NULL, 0,
+ chip->ecc.strength);
+ }
+
if (stat < 0) {
mtd->ecc_stats.failed++;
} else {
@@ -1554,6 +1582,7 @@ static int nand_read_page_syndrome(struct mtd_info *mtd, struct nand_chip *chip,
int i, eccsize = chip->ecc.size;
int eccbytes = chip->ecc.bytes;
int eccsteps = chip->ecc.steps;
+ int eccpadbytes = eccbytes + chip->ecc.prepad + chip->ecc.postpad;
uint8_t *p = buf;
uint8_t *oob = chip->oob_poi;
unsigned int max_bitflips = 0;
@@ -1573,19 +1602,29 @@ static int nand_read_page_syndrome(struct mtd_info *mtd, struct nand_chip *chip,
chip->read_buf(mtd, oob, eccbytes);
stat = chip->ecc.correct(mtd, p, oob, NULL);

- if (stat < 0) {
- mtd->ecc_stats.failed++;
- } else {
- mtd->ecc_stats.corrected += stat;
- max_bitflips = max_t(unsigned int, max_bitflips, stat);
- }
-
oob += eccbytes;

if (chip->ecc.postpad) {
chip->read_buf(mtd, oob, chip->ecc.postpad);
oob += chip->ecc.postpad;
}
+
+ if (stat == -EBADMSG &&
+ (chip->ecc.options & NAND_ECC_GENERIC_ERASED_CHECK)) {
+ /* check for empty pages with bitflips */
+ stat = nand_check_erased_ecc_chunk(p, chip->ecc.size,
+ oob - eccpadbytes,
+ eccpadbytes,
+ NULL, 0,
+ chip->ecc.strength);
+ }
+
+ if (stat < 0) {
+ mtd->ecc_stats.failed++;
+ } else {
+ mtd->ecc_stats.corrected += stat;
+ max_bitflips = max_t(unsigned int, max_bitflips, stat);
+ }
}

/* Calculate remaining oob bytes */
diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
index 5189581..86487db 100644
--- a/include/linux/mtd/nand.h
+++ b/include/linux/mtd/nand.h
@@ -129,6 +129,14 @@ typedef enum {
/* Enable Hardware ECC before syndrome is read back from flash */
#define NAND_ECC_READSYN 2

+/*
+ * Enable generic NAND 'page erased' check. This check is only done when
+ * ecc.correct() returns -EBADMSG.
+ * Set this flag if your implementation does not fix bitflips in erased
+ * pages and you want to rely on the default implementation.
+ */
+#define NAND_ECC_GENERIC_ERASED_CHECK BIT(0)
+
/* Bit mask for flags passed to do_nand_read_ecc */
#define NAND_GET_DEVICE 0x80

@@ -451,6 +459,7 @@ struct nand_hw_control {
* @total: total number of ECC bytes per page
* @prepad: padding information for syndrome based ECC generators
* @postpad: padding information for syndrome based ECC generators
+ * @options: ECC specific options (see NAND_ECC_XXX flags defined above)
* @layout: ECC layout control struct pointer
* @priv: pointer to private ECC control data
* @hwctl: function to control hardware ECC generator. Must only
@@ -500,6 +509,7 @@ struct nand_ecc_ctrl {
int strength;
int prepad;
int postpad;
+ unsigned int options;
struct nand_ecclayout *layout;
void *priv;
void (*hwctl)(struct mtd_info *mtd, int mode);
--
2.1.4

2015-12-30 19:33:09

by Boris Brezillon

[permalink] [raw]

Subject: [PATCH v4 3/5] mtd: nand: davinci: remove custom 'erased check' implementation

The davinci drivers is manually checking for 'erased pages' while
correcting ECC bytes.
This logic can now done by the core infrastructure, and can thus be removed
from this drivers.

Signed-off-by: Boris Brezillon <[email protected]>
---
drivers/mtd/nand/davinci_nand.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/mtd/nand/davinci_nand.c b/drivers/mtd/nand/davinci_nand.c
index ddb73c3..8cb821b 100644
--- a/drivers/mtd/nand/davinci_nand.c
+++ b/drivers/mtd/nand/davinci_nand.c
@@ -317,14 +317,6 @@ static int nand_davinci_correct_4bit(struct mtd_info *mtd,
unsigned num_errors, corrected;
unsigned long timeo;

- /* All bytes 0xff? It's an erased page; ignore its ECC. */
- for (i = 0; i < 10; i++) {
- if (ecc_code[i] != 0xff)
- goto compare;
- }
- return 0;
-
-compare:
/* Unpack ten bytes into eight 10 bit values. We know we're
* little-endian, and use type punning for less shifting/masking.
*/
@@ -749,6 +741,7 @@ static int nand_davinci_probe(struct platform_device *pdev)
info->chip.ecc.correct = nand_davinci_correct_4bit;
info->chip.ecc.hwctl = nand_davinci_hwctl_4bit;
info->chip.ecc.bytes = 10;
+ info->chip.ecc.options = NAND_ECC_GENERIC_ERASED_CHECK;
} else {
info->chip.ecc.calculate = nand_davinci_calculate_1bit;
info->chip.ecc.correct = nand_davinci_correct_1bit;
--
2.1.4

2015-12-30 19:33:04

by Boris Brezillon

[permalink] [raw]

Subject: [PATCH v4 4/5] mtd: nand: diskonchip: remove custom 'erased check' implementation

The diskonchip drivers is manually checking for 'erased pages' while
correcting ECC bytes.
This logic can now done by the core infrastructure, and can thus be removed
from this drivers.

Signed-off-by: Boris Brezillon <[email protected]>
---
drivers/mtd/nand/diskonchip.c | 37 ++-----------------------------------
1 file changed, 2 insertions(+), 35 deletions(-)

diff --git a/drivers/mtd/nand/diskonchip.c b/drivers/mtd/nand/diskonchip.c
index a5c0466..a6927bf 100644
--- a/drivers/mtd/nand/diskonchip.c
+++ b/drivers/mtd/nand/diskonchip.c
@@ -74,10 +74,6 @@ struct doc_priv {
int (*late_init)(struct mtd_info *mtd);
};

-/* This is the syndrome computed by the HW ecc generator upon reading an empty
- page, one with all 0xff for data and stored ecc code. */
-static u_char empty_read_syndrome[6] = { 0x26, 0xff, 0x6d, 0x47, 0x73, 0x7a };
-
/* This is the ecc value computed by the HW ecc generator upon writing an empty
page, one with all 0xff for data. */
static u_char empty_write_ecc[6] = { 0x4b, 0x00, 0xe2, 0x0e, 0x93, 0xf7 };
@@ -912,7 +908,6 @@ static int doc200x_correct_data(struct mtd_info *mtd, u_char *dat,
void __iomem *docptr = doc->virtadr;
uint8_t calc_ecc[6];
volatile u_char dummy;
- int emptymatch = 1;

/* flush the pipeline */
if (DoC_is_2000(doc)) {
@@ -936,37 +931,9 @@ static int doc200x_correct_data(struct mtd_info *mtd, u_char *dat,
calc_ecc[i] = ReadDOC_(docptr, DoC_Mplus_ECCSyndrome0 + i);
else
calc_ecc[i] = ReadDOC_(docptr, DoC_ECCSyndrome0 + i);
- if (calc_ecc[i] != empty_read_syndrome[i])
- emptymatch = 0;
- }
- /* If emptymatch=1, the read syndrome is consistent with an
- all-0xff data and stored ecc block. Check the stored ecc. */
- if (emptymatch) {
- for (i = 0; i < 6; i++) {
- if (read_ecc[i] == 0xff)
- continue;
- emptymatch = 0;
- break;
- }
}
- /* If emptymatch still =1, check the data block. */
- if (emptymatch) {
- /* Note: this somewhat expensive test should not be triggered
- often. It could be optimized away by examining the data in
- the readbuf routine, and remembering the result. */
- for (i = 0; i < 512; i++) {
- if (dat[i] == 0xff)
- continue;
- emptymatch = 0;
- break;
- }
- }
- /* If emptymatch still =1, this is almost certainly a freshly-
- erased block, in which case the ECC will not come out right.
- We'll suppress the error and tell the caller everything's
- OK. Because it is. */
- if (!emptymatch)
- ret = doc_ecc_decode(rs_decoder, dat, calc_ecc);
+
+ ret = doc_ecc_decode(rs_decoder, dat, calc_ecc);
if (ret > 0)
printk(KERN_ERR "doc200x_correct_data corrected %d errors\n", ret);
}
--
2.1.4

2015-12-30 19:32:58

by Boris Brezillon

[permalink] [raw]

Subject: [PATCH v4 5/5] mtd: nand: jz4740: remove custom 'erased check' implementation

The jz4740 drivers is manually checking for 'erased pages' while
correcting ECC bytes.
This logic can now done by the core infrastructure, and can thus be removed
from this drivers.

Signed-off-by: Boris Brezillon <[email protected]>
---
drivers/mtd/nand/jz4740_nand.c | 18 ------------------
1 file changed, 18 deletions(-)

diff --git a/drivers/mtd/nand/jz4740_nand.c b/drivers/mtd/nand/jz4740_nand.c
index adccae3..f934060 100644
--- a/drivers/mtd/nand/jz4740_nand.c
+++ b/drivers/mtd/nand/jz4740_nand.c
@@ -224,24 +224,6 @@ static int jz_nand_correct_ecc_rs(struct mtd_info *mtd, uint8_t *dat,
uint32_t t;
unsigned int timeout = 1000;

- t = read_ecc[0];
-
- if (t == 0xff) {
- for (i = 1; i < 9; ++i)
- t &= read_ecc[i];
-
- t &= dat[0];
- t &= dat[nand->chip.ecc.size / 2];
- t &= dat[nand->chip.ecc.size - 1];
-
- if (t == 0xff) {
- for (i = 1; i < nand->chip.ecc.size - 1; ++i)
- t &= dat[i];
- if (t == 0xff)
- return 0;
- }
- }
-
for (i = 0; i < 9; ++i)
writeb(read_ecc[i], nand->base + JZ_REG_NAND_PAR0 + i);

--
2.1.4

2015-12-30 19:40:07

by Boris Brezillon

[permalink] [raw]

Subject: [PATCH v5 4/5] mtd: nand: diskonchip: remove custom 'erased check' implementation

The diskonchip drivers is manually checking for 'erased pages' while
correcting ECC bytes.
This logic can now done by the core infrastructure, and can thus be removed
from this drivers.

Signed-off-by: Boris Brezillon <[email protected]>
---
Changes since v4:
- add missing NAND_ECC_GENERIC_ERASED_CHECK flag

drivers/mtd/nand/diskonchip.c | 38 +++-----------------------------------
1 file changed, 3 insertions(+), 35 deletions(-)

diff --git a/drivers/mtd/nand/diskonchip.c b/drivers/mtd/nand/diskonchip.c
index a5c0466..4f4aa35 100644
--- a/drivers/mtd/nand/diskonchip.c
+++ b/drivers/mtd/nand/diskonchip.c
@@ -74,10 +74,6 @@ struct doc_priv {
int (*late_init)(struct mtd_info *mtd);
};

-/* This is the syndrome computed by the HW ecc generator upon reading an empty
- page, one with all 0xff for data and stored ecc code. */
-static u_char empty_read_syndrome[6] = { 0x26, 0xff, 0x6d, 0x47, 0x73, 0x7a };
-
/* This is the ecc value computed by the HW ecc generator upon writing an empty
page, one with all 0xff for data. */
static u_char empty_write_ecc[6] = { 0x4b, 0x00, 0xe2, 0x0e, 0x93, 0xf7 };
@@ -912,7 +908,6 @@ static int doc200x_correct_data(struct mtd_info *mtd, u_char *dat,
void __iomem *docptr = doc->virtadr;
uint8_t calc_ecc[6];
volatile u_char dummy;
- int emptymatch = 1;

/* flush the pipeline */
if (DoC_is_2000(doc)) {
@@ -936,37 +931,9 @@ static int doc200x_correct_data(struct mtd_info *mtd, u_char *dat,
calc_ecc[i] = ReadDOC_(docptr, DoC_Mplus_ECCSyndrome0 + i);
else
calc_ecc[i] = ReadDOC_(docptr, DoC_ECCSyndrome0 + i);
- if (calc_ecc[i] != empty_read_syndrome[i])
- emptymatch = 0;
- }
- /* If emptymatch=1, the read syndrome is consistent with an
- all-0xff data and stored ecc block. Check the stored ecc. */
- if (emptymatch) {
- for (i = 0; i < 6; i++) {
- if (read_ecc[i] == 0xff)
- continue;
- emptymatch = 0;
- break;
- }
}
- /* If emptymatch still =1, check the data block. */
- if (emptymatch) {
- /* Note: this somewhat expensive test should not be triggered
- often. It could be optimized away by examining the data in
- the readbuf routine, and remembering the result. */
- for (i = 0; i < 512; i++) {
- if (dat[i] == 0xff)
- continue;
- emptymatch = 0;
- break;
- }
- }
- /* If emptymatch still =1, this is almost certainly a freshly-
- erased block, in which case the ECC will not come out right.
- We'll suppress the error and tell the caller everything's
- OK. Because it is. */
- if (!emptymatch)
- ret = doc_ecc_decode(rs_decoder, dat, calc_ecc);
+
+ ret = doc_ecc_decode(rs_decoder, dat, calc_ecc);
if (ret > 0)
printk(KERN_ERR "doc200x_correct_data corrected %d errors\n", ret);
}
@@ -1586,6 +1553,7 @@ static int __init doc_probe(unsigned long physadr)
nand->ecc.size = 512;
nand->ecc.bytes = 6;
nand->ecc.strength = 2;
+ nand->ecc.options = NAND_ECC_GENERIC_ERASED_CHECK;
nand->bbt_options = NAND_BBT_USE_FLASH;
/* Skip the automatic BBT scan so we can run it manually */
nand->options |= NAND_SKIP_BBTSCAN;
--
2.1.4

2015-12-30 19:41:34

by Boris Brezillon

[permalink] [raw]

Subject: [PATCH v5 5/5] mtd: nand: jz4740: remove custom 'erased check' implementation

The jz4740 drivers is manually checking for 'erased pages' while
correcting ECC bytes.
This logic can now done by the core infrastructure, and can thus be removed
from this drivers.

Signed-off-by: Boris Brezillon <[email protected]>
---
Changes since v4:
- add missing NAND_ECC_GENERIC_ERASED_CHECK flag

drivers/mtd/nand/jz4740_nand.c | 19 +------------------
1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/drivers/mtd/nand/jz4740_nand.c b/drivers/mtd/nand/jz4740_nand.c
index adccae3..b19d2a9 100644
--- a/drivers/mtd/nand/jz4740_nand.c
+++ b/drivers/mtd/nand/jz4740_nand.c
@@ -224,24 +224,6 @@ static int jz_nand_correct_ecc_rs(struct mtd_info *mtd, uint8_t *dat,
uint32_t t;
unsigned int timeout = 1000;

- t = read_ecc[0];
-
- if (t == 0xff) {
- for (i = 1; i < 9; ++i)
- t &= read_ecc[i];
-
- t &= dat[0];
- t &= dat[nand->chip.ecc.size / 2];
- t &= dat[nand->chip.ecc.size - 1];
-
- if (t == 0xff) {
- for (i = 1; i < nand->chip.ecc.size - 1; ++i)
- t &= dat[i];
- if (t == 0xff)
- return 0;
- }
- }
-
for (i = 0; i < 9; ++i)
writeb(read_ecc[i], nand->base + JZ_REG_NAND_PAR0 + i);

@@ -443,6 +425,7 @@ static int jz_nand_probe(struct platform_device *pdev)
chip->ecc.size = 512;
chip->ecc.bytes = 9;
chip->ecc.strength = 4;
+ chip->ecc.options = NAND_ECC_GENERIC_ERASED_CHECK;

if (pdata)
chip->ecc.layout = pdata->ecc_layout;
--
2.1.4

2015-12-30 23:36:02

by Franklin S Cooper Jr

[permalink] [raw]

Subject: Re: [PATCH v4 0/5] mtd: nand: properly handle bitflips in erased pages

On 12/30/2015 01:32 PM, Boris Brezillon wrote:
> Hi,
>
> This patch series aims at providing a common logic to check for bitflips
> in erased pages.
>
> Currently each driver is implementing its own logic to check for bitflips
> in erased pages. Not only this create code duplication, but most of these
> implementations are incorrect.
> Here are a few aspects that are often left aside in those implementations:
> 1/ they do not check OOB bytes when checking for the ff pattern, which
> means they can consider a page as empty while the MTD user actually
> wanted to write almost ff with a few bits to zero
> 2/ they check for the ff pattern on the whole page, while ECC actually
> works on smaller chunks (usually 512 or 1024 bytes chunks)
> 3/ they use random bitflip thresholds to decide whether a page/chunk is
> erased or not. IMO this threshold should be set to ECC strength (or
> at least something correlated to this parameter)
>
> The approach taken in this series is to provide two helper functions to
> check for bitflips in erased pages. Each driver that needs to check for
> such cases can then call the nand_check_erased_ecc_chunk() function, and
> rely on the common logic to decide whether a page is erased or not.
>
> While Brian suggested a few times to make this detection automatic for
> all drivers that set a specific flag (NAND_CHECK_ERASED_BITFLIPS?), here
> is a few reasons I think this is not such a good idea:
> 1/ some (a lot of) drivers do not properly implement the raw access
> functions, and since we need to check for raw data and OOB bytes this
> makes the automatic detection unusable for most drivers unless they
> decide to correctly implement those methods (which would be a good
> thing BTW).
> 2/ as a I said earlier, this check should be made at the ECC chunk level
> and not at the page level. This spots two problems: some (a lot of)
> drivers do not properly specify the ecc layout information, and even
> if the ecc layout is correctly defined, there is no way to attach ECC
> bytes to a specific ECC chunk.
> 3/ the last aspect is the perf penalty incured by this test. Automatically
> doing that at the NAND core level implies reading the whole page again
> in raw mode, while with the helper function approach, drivers supporting
> access at the ECC chunk level can read only the faulty chunk in raw
> mode.
>
> Regarding the bitflips threshold at which an erased pages is considered as
> faulty, I have assigned it to ECC strength. As mentioned by Andrea, using
> ECC strength might cause some trouble, because if you already have some
> bitflips in an erased page, programming it might generate even more of
> them.
> In the other hand, shouldn't that be checked after (or before) programming
> a page. I mean, UBI is already capable of detecting pages which are over
> the configured bitflips_threshold and move data around when it detects
> such pages.
> If we check data after writing a page we wouldn't have to bother about
> setting a weaker value for the "bitflips in erased page" case.
> Another thing in favor of the ECC strength value for this "bitflips in
> erased page" threshold value: if the ECC engine is generating 0xff ECC
> bytes when the page is empty, then it will be able to fix ECC strength
> bitflips without complaining, so why should we use different value when
> we detect bitflips using the pattern match approach?
>
> Best Regards,
>
> Boris
>
> Changes since v3:
> - drop already applied patches
> - make the generic "bitflips in erased pages" check as an opt-in flag
> - split driver changes to ease review
> - addressed Brian's comments
>
> Changes since v2:
> - improve nand_check_erased_buf() implementation
> - keep nand_check_erased_buf() private to nand_base.c
> - patch existing ecc.correct() implementations to return consistent error
> codes
> - make the 'erased check' optional
> - remove some custom implementations of the 'erased check'
>
> Changes since v1:
> - fix the nand_check_erased_buf() function
> - mark the bitflips > bitflips_threshold condition as unlikely
> - add missing memsets in nand_check_erased_ecc_chunk()
>
>
> Boris Brezillon (5):
> mtd: nand: return consistent error codes in ecc.correct()
> implementations
> mtd: nand: use nand_check_erased_ecc_chunk in default ECC read
> functions
> mtd: nand: davinci: remove custom 'erased check' implementation
> mtd: nand: diskonchip: remove custom 'erased check' implementation
> mtd: nand: jz4740: remove custom 'erased check' implementation
>
> drivers/mtd/nand/atmel_nand.c | 2 +-
> drivers/mtd/nand/bf5xx_nand.c | 20 +++++++++++-----
> drivers/mtd/nand/davinci_nand.c | 15 ++++--------
> drivers/mtd/nand/diskonchip.c | 37 ++--------------------------
> drivers/mtd/nand/jz4740_nand.c | 22 ++---------------
> drivers/mtd/nand/mxc_nand.c | 4 ++--
> drivers/mtd/nand/nand_base.c | 53 +++++++++++++++++++++++++++++++++++------
> drivers/mtd/nand/nand_bch.c | 2 +-
> drivers/mtd/nand/nand_ecc.c | 2 +-
> drivers/mtd/nand/omap2.c | 6 ++---
> drivers/mtd/nand/r852.c | 4 ++--
> include/linux/mtd/nand.h | 18 +++++++++++++-
> include/linux/mtd/nand_bch.h | 2 +-
> 13 files changed, 96 insertions(+), 91 deletions(-)
>

Validated this patchset on TI K2E evm.

Tested-by: Franklin S Cooper Jr. <[email protected]>