2020-05-25 12:20:58

by Bean Huo

[permalink] [raw]
Subject: [PATCH v6 0/5] Micron SLC NAND filling block

From: Bean Huo <[email protected]>

Hi,
On planar 2D Micron NAND devices when a block erase command is issued,
occasionally even though a block erase operation completes and returns a pass
status, the flash block may not be completely erased. Subsequent operations to
this block on very rare cases can result in subtle failures or corruption. These
extremely rare cases should nevertheless be considered. This patchset is to
address this potential issue.

After submission of patch V1 [1] and V2 [2], we stopped its update since we get
stuck in the solution on how to avoid the power-loss issue in case power-cut
hits the block filling. In the v1 and v2, to avoid this issue, we always damaged
page0, page1, this's based on the hypothesis that NAND FS is UBIFS. This
FS-specifical code is unacceptable in the MTD layer. Also, it cannot cover all
NAND based file system. Based on the current discussion, seems that re-write all
first 15 page from page0 is a satisfactory solution.

Meanwhile, I borrowed one idea from Miquel Raynal patchset [3], in which keeps
a recode of programmed pages, base on it, for most of the cases, we don't need
to read every page to see if current erasing block is a partially programmed
block.

Changelog:

v5 - v6:
1. Fix a misleading-indentation in patch 5/5
(Reported-by: kbuild test robot <[email protected]>)
2. Rebase patch to
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git nand/next
v4 - v5:
1. Add Miquel Raynal Authorship and SoB in 4/5 and 5/5 (Miquel Raynal)
2. Change commit message in 5/5. (Steve deRosier)
3. delete unused variable max_bitflips in 4/5
v3 - v4:
1. In the patch 4/5, change to directly use ecc.strength to judge the page
is a empty page or not, rather than max_bitflips < mtd->bitflip_threshold
2. In the patch 5/5, for the powerloss case, from the next time boot up,
lots of page will be programmed from >page15 address, if still using
first_p as GENMASK() bitmask starting position, writtenp will be always 0.
fix it by changing its bitmask starting at bit position 0.
v2 - v3:
1. Rebase patch to the latest MTD git tree
2. Add a record that keeps tracking the programmed pages in the first 16 pages
3. Change from program odd pages, damage page 0 and page 1, to program all
first 15 pages
4. Address issues which exist in the V2.
v1 - v2:
1. Rebased V1 to latest Linux kernel.
2. Add erase preparation function pointer in nand_manufacturer_ops.

[1] https://www.spinics.net/lists/linux-mtd/msg04112.html
[2] https://www.spinics.net/lists/linux-mtd/msg04450.html
[3] https://www.spinics.net/lists/linux-mtd/msg13083.html

Bean Huo (3):
mtd: rawnand: group all NAND specific ops into new nand_chip_ops
mtd: rawnand: Add {pre,post}_erase hooks in nand_chip_ops
mtd: rawnand: Introduce a new function nand_check_is_erased_page()

Miquel Raynal (2):
mtd: rawnand: Add write_oob hook in nand_chip_ops
mtd: rawnand: micron: Micron SLC NAND filling block

drivers/mtd/nand/raw/internals.h | 3 +-
drivers/mtd/nand/raw/nand_base.c | 87 ++++++++++++++++++----
drivers/mtd/nand/raw/nand_hynix.c | 2 +-
drivers/mtd/nand/raw/nand_macronix.c | 10 +--
drivers/mtd/nand/raw/nand_micron.c | 104 ++++++++++++++++++++++++++-
include/linux/mtd/rawnand.h | 40 +++++++----
6 files changed, 211 insertions(+), 35 deletions(-)

--
2.17.1


2020-05-25 12:20:58

by Bean Huo

[permalink] [raw]
Subject: [PATCH v6 1/5] mtd: rawnand: group all NAND specific ops into new nand_chip_ops

From: Bean Huo <[email protected]>

This patch is to create a new structure nand_chip_ops, and take all NAND
specific functions out from nand_chip and put them in this new structure.

Signed-off-by: Bean Huo <[email protected]>
---
drivers/mtd/nand/raw/nand_base.c | 20 +++++++++---------
drivers/mtd/nand/raw/nand_hynix.c | 2 +-
drivers/mtd/nand/raw/nand_macronix.c | 10 ++++-----
drivers/mtd/nand/raw/nand_micron.c | 2 +-
include/linux/mtd/rawnand.h | 31 +++++++++++++++++-----------
5 files changed, 36 insertions(+), 29 deletions(-)

diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 6a6a0a36b3fd..b86f641f6d74 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -3285,10 +3285,10 @@ static int nand_setup_read_retry(struct nand_chip *chip, int retry_mode)
if (retry_mode >= chip->read_retries)
return -EINVAL;

- if (!chip->setup_read_retry)
+ if (!chip->ops.setup_read_retry)
return -EOPNOTSUPP;

- return chip->setup_read_retry(chip, retry_mode);
+ return chip->ops.setup_read_retry(chip, retry_mode);
}

static void nand_wait_readrdy(struct nand_chip *chip)
@@ -4532,8 +4532,8 @@ static int nand_suspend(struct mtd_info *mtd)
int ret = 0;

mutex_lock(&chip->lock);
- if (chip->suspend)
- ret = chip->suspend(chip);
+ if (chip->ops.suspend)
+ ret = chip->ops.suspend(chip);
if (!ret)
chip->suspended = 1;
mutex_unlock(&chip->lock);
@@ -4551,8 +4551,8 @@ static void nand_resume(struct mtd_info *mtd)

mutex_lock(&chip->lock);
if (chip->suspended) {
- if (chip->resume)
- chip->resume(chip);
+ if (chip->ops.resume)
+ chip->ops.resume(chip);
chip->suspended = 0;
} else {
pr_err("%s called for a chip which is not in suspended state\n",
@@ -4581,10 +4581,10 @@ static int nand_lock(struct mtd_info *mtd, loff_t ofs, uint64_t len)
{
struct nand_chip *chip = mtd_to_nand(mtd);

- if (!chip->lock_area)
+ if (!chip->ops.lock_area)
return -ENOTSUPP;

- return chip->lock_area(chip, ofs, len);
+ return chip->ops.lock_area(chip, ofs, len);
}

/**
@@ -4597,10 +4597,10 @@ static int nand_unlock(struct mtd_info *mtd, loff_t ofs, uint64_t len)
{
struct nand_chip *chip = mtd_to_nand(mtd);

- if (!chip->unlock_area)
+ if (!chip->ops.unlock_area)
return -ENOTSUPP;

- return chip->unlock_area(chip, ofs, len);
+ return chip->ops.unlock_area(chip, ofs, len);
}

/* Set default functions */
diff --git a/drivers/mtd/nand/raw/nand_hynix.c b/drivers/mtd/nand/raw/nand_hynix.c
index 7caedaa5b9e5..7d1be53f27f3 100644
--- a/drivers/mtd/nand/raw/nand_hynix.c
+++ b/drivers/mtd/nand/raw/nand_hynix.c
@@ -337,7 +337,7 @@ static int hynix_mlc_1xnm_rr_init(struct nand_chip *chip,
rr->nregs = nregs;
rr->regs = hynix_1xnm_mlc_read_retry_regs;
hynix->read_retry = rr;
- chip->setup_read_retry = hynix_nand_setup_read_retry;
+ chip->ops.setup_read_retry = hynix_nand_setup_read_retry;
chip->read_retries = nmodes;

out:
diff --git a/drivers/mtd/nand/raw/nand_macronix.c b/drivers/mtd/nand/raw/nand_macronix.c
index 09c254c97b5c..1472f925f386 100644
--- a/drivers/mtd/nand/raw/nand_macronix.c
+++ b/drivers/mtd/nand/raw/nand_macronix.c
@@ -130,7 +130,7 @@ static void macronix_nand_onfi_init(struct nand_chip *chip)
return;

chip->read_retries = MACRONIX_NUM_READ_RETRY_MODES;
- chip->setup_read_retry = macronix_nand_setup_read_retry;
+ chip->ops.setup_read_retry = macronix_nand_setup_read_retry;

if (p->supports_set_get_features) {
bitmap_set(p->set_feature_list,
@@ -242,8 +242,8 @@ static void macronix_nand_block_protection_support(struct nand_chip *chip)
bitmap_set(chip->parameters.set_feature_list,
ONFI_FEATURE_ADDR_MXIC_PROTECTION, 1);

- chip->lock_area = mxic_nand_lock;
- chip->unlock_area = mxic_nand_unlock;
+ chip->ops.lock_area = mxic_nand_lock;
+ chip->ops.unlock_area = mxic_nand_unlock;
}

static int nand_power_down_op(struct nand_chip *chip)
@@ -312,8 +312,8 @@ static void macronix_nand_deep_power_down_support(struct nand_chip *chip)
if (i < 0)
return;

- chip->suspend = mxic_nand_suspend;
- chip->resume = mxic_nand_resume;
+ chip->ops.suspend = mxic_nand_suspend;
+ chip->ops.resume = mxic_nand_resume;
}

static int macronix_nand_init(struct nand_chip *chip)
diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
index 3589b4fce0d4..4385092a9325 100644
--- a/drivers/mtd/nand/raw/nand_micron.c
+++ b/drivers/mtd/nand/raw/nand_micron.c
@@ -84,7 +84,7 @@ static int micron_nand_onfi_init(struct nand_chip *chip)
struct nand_onfi_vendor_micron *micron = (void *)p->onfi->vendor;

chip->read_retries = micron->read_retry_options;
- chip->setup_read_retry = micron_nand_setup_read_retry;
+ chip->ops.setup_read_retry = micron_nand_setup_read_retry;
}

if (p->supports_set_get_features) {
diff --git a/include/linux/mtd/rawnand.h b/include/linux/mtd/rawnand.h
index 2804c13e5662..0c73e9a81e3a 100644
--- a/include/linux/mtd/rawnand.h
+++ b/include/linux/mtd/rawnand.h
@@ -1027,6 +1027,23 @@ struct nand_legacy {
struct nand_controller dummy_controller;
};

+/**
+ * struct nand_chip_ops - NAND Chip specific operations
+ * @suspend: [REPLACEABLE] specific NAND device suspend operation
+ * @resume: [REPLACEABLE] specific NAND device resume operation
+ * @lock_area: [REPLACEABLE] specific NAND chip lock operation
+ * @unlock_area: [REPLACEABLE] specific NAND chip unlock operation
+ * @setup_read_retry: [FLASHSPECIFIC] flash (vendor) specific function for
+ * setting the read-retry mode. Mostly needed for MLC NAND.
+ */
+struct nand_chip_ops {
+ int (*suspend)(struct nand_chip *chip);
+ void (*resume)(struct nand_chip *chip);
+ int (*lock_area)(struct nand_chip *chip, loff_t ofs, u64 len);
+ int (*unlock_area)(struct nand_chip *chip, loff_t ofs, u64 len);
+ int (*setup_read_retry)(struct nand_chip *chip, int retry_mode);
+};
+
/**
* struct nand_chip - NAND Private Flash Chip Data
* @base: Inherit from the generic NAND device
@@ -1035,8 +1052,6 @@ struct nand_legacy {
* you're modifying an existing driver that is using those
* fields/hooks, you should consider reworking the driver
* avoid using them.
- * @setup_read_retry: [FLASHSPECIFIC] flash (vendor) specific function for
- * setting the read-retry mode. Mostly needed for MLC NAND.
* @ecc: [BOARDSPECIFIC] ECC control structure
* @buf_align: minimum buffer alignment required by a platform
* @oob_poi: "poison value buffer," used for laying out OOB data
@@ -1081,8 +1096,6 @@ struct nand_legacy {
* @lock: lock protecting the suspended field. Also used to
* serialize accesses to the NAND device.
* @suspended: set to 1 when the device is suspended, 0 when it's not.
- * @suspend: [REPLACEABLE] specific NAND device suspend operation
- * @resume: [REPLACEABLE] specific NAND device resume operation
* @bbt: [INTERN] bad block table pointer
* @bbt_td: [REPLACEABLE] bad block table descriptor for flash
* lookup.
@@ -1096,8 +1109,7 @@ struct nand_legacy {
* @manufacturer: [INTERN] Contains manufacturer information
* @manufacturer.desc: [INTERN] Contains manufacturer's description
* @manufacturer.priv: [INTERN] Contains manufacturer private information
- * @lock_area: [REPLACEABLE] specific NAND chip lock operation
- * @unlock_area: [REPLACEABLE] specific NAND chip unlock operation
+ * @ops: NAND-specific operations description structure
*/

struct nand_chip {
@@ -1105,8 +1117,6 @@ struct nand_chip {

struct nand_legacy legacy;

- int (*setup_read_retry)(struct nand_chip *chip, int retry_mode);
-
unsigned int options;
unsigned int bbt_options;

@@ -1138,8 +1148,6 @@ struct nand_chip {

struct mutex lock;
unsigned int suspended : 1;
- int (*suspend)(struct nand_chip *chip);
- void (*resume)(struct nand_chip *chip);

uint8_t *oob_poi;
struct nand_controller *controller;
@@ -1160,8 +1168,7 @@ struct nand_chip {
void *priv;
} manufacturer;

- int (*lock_area)(struct nand_chip *chip, loff_t ofs, uint64_t len);
- int (*unlock_area)(struct nand_chip *chip, loff_t ofs, uint64_t len);
+ struct nand_chip_ops ops;
};

extern const struct mtd_ooblayout_ops nand_ooblayout_sp_ops;
--
2.17.1

2020-05-25 12:21:01

by Bean Huo

[permalink] [raw]
Subject: [PATCH v6 2/5] mtd: rawnand: Add {pre, post}_erase hooks in nand_chip_ops

From: Bean Huo <[email protected]>

Add {pre,post}_erase hooks in the structure nand_chip_ops:
pre_erase will be called before a block is physically erased.
post_erase will be called after a block is erased.

Signed-off-by: Bean Huo <[email protected]>
---
drivers/mtd/nand/raw/nand_base.c | 18 +++++++++++++-----
include/linux/mtd/rawnand.h | 16 ++++++++++------
2 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index b86f641f6d74..3de53c42fb74 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -4369,7 +4369,7 @@ static int nand_erase(struct mtd_info *mtd, struct erase_info *instr)
int nand_erase_nand(struct nand_chip *chip, struct erase_info *instr,
int allowbbt)
{
- int page, pages_per_block, ret, chipnr;
+ int page, pages_per_block, ret, chipnr, eb;
loff_t len;

pr_debug("%s: start = 0x%012llx, len = %llu\n",
@@ -4423,16 +4423,24 @@ int nand_erase_nand(struct nand_chip *chip, struct erase_info *instr,
(page + pages_per_block))
chip->pagecache.page = -1;

- ret = nand_erase_op(chip, (page & chip->pagemask) >>
- (chip->phys_erase_shift - chip->page_shift));
+ eb = (page & chip->pagemask) >>
+ (chip->phys_erase_shift - chip->page_shift);
+
+ if (chip->ops.pre_erase)
+ chip->ops.pre_erase(chip, eb);
+
+ ret = nand_erase_op(chip, eb);
if (ret) {
- pr_debug("%s: failed erase, page 0x%08x\n",
- __func__, page);
+ pr_debug("%s: failed erase block %d, page 0x%08x\n",
+ __func__, eb, page);
instr->fail_addr =
((loff_t)page << chip->page_shift);
goto erase_exit;
}

+ if (chip->ops.post_erase)
+ chip->ops.post_erase(chip, eb);
+
/* Increment page address and decrement length */
len -= (1ULL << chip->phys_erase_shift);
page += pages_per_block;
diff --git a/include/linux/mtd/rawnand.h b/include/linux/mtd/rawnand.h
index 0c73e9a81e3a..59150f729cf0 100644
--- a/include/linux/mtd/rawnand.h
+++ b/include/linux/mtd/rawnand.h
@@ -1029,12 +1029,14 @@ struct nand_legacy {

/**
* struct nand_chip_ops - NAND Chip specific operations
- * @suspend: [REPLACEABLE] specific NAND device suspend operation
- * @resume: [REPLACEABLE] specific NAND device resume operation
- * @lock_area: [REPLACEABLE] specific NAND chip lock operation
- * @unlock_area: [REPLACEABLE] specific NAND chip unlock operation
- * @setup_read_retry: [FLASHSPECIFIC] flash (vendor) specific function for
- * setting the read-retry mode. Mostly needed for MLC NAND.
+ * @suspend: [REPLACEABLE] specific NAND device suspend operation
+ * @resume: [REPLACEABLE] specific NAND device resume operation
+ * @lock_area: [REPLACEABLE] specific NAND chip lock operation
+ * @unlock_area: [REPLACEABLE] specific NAND chip unlock operation
+ * @setup_read_retry: [FLASHSPECIFIC] flash (vendor) specific function for
+ * setting the read-retry mode. Mostly needed for MLC NAND.
+ * @pre_erase: [FLASHSPECIFIC] prepare a physical erase block
+ * @post_erase: [FLASHSPECIFIC] physical block erase post
*/
struct nand_chip_ops {
int (*suspend)(struct nand_chip *chip);
@@ -1042,6 +1044,8 @@ struct nand_chip_ops {
int (*lock_area)(struct nand_chip *chip, loff_t ofs, u64 len);
int (*unlock_area)(struct nand_chip *chip, loff_t ofs, u64 len);
int (*setup_read_retry)(struct nand_chip *chip, int retry_mode);
+ int (*pre_erase)(struct nand_chip *chip, u32 eraseblock);
+ int (*post_erase)(struct nand_chip *chip, u32 eraseblock);
};

/**
--
2.17.1

2020-05-25 12:22:59

by Bean Huo

[permalink] [raw]
Subject: [PATCH v6 5/5] mtd: rawnand: micron: Micron SLC NAND filling block

From: Bean Huo <[email protected]>

On planar 2D Micron NAND devices when a block erase command is issued,
occasionally even though a block erase operation completes and returns a
pass status, the flash block may not be completely erased. Subsequent
operations to this block on very rare cases can result in subtle failures
or corruption. These extremely rare cases should nevertheless be
considered. These rare occurrences have been observed on partially written
blocks.

To avoid this rare occurrence, we should make sure that at least 15 pages
have been programmed to a block before it is erased. In case we find that
less than 15 pages have been programmed, we will rewrite first 15 pages of
block.

Signed-off-by: Miquel Raynal <[email protected]>
Signed-off-by: Bean Huo <[email protected]>
---
drivers/mtd/nand/raw/nand_micron.c | 102 +++++++++++++++++++++++++++++
1 file changed, 102 insertions(+)

diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
index 4385092a9325..85aa17ddd3fc 100644
--- a/drivers/mtd/nand/raw/nand_micron.c
+++ b/drivers/mtd/nand/raw/nand_micron.c
@@ -36,6 +36,9 @@
#define NAND_ECC_STATUS_1_3_CORRECTED BIT(4)
#define NAND_ECC_STATUS_7_8_CORRECTED (BIT(4) | BIT(3))

+#define MICRON_SHALLOW_ERASE_MIN_PAGE 15
+#define MICRON_PAGE_MASK_TRIGGER GENMASK(MICRON_SHALLOW_ERASE_MIN_PAGE, 0)
+
struct nand_onfi_vendor_micron {
u8 two_plane_read;
u8 read_cache;
@@ -64,6 +67,7 @@ struct micron_on_die_ecc {

struct micron_nand {
struct micron_on_die_ecc ecc;
+ u16 *writtenp;
};

static int micron_nand_setup_read_retry(struct nand_chip *chip, int retry_mode)
@@ -472,6 +476,93 @@ static int micron_supports_on_die_ecc(struct nand_chip *chip)
return MICRON_ON_DIE_SUPPORTED;
}

+static int micron_nand_pre_erase(struct nand_chip *chip, u32 eraseblock)
+{
+ struct micron_nand *micron = nand_get_manufacturer_data(chip);
+ struct mtd_info *mtd = nand_to_mtd(chip);
+ u8 last_page = MICRON_SHALLOW_ERASE_MIN_PAGE - 1;
+ u32 page;
+ u8 *data_buf;
+ int ret, i;
+
+ data_buf = nand_get_data_buf(chip);
+ WARN_ON(!data_buf);
+
+ if (likely(micron->writtenp[eraseblock] & BIT(last_page)))
+ return 0;
+
+ page = eraseblock << (chip->phys_erase_shift - chip->page_shift);
+
+ if (unlikely(micron->writtenp[eraseblock] == 0)) {
+ ret = nand_read_page_raw(chip, data_buf, 1, page + last_page);
+ if (ret)
+ return ret; /* Read error */
+ ret = nand_check_is_erased_page(chip, data_buf, true);
+ if (!ret)
+ return 0;
+ }
+
+ memset(data_buf, 0x00, mtd->writesize);
+
+ for (i = 0; i < MICRON_SHALLOW_ERASE_MIN_PAGE; i++) {
+ ret = nand_write_page_raw(chip, data_buf, false, page + i);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static int micron_nand_post_erase(struct nand_chip *chip, u32 eraseblock)
+{
+ struct micron_nand *micron = nand_get_manufacturer_data(chip);
+
+ if (!micron)
+ return -EINVAL;
+
+ micron->writtenp[eraseblock] = 0;
+
+ return 0;
+}
+
+static int micron_nand_write_oob(struct nand_chip *chip, loff_t to,
+ struct mtd_oob_ops *ops)
+{
+ struct micron_nand *micron = nand_get_manufacturer_data(chip);
+ u32 eb_sz = nanddev_eraseblock_size(&chip->base);
+ u32 p_sz = nanddev_page_size(&chip->base);
+ u32 ppeb = nanddev_pages_per_eraseblock(&chip->base);
+ u32 nb_p_tot = ops->len / p_sz;
+ u32 first_eb = DIV_ROUND_DOWN_ULL(to, eb_sz);
+ u32 first_p = DIV_ROUND_UP_ULL(to - (first_eb * eb_sz), p_sz);
+ u32 nb_eb = DIV_ROUND_UP_ULL(first_p + nb_p_tot, ppeb);
+ u32 remaining_p, eb, nb_p;
+ int ret;
+
+ ret = nand_write_oob_nand(chip, to, ops);
+
+ if (ret || ops->len != ops->retlen)
+ return ret;
+
+ /* Mark the last pages of the first erase block to write */
+ nb_p = min(nb_p_tot, ppeb - first_p);
+ micron->writtenp[first_eb] |= GENMASK(first_p + nb_p, 0) &
+ MICRON_PAGE_MASK_TRIGGER;
+ remaining_p = nb_p_tot - nb_p;
+
+ /* Mark all the pages of all "in-the-middle" erase blocks */
+ for (eb = first_eb + 1; eb < first_eb + nb_eb - 1; eb++) {
+ micron->writtenp[eb] |= MICRON_PAGE_MASK_TRIGGER;
+ remaining_p -= ppeb;
+ }
+
+ /* Mark the first pages of the last erase block to write */
+ if (remaining_p)
+ micron->writtenp[eb] |= GENMASK(remaining_p - 1, 0) &
+ MICRON_PAGE_MASK_TRIGGER;
+ return 0;
+}
+
static int micron_nand_init(struct nand_chip *chip)
{
struct mtd_info *mtd = nand_to_mtd(chip);
@@ -558,6 +649,17 @@ static int micron_nand_init(struct nand_chip *chip)
}
}

+ if (nand_is_slc(chip)) {
+ micron->writtenp = kcalloc(nanddev_neraseblocks(&chip->base),
+ sizeof(u16), GFP_KERNEL);
+ if (!micron->writtenp)
+ goto err_free_manuf_data;
+
+ chip->ops.write_oob = micron_nand_write_oob;
+ chip->ops.pre_erase = micron_nand_pre_erase;
+ chip->ops.post_erase = micron_nand_post_erase;
+ }
+
return 0;

err_free_manuf_data:
--
2.17.1

2020-05-25 18:33:45

by Bean Huo

[permalink] [raw]
Subject: [PATCH v6 4/5] mtd: rawnand: Introduce a new function nand_check_is_erased_page()

From: Bean Huo <[email protected]>

Add a new function nand_check_is_erased_page() in nand_base.c, which is
used to check whether one programmable page is empty or already programmed.

Signed-off-by: Bean Huo <[email protected]>
---
drivers/mtd/nand/raw/nand_base.c | 40 ++++++++++++++++++++++++++++++++
include/linux/mtd/rawnand.h | 2 ++
2 files changed, 42 insertions(+)

diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index ab39bb33e688..05ee32106af9 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -2697,6 +2697,46 @@ int nand_check_erased_ecc_chunk(void *data, int datalen,
}
EXPORT_SYMBOL(nand_check_erased_ecc_chunk);

+/**
+ * nand_check_is_erased_page - check if this page is a empty page
+ * @chip: nand chip info structure
+ * @page_data: data buffer containing the data in the page being checked
+ * @oob: indicate if chip->oob_poi points to oob date of the page
+ *
+ * Returns true if this is an un-programmed page, false otherwise.
+ */
+int nand_check_is_erased_page(struct nand_chip *chip, u8 *page_data, bool oob)
+{
+ struct mtd_info *mtd = nand_to_mtd(chip);
+ int ret, i;
+ u8 *databuf, *eccbuf = NULL;
+ struct mtd_oob_region oobregion;
+ int datasize, eccbytes = 0;
+
+ databuf = page_data;
+ datasize = chip->ecc.size;
+
+ if (oob) {
+ mtd_ooblayout_ecc(mtd, 0, &oobregion);
+ eccbuf = chip->oob_poi + oobregion.offset;
+ eccbytes = chip->ecc.bytes;
+ }
+
+ for (i = 0; i < chip->ecc.steps; i++) {
+ ret = nand_check_erased_ecc_chunk(databuf, datasize,
+ eccbuf, eccbytes,
+ NULL, 0, chip->ecc.strength);
+ if (ret < 0)
+ return false;
+
+ databuf += chip->ecc.size;
+ eccbuf += chip->ecc.bytes;
+ }
+
+ return true;
+}
+EXPORT_SYMBOL(nand_check_is_erased_page);
+
/**
* nand_read_page_raw_notsupp - dummy read raw page function
* @chip: nand chip info structure
diff --git a/include/linux/mtd/rawnand.h b/include/linux/mtd/rawnand.h
index ae1cc60260a7..44c2715691bb 100644
--- a/include/linux/mtd/rawnand.h
+++ b/include/linux/mtd/rawnand.h
@@ -1339,6 +1339,8 @@ int nand_check_erased_ecc_chunk(void *data, int datalen,
void *extraoob, int extraooblen,
int threshold);

+int nand_check_is_erased_page(struct nand_chip *chip, u8 *page_data, bool oob);
+
int nand_ecc_choose_conf(struct nand_chip *chip,
const struct nand_ecc_caps *caps, int oobavail);

--
2.17.1

2020-05-25 20:19:25

by Bean Huo

[permalink] [raw]
Subject: [PATCH v6 3/5] mtd: rawnand: Add write_oob hook in nand_chip_ops

From: Bean Huo <[email protected]>

Break the function nand_write_oob() into two functions, and one of them
is named nand_write_oob_nand(), which will be assigned to new added hook
write_oob by default. The hook write_oob will be overwritten in the NAND
vendor lower-level driver if needed.

Signed-off-by: Miquel Raynal <[email protected]>
Signed-off-by: Bean Huo <[email protected]>
---
drivers/mtd/nand/raw/internals.h | 3 ++-
drivers/mtd/nand/raw/nand_base.c | 9 +++++++++
include/linux/mtd/rawnand.h | 3 +++
3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/mtd/nand/raw/internals.h b/drivers/mtd/nand/raw/internals.h
index 03866b0aadea..94d300a207ac 100644
--- a/drivers/mtd/nand/raw/internals.h
+++ b/drivers/mtd/nand/raw/internals.h
@@ -99,7 +99,8 @@ int nand_read_param_page_op(struct nand_chip *chip, u8 page, void *buf,
void nand_decode_ext_id(struct nand_chip *chip);
void panic_nand_wait(struct nand_chip *chip, unsigned long timeo);
void sanitize_string(uint8_t *s, size_t len);
-
+int nand_write_oob_nand(struct nand_chip *chip, loff_t to,
+ struct mtd_oob_ops *ops);
static inline bool nand_has_exec_op(struct nand_chip *chip)
{
if (!chip->controller || !chip->controller->ops ||
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 3de53c42fb74..ab39bb33e688 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -4318,6 +4318,13 @@ static int nand_write_oob(struct mtd_info *mtd, loff_t to,
struct mtd_oob_ops *ops)
{
struct nand_chip *chip = mtd_to_nand(mtd);
+
+ return chip->ops.write_oob(chip, to, ops);
+}
+
+int nand_write_oob_nand(struct nand_chip *chip, loff_t to,
+ struct mtd_oob_ops *ops)
+{
int ret;

ops->retlen = 0;
@@ -4624,6 +4631,8 @@ static void nand_set_defaults(struct nand_chip *chip)

if (!chip->buf_align)
chip->buf_align = 1;
+
+ chip->ops.write_oob = nand_write_oob_nand;
}

/* Sanitize ONFI strings so we can safely print them */
diff --git a/include/linux/mtd/rawnand.h b/include/linux/mtd/rawnand.h
index 59150f729cf0..ae1cc60260a7 100644
--- a/include/linux/mtd/rawnand.h
+++ b/include/linux/mtd/rawnand.h
@@ -1037,6 +1037,7 @@ struct nand_legacy {
* setting the read-retry mode. Mostly needed for MLC NAND.
* @pre_erase: [FLASHSPECIFIC] prepare a physical erase block
* @post_erase: [FLASHSPECIFIC] physical block erase post
+ * @write_oob: [REPLACEABLE] Raw NAND write operation
*/
struct nand_chip_ops {
int (*suspend)(struct nand_chip *chip);
@@ -1046,6 +1047,8 @@ struct nand_chip_ops {
int (*setup_read_retry)(struct nand_chip *chip, int retry_mode);
int (*pre_erase)(struct nand_chip *chip, u32 eraseblock);
int (*post_erase)(struct nand_chip *chip, u32 eraseblock);
+ int (*write_oob)(struct nand_chip *chip, loff_t to,
+ struct mtd_oob_ops *ops);
};

/**
--
2.17.1

2020-05-28 14:19:58

by Bean Huo

[permalink] [raw]
Subject: Re: [PATCH v6 0/5] Micron SLC NAND filling block

hi, Richard


On Mon, 2020-05-25 at 14:18 +0200, Bean Huo wrote:
> After submission of patch V1 [1] and V2 [2], we stopped its update
> since we get
> stuck in the solution on how to avoid the power-loss issue in case
> power-cut
> hits the block filling. In the v1 and v2, to avoid this issue, we
> always damaged
> page0, page1, this's based on the hypothesis that NAND FS is UBIFS.
> This
> FS-specifical code is unacceptable in the MTD layer. Also, it cannot
> cover all
> NAND based file system. Based on the current discussion, seems that
> re-write all
> first 15 page from page0 is a satisfactory solution.

This patch has overwrite page0~page14, damage EC and VID header boths.
I know this is safe for UBIFS, even fastmap is enabled (you fixed this
in (ubi: fastmap: Correctly handle interrupted erasures in EBA)).
Now, how about jffs2?


Thanks,
Bean

2020-06-01 21:13:37

by Bean Huo

[permalink] [raw]
Subject: Re: [PATCH v6 0/5] Micron SLC NAND filling block


Hi Richard
would you please help us confirm below question??

Thanks,
Bean

On Thu, 2020-05-28 at 16:14 +0200, Bean Huo wrote:
> hi, Richard
>
>
> On Mon, 2020-05-25 at 14:18 +0200, Bean Huo wrote:
> > After submission of patch V1 [1] and V2 [2], we stopped its update
> > since we get
> > stuck in the solution on how to avoid the power-loss issue in case
> > power-cut
> > hits the block filling. In the v1 and v2, to avoid this issue, we
> > always damaged
> > page0, page1, this's based on the hypothesis that NAND FS is UBIFS.
> > This
> > FS-specifical code is unacceptable in the MTD layer. Also, it
> > cannot
> > cover all
> > NAND based file system. Based on the current discussion, seems that
> > re-write all
> > first 15 page from page0 is a satisfactory solution.
>

> This patch has overwrite page0~page14, damage EC and VID header
> boths.
> I know this is safe for UBIFS, even fastmap is enabled (you fixed
> this in (ubi: fastmap: Correctly handle interrupted erasures in
> EBA)).
> Now, how about jffs2?
>
>
> Thanks,
> Bean
>

2020-06-02 07:08:39

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v6 0/5] Micron SLC NAND filling block

Hello

Bean Huo <[email protected]> wrote on Mon, 01 Jun 2020 23:10:43 +0200:

> Hi Richard
> would you please help us confirm below question??
>
> Thanks,
> Bean
>
> On Thu, 2020-05-28 at 16:14 +0200, Bean Huo wrote:
> > hi, Richard
> >
> >
> > On Mon, 2020-05-25 at 14:18 +0200, Bean Huo wrote:
> > > After submission of patch V1 [1] and V2 [2], we stopped its update
> > > since we get
> > > stuck in the solution on how to avoid the power-loss issue in case
> > > power-cut
> > > hits the block filling. In the v1 and v2, to avoid this issue, we
> > > always damaged
> > > page0, page1, this's based on the hypothesis that NAND FS is UBIFS.
> > > This
> > > FS-specifical code is unacceptable in the MTD layer. Also, it
> > > cannot
> > > cover all
> > > NAND based file system. Based on the current discussion, seems that
> > > re-write all
> > > first 15 page from page0 is a satisfactory solution.
> >
>
> > This patch has overwrite page0~page14, damage EC and VID header
> > boths.
> > I know this is safe for UBIFS, even fastmap is enabled (you fixed
> > this in (ubi: fastmap: Correctly handle interrupted erasures in
> > EBA)).
> > Now, how about jffs2?
> >
> >
> > Thanks,
> > Bean
> >
>

FYI, Bean already askes me privately and here was my answer. Feel free
to comment.

---8<---
I'm not sure we are synced on this issue, because it is clearly
not addressed in your patchset.

Quoting Richard, the ubifs log uses a fixed range of LEBs. It replays
them upon mount and checks whether they are empty, new or outdated refs
it assumes that interrupted erase got detecred by UBI and such a LEB
will just contain 0xFF bytes. Rewriting the page before erase basically
fails this assumption.

For JFFS2, the problem is the clean marker. You cannot destroy the
payload while keeping the clean marker which says "this block is fine
and contain data".

Also, if you destroy the clean marker, you need to take care of not
turning the block being discovered as "bad" at reboot time if a power
cut happens before the erasure.

All of this is not impossible but:
- we need to write specific code for each user
- we don't want to create more problems that we already have

I cannot give you more details because this is not something that I
master. Ask Richard directly if you need more details on this.
--->8---

Cheers,
Miquèl

2020-06-02 07:50:41

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v6 0/5] Micron SLC NAND filling block

Hi Bean,

On Mon, 01 Jun 2020 23:10:43 +0200
Bean Huo <[email protected]> wrote:

> Hi Richard
> would you please help us confirm below question??

Miquel suggested an approach that would allow us to deal with both JFFS2
and UBI/UBIFS without having any FS/wear-leveling specific code at the
NAND level, but you decided to ignore his comments. Sorry but there's
nothing we can do to help you if you don't listen to our
recommendations.

I've been quite disappointed by your behavior in the past, and it
continues. Recently you've taken Miquel's patches and claimed ownership
on them (probably not intentionally, but still) while you were clearly
unable to rework your original series the way I suggested (which Miquel
did after seeing you would never send new versions). And when Miquel
suggested a change to the implementation he had done based on the
discussion we had with Richard, you decided to ignore it and pursue in
the original direction. So, quite frankly, I'm really not convinced you
can conduct such a change.

Regards,

Boris

2020-06-02 09:02:49

by Bean Huo

[permalink] [raw]
Subject: Re: [PATCH v6 0/5] Micron SLC NAND filling block

On Tue, 2020-06-02 at 09:48 +0200, Boris Brezillon wrote:
> Hi Bean,
>
> On Mon, 01 Jun 2020 23:10:43 +0200
> Bean Huo <[email protected]> wrote:
>
> > Hi Richard
> > would you please help us confirm below question??
>
> Miquel suggested an approach that would allow us to deal with both
> JFFS2
> and UBI/UBIFS without having any FS/wear-leveling specific code at
> the
> NAND level, but you decided to ignore his comments. Sorry but there's
> nothing we can do to help you if you don't listen to our
> recommendations.

Expose this issue to FS layer, it is not good idea. that will impact
more code, and involve duplicated code.
>
> I've been quite disappointed by your behavior in the past, and it

> continues. Recently you've taken Miquel's patches and claimed
> ownership
did you seem my recent patch? you can ignore that see this.


> on them (probably not intentionally, but still) while you were
> clearly
> unable to rework your original series the way I suggested (which
> Miquel
> did after seeing you would never send new versions).

seriously?

> And when Miquel
> suggested a change to the implementation he had done based on the
> discussion we had with Richard, you decided to ignore it and pursue
> in
> the original direction. So, quite frankly, I'm really not convinced
> you
> can conduct such a change.
>

As Miquel mentioned, we need richard's final comfirmation,
If he agrees with this proposal, I give up my current patch.

2020-06-02 11:55:54

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v6 0/5] Micron SLC NAND filling block

On Tue, 02 Jun 2020 10:59:46 +0200
Bean Huo <[email protected]> wrote:

> On Tue, 2020-06-02 at 09:48 +0200, Boris Brezillon wrote:
> > Hi Bean,
> >
> > On Mon, 01 Jun 2020 23:10:43 +0200
> > Bean Huo <[email protected]> wrote:
> >
> > > Hi Richard
> > > would you please help us confirm below question??
> >
> > Miquel suggested an approach that would allow us to deal with both
> > JFFS2
> > and UBI/UBIFS without having any FS/wear-leveling specific code at
> > the
> > NAND level, but you decided to ignore his comments. Sorry but there's
> > nothing we can do to help you if you don't listen to our
> > recommendations.
>
> Expose this issue to FS layer, it is not good idea. that will impact
> more code, and involve duplicated code.

Sorry but as far as I'm concerned, you've lost the right to have your
word in such design choices a long time ago. You can't deliberately lie
to us for several weeks/months and expect us to trust you (your
judgment) after that.

Back to the actual proposal, it's something that came from a discussion
we had with Miquel and Richard. It's certainly not perfect, but neither
is the option of hardcoding a quirk for JFFS2/UBI/UBIFS in the Micron
NAND driver.

BTW, I think you completely occluded Miquel's suggestion to have a
generic implementation at the MTD level for users who don't care about
the pattern that's written to those 'soon-to-be-erased' blocks. See,
that's one of the things I'm complaining about. You seem to ignore
(don't know if it's deliberate or not) some of the suggestions we do.

> >
> > I've been quite disappointed by your behavior in the past, and it
>
> > continues. Recently you've taken Miquel's patches and claimed
> > ownership
> did you seem my recent patch? you can ignore that see this.

I don't understand what you mean here, sorry.

>
>
> > on them (probably not intentionally, but still) while you were
> > clearly
> > unable to rework your original series the way I suggested (which
> > Miquel
> > did after seeing you would never send new versions).
>
> seriously?

Yes, seriously!

>
> > And when Miquel
> > suggested a change to the implementation he had done based on the
> > discussion we had with Richard, you decided to ignore it and pursue
> > in
> > the original direction. So, quite frankly, I'm really not convinced
> > you
> > can conduct such a change.
> >
>
> As Miquel mentioned, we need richard's final comfirmation,
> If he agrees with this proposal, I give up my current patch.
>

Actually, you need more than Richard's blessing. Miquel has to agree on
the NAND changes, and even if I can't block the solution, I think I can
at least give my opinion: anything that involves FS/wear-leveling
specific code at the NAND level should be avoided. Given the discussion
we had regarding JFFS2 and the cleanmarkers, I don't think we can come
up with a solution that's safe for every users, hence the proposal to
empower users with this responsibility.