Add the basic driver for Arasan NAND Flash Controller used in
Zynq UltraScale+ MPSoC. It supports HW ECC and upto 24bit correction
Signed-off-by: Naga Sureshkumar Relli <[email protected]>
---
Changes in v12:
- Rebased on top of 4.20
- As suggested by Boris, instead of checking the command using nfc_op.cmds[],
use PROG_PGRD or PROG_PGPROG as appropriate in reads and writes.
- Also use address cycles information provided by core instead of guessing it.
Changes in v11:
Fixed the below commits given by Boris
- implemented separate hooks for each pattern
- Changed EVNT_TIMEOUT_MSEC to EVENT_TIMEOUT_MSEC
- Grouped register offsets with theri fields, previously
there are defines at randome positions
- changes cmnds to cmds and s32 to u32
- Removed unnecessary fields from struct anfc_op
- Renamed bch and bchmode to strength and ecc_strength respectively
- Passed nand_chip object direclty to all functions
- Replace is_vmalloc_addr() with virt_addr_valid()
- Use default routines for read/write_oob()
- Added core support to get sdr timing mode value
Changes in v10:
- Implemented ->exec_op() interface.
- Converted the driver to nand_scan().
Changes in v9:
- Added the SPDX tags
Changes in v8:
- Implemented setup_data_interface hook
- fixed checkpatch --strict warnings
- Added anfc_config_ecc in read_page_hwecc
- Fixed returning status value by reading flash status in read_byte()
instead of reading previous value.
Changes in v7:
- Implemented Marek suggestions and comments
- Corrected the acronyms those should be in caps
- Modified kconfig/Make file to keep arasan entry in sorted order
- Added is_vmlloc_addr check
- Used ioread/write32_rep variants to avoid compilation error for intel
platforms
- separated PIO and DMA mode read/write functions
- Minor cleanup
Chnages in v6:
- Addressed most of the Brian and Boris comments
- Separated the nandchip from the nand controller
- Removed the ecc lookup table from driver
- Now use framework nand waitfunction and readoob
- Fixed the compiler warning
- Adapted the new frameowrk changes related to ecc and ooblayout
- Disabled the clocks after the nand_reelase
- Now using only one completion object
- Boris suggessions like adapting cmd_ctrl and rework on read/write byte
are not implemented and i will patch them later
- Also check_erased_ecc_chunk for erase and check for is_vmalloc_addr will
implement later once the basic driver is mainlined.
Changes in v5:
- Renamed the driver filei as arasan_nand.c
- Fixed all comments relaqted coding style
- Fixed comments related to propagating the errors
- Modified the anfc_write_page_hwecc as per the write_page
prototype
Changes in v4:
- Added support for onfi timing mode configuration
- Added clock supppport
- Added support for multiple chipselects
Changes in v3:
- Removed unused variables
- Avoided busy loop and used jifies based implementation
- Fixed compiler warnings "right shift count >= width of type"
- Removed unneeded codei and improved error reporting
- Added onfi version check to ensure reading the valid address cycles
Changes in v2:
- Added missing of.h to avoid kbuild system report erro
---
drivers/mtd/nand/raw/Kconfig | 7 +
drivers/mtd/nand/raw/Makefile | 1 +
drivers/mtd/nand/raw/arasan_nand.c | 1238 ++++++++++++++++++++++++++++++++++++
3 files changed, 1246 insertions(+)
create mode 100644 drivers/mtd/nand/raw/arasan_nand.c
diff --git a/drivers/mtd/nand/raw/Kconfig b/drivers/mtd/nand/raw/Kconfig
index c7efc31..3f7ae73 100644
--- a/drivers/mtd/nand/raw/Kconfig
+++ b/drivers/mtd/nand/raw/Kconfig
@@ -541,4 +541,11 @@ config MTD_NAND_TEGRA
is supported. Extra OOB bytes when using HW ECC are currently
not supported.
+config MTD_NAND_ARASAN
+ tristate "Support for Arasan Nand Flash controller"
+ depends on HAS_IOMEM && HAS_DMA
+ help
+ Enables the driver for the Arasan Nand Flash controller on
+ Zynq Ultrascale+ MPSoC.
+
endif # MTD_NAND
diff --git a/drivers/mtd/nand/raw/Makefile b/drivers/mtd/nand/raw/Makefile
index 57159b3..042d53d 100644
--- a/drivers/mtd/nand/raw/Makefile
+++ b/drivers/mtd/nand/raw/Makefile
@@ -56,6 +56,7 @@ obj-$(CONFIG_MTD_NAND_BRCMNAND) += brcmnand/
obj-$(CONFIG_MTD_NAND_QCOM) += qcom_nandc.o
obj-$(CONFIG_MTD_NAND_MTK) += mtk_ecc.o mtk_nand.o
obj-$(CONFIG_MTD_NAND_TEGRA) += tegra_nand.o
+obj-$(CONFIG_MTD_NAND_ARASAN) += arasan_nand.o
nand-objs := nand_base.o nand_legacy.o nand_bbt.o nand_timings.o nand_ids.o
nand-objs += nand_onfi.o
diff --git a/drivers/mtd/nand/raw/arasan_nand.c b/drivers/mtd/nand/raw/arasan_nand.c
new file mode 100644
index 0000000..b8f39c3
--- /dev/null
+++ b/drivers/mtd/nand/raw/arasan_nand.c
@@ -0,0 +1,1238 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Arasan NAND Flash Controller Driver
+ *
+ * Copyright (C) 2014 - 2017 Xilinx, Inc.
+ * Author: Punnaiah Choudary Kalluri <[email protected]>
+ * Author: Naga Sureshkumar Relli <[email protected]>
+ *
+ */
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/module.h>
+#include <linux/mtd/mtd.h>
+#include <linux/mtd/rawnand.h>
+#include <linux/mtd/partitions.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/delay.h>
+#include <linux/mtd/nand_bch.h>
+
+#define EVENT_TIMEOUT_MSEC 1000
+
+#define PKT_OFST 0x00
+#define PKT_CNT_SHIFT 12
+
+#define MEM_ADDR1_OFST 0x04
+#define MEM_ADDR2_OFST 0x08
+#define PG_ADDR_SHIFT 16
+#define BCH_MODE_SHIFT 25
+#define MEM_ADDR_MASK GENMASK(7, 0)
+#define BCH_MODE_MASK GENMASK(27, 25)
+#define CS_MASK GENMASK(31, 30)
+#define CS_SHIFT 30
+
+#define CMD_OFST 0x0C
+#define ECC_ENABLE BIT(31)
+#define DMA_EN_MASK GENMASK(27, 26)
+#define DMA_ENABLE 0x2
+#define DMA_EN_SHIFT 26
+#define REG_PAGE_SIZE_SHIFT 23
+
+#define PROG_OFST 0x10
+#define PROG_PGRD BIT(0)
+#define PROG_ERASE BIT(2)
+#define PROG_STATUS BIT(3)
+#define PROG_PGPROG BIT(4)
+#define PROG_RDID BIT(6)
+#define PROG_RDPARAM BIT(7)
+#define PROG_RST BIT(8)
+#define PROG_GET_FEATURE BIT(9)
+#define PROG_SET_FEATURE BIT(10)
+
+#define INTR_STS_EN_OFST 0x14
+#define INTR_SIG_EN_OFST 0x18
+#define XFER_COMPLETE BIT(2)
+#define READ_READY BIT(1)
+#define WRITE_READY BIT(0)
+#define MBIT_ERROR BIT(3)
+#define EVENT_MASK (XFER_COMPLETE | READ_READY | WRITE_READY | MBIT_ERROR)
+
+#define INTR_STS_OFST 0x1C
+#define READY_STS_OFST 0x20
+#define DMA_ADDR1_OFST 0x24
+#define FLASH_STS_OFST 0x28
+#define DATA_PORT_OFST 0x30
+#define ECC_OFST 0x34
+#define BCH_EN_SHIFT 27
+#define ECC_SIZE_SHIFT 16
+
+#define ECC_ERR_CNT_OFST 0x38
+#define PAGE_ERR_CNT_MASK GENMASK(16, 8)
+#define PKT_ERR_CNT_MASK GENMASK(7, 0)
+
+#define ECC_SPR_CMD_OFST 0x3C
+#define CMD2_SHIFT 8
+#define ADDR_CYCLES_SHIFT 28
+
+#define ECC_ERR_CNT_1BIT_OFST 0x40
+#define ECC_ERR_CNT_2BIT_OFST 0x44
+#define DMA_ADDR0_OFST 0x50
+#define DATA_INTERFACE_OFST 0x6C
+#define ANFC_MAX_CHUNK_SIZE 0x4000
+#define ANFC_MAX_ADDR_CYCLES 7
+
+#define REG_PAGE_SIZE_512 0
+#define REG_PAGE_SIZE_1K 5
+#define REG_PAGE_SIZE_2K 1
+#define REG_PAGE_SIZE_4K 2
+#define REG_PAGE_SIZE_8K 3
+#define REG_PAGE_SIZE_16K 4
+
+#define TEMP_BUF_SIZE 1024
+#define SDR_MODE_PACKET_SIZE 4
+
+#define SDR_MODE_DEFLT_FREQ 80000000
+#define COL_ROW_ADDR(pos, val) (((val) & 0xFF) << (8 * (pos)))
+
+/*
+ * Arasan NAND controller can't detect errors beyond 24-bit in BCH
+ * For an erased page we observed that multibit error count as 16
+ * with 24-bit ECC. so if the count is equal to or greater than 16
+ * then we can say that its an uncorrectable ECC error.
+ */
+#define MULTI_BIT_ERR_CNT 16
+
+struct anfc_op {
+ u32 cmds[4];
+ u32 len;
+ u32 col;
+ u32 row;
+ unsigned int data_instr_idx;
+ const struct nand_op_instr *data_instr;
+ u32 naddrcycles;
+};
+
+/**
+ * struct anfc_nand_chip - Defines the nand chip related information
+ * @node: Used to store NAND chips into a list.
+ * @chip: NAND chip information structure.
+ * @strength: Bch or Hamming mode enable/disable.
+ * @ecc_strength: Ecc strength 4.8/12/16.
+ * @eccval: Ecc config value.
+ * @spare_caddr_cycles: Column address cycle information for spare area.
+ * @pktsize: Packet size for read / write operation.
+ * @csnum: chipselect number to be used.
+ * @spktsize: Packet size in ddr mode for status operation.
+ * @inftimeval: Data interface and timing mode information
+ */
+struct anfc_nand_chip {
+ struct list_head node;
+ struct nand_chip chip;
+ bool strength;
+ u32 ecc_strength;
+ u32 eccval;
+ u16 spare_caddr_cycles;
+ u32 pktsize;
+ int csnum;
+ u32 spktsize;
+ u32 inftimeval;
+};
+
+/**
+ * struct anfc_nand_controller - Defines the Arasan NAND flash controller
+ * driver instance
+ * @controller: base controller structure.
+ * @chips: list of all nand chips attached to the ctrler.
+ * @dev: Pointer to the device structure.
+ * @base: Virtual address of the NAND flash device.
+ * @clk_sys: Pointer to the system clock.
+ * @clk_flash: Pointer to the flash clock.
+ * @dma: Dma enable/disable.
+ * @buf: Buffer used for read/write byte operations.
+ * @irq: irq number
+ * @bufshift: Variable used for indexing buffer operation
+ * @csnum: Chip select number currently inuse.
+ * @event: Completion event for nand status events.
+ * @status: Status of the flash device.
+ * @prog: Used to initiate controller operations.
+ */
+struct anfc_nand_controller {
+ struct nand_controller controller;
+ struct list_head chips;
+ struct device *dev;
+ void __iomem *base;
+ struct clk *clk_sys;
+ struct clk *clk_flash;
+ int irq;
+ int csnum;
+ struct completion event;
+ int status;
+ u8 buf[TEMP_BUF_SIZE];
+ u32 eccval;
+};
+
+static int anfc_ooblayout_ecc(struct mtd_info *mtd, int section,
+ struct mtd_oob_region *oobregion)
+{
+ struct nand_chip *nand = mtd_to_nand(mtd);
+
+ if (section >= nand->ecc.steps)
+ return -ERANGE;
+
+ if (section)
+ return -ERANGE;
+
+ oobregion->length = nand->ecc.total;
+ oobregion->offset = mtd->oobsize - oobregion->length;
+
+ return 0;
+}
+
+static int anfc_ooblayout_free(struct mtd_info *mtd, int section,
+ struct mtd_oob_region *oobregion)
+{
+ struct nand_chip *nand = mtd_to_nand(mtd);
+
+ if (section >= nand->ecc.steps)
+ return -ERANGE;
+
+ if (section)
+ return -ERANGE;
+
+ oobregion->offset = 2;
+ oobregion->length = mtd->oobsize - nand->ecc.total - 2;
+
+ return 0;
+}
+
+static const struct mtd_ooblayout_ops anfc_ooblayout_ops = {
+ .ecc = anfc_ooblayout_ecc,
+ .free = anfc_ooblayout_free,
+};
+
+static inline struct anfc_nand_chip *to_anfc_nand(struct nand_chip *nand)
+{
+ return container_of(nand, struct anfc_nand_chip, chip);
+}
+
+static inline struct anfc_nand_controller *to_anfc(struct nand_controller *ctrl)
+{
+ return container_of(ctrl, struct anfc_nand_controller, controller);
+}
+
+static u8 anfc_page(u32 pagesize)
+{
+ switch (pagesize) {
+ case 512:
+ return REG_PAGE_SIZE_512;
+ case 1024:
+ return REG_PAGE_SIZE_1K;
+ case 2048:
+ return REG_PAGE_SIZE_2K;
+ case 4096:
+ return REG_PAGE_SIZE_4K;
+ case 8192:
+ return REG_PAGE_SIZE_8K;
+ case 16384:
+ return REG_PAGE_SIZE_16K;
+ default:
+ break;
+ }
+
+ return 0;
+}
+
+static inline void anfc_enable_intrs(struct anfc_nand_controller *nfc, u32 val)
+{
+ writel(val, nfc->base + INTR_STS_EN_OFST);
+ writel(val, nfc->base + INTR_SIG_EN_OFST);
+}
+
+static inline void anfc_config_ecc(struct anfc_nand_controller *nfc, bool on)
+{
+ if (on)
+ nfc->eccval = 1;
+ else
+ nfc->eccval = 0;
+}
+
+static inline void anfc_config_dma(struct anfc_nand_controller *nfc, int on)
+{
+ u32 val;
+
+ val = readl(nfc->base + CMD_OFST);
+ val &= ~DMA_EN_MASK;
+ if (on)
+ val |= DMA_ENABLE << DMA_EN_SHIFT;
+ writel(val, nfc->base + CMD_OFST);
+}
+
+static inline int anfc_wait_for_event(struct anfc_nand_controller *nfc)
+{
+ return wait_for_completion_timeout(&nfc->event,
+ msecs_to_jiffies(EVENT_TIMEOUT_MSEC));
+}
+
+static inline void anfc_setpktszcnt(struct anfc_nand_controller *nfc,
+ u32 pktsize, u32 pktcount)
+{
+ writel(pktsize | (pktcount << PKT_CNT_SHIFT), nfc->base + PKT_OFST);
+}
+
+static inline void anfc_set_eccsparecmd(struct anfc_nand_controller *nfc,
+ struct anfc_nand_chip *achip, u8 cmd1,
+ u8 cmd2)
+{
+ writel(cmd1 | (cmd2 << CMD2_SHIFT) |
+ (achip->spare_caddr_cycles << ADDR_CYCLES_SHIFT),
+ nfc->base + ECC_SPR_CMD_OFST);
+}
+
+static void anfc_setpagecoladdr(struct anfc_nand_controller *nfc, u32 page,
+ u16 col)
+{
+ u32 val;
+
+ writel(col | (page << PG_ADDR_SHIFT), nfc->base + MEM_ADDR1_OFST);
+
+ val = readl(nfc->base + MEM_ADDR2_OFST);
+ val = (val & ~MEM_ADDR_MASK) |
+ ((page >> PG_ADDR_SHIFT) & MEM_ADDR_MASK);
+ writel(val, nfc->base + MEM_ADDR2_OFST);
+}
+
+static void anfc_prepare_cmd(struct nand_chip *chip, u8 cmd1,
+ u8 cmd2, u8 dmamode,
+ u32 pagesize, u8 addrcycles)
+{
+ u32 regval;
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+
+ regval = cmd1 | (cmd2 << CMD2_SHIFT);
+ if (dmamode)
+ regval |= DMA_ENABLE << DMA_EN_SHIFT;
+ regval |= addrcycles << ADDR_CYCLES_SHIFT;
+ regval |= anfc_page(pagesize) << REG_PAGE_SIZE_SHIFT;
+ if (chip->ecc.mode == NAND_ECC_HW)
+ regval |= (nfc->eccval << 31);
+ writel(regval, nfc->base + CMD_OFST);
+}
+
+static void anfc_rw_dma_op(struct mtd_info *mtd, u8 *buf, int len,
+ bool do_read, u32 prog, int pktcount, int pktsize)
+{
+ dma_addr_t paddr;
+ struct nand_chip *chip = mtd_to_nand(mtd);
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ struct anfc_nand_chip *achip = to_anfc_nand(chip);
+ u32 eccintr = 0, dir;
+
+ if (pktsize == 0)
+ pktsize = len;
+
+ anfc_setpktszcnt(nfc, pktsize, pktcount);
+
+ if (!achip->strength)
+ eccintr = MBIT_ERROR;
+
+ if (do_read)
+ dir = DMA_FROM_DEVICE;
+ else
+ dir = DMA_TO_DEVICE;
+ paddr = dma_map_single(nfc->dev, buf, len, dir);
+ if (dma_mapping_error(nfc->dev, paddr)) {
+ dev_err(nfc->dev, "Read buffer mapping error");
+ return;
+ }
+ writel(paddr, nfc->base + DMA_ADDR0_OFST);
+ writel((paddr >> 32), nfc->base + DMA_ADDR1_OFST);
+ anfc_enable_intrs(nfc, (XFER_COMPLETE | eccintr));
+ writel(prog, nfc->base + PROG_OFST);
+ anfc_wait_for_event(nfc);
+ dma_unmap_single(nfc->dev, paddr, len, dir);
+}
+
+static void anfc_rw_pio_op(struct mtd_info *mtd, u8 *buf, int len,
+ bool do_read, int prog, int pktcount, int pktsize)
+{
+ struct nand_chip *chip = mtd_to_nand(mtd);
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ struct anfc_nand_chip *achip = to_anfc_nand(chip);
+ u32 *bufptr = (u32 *)buf;
+ u32 cnt = 0, intr = 0;
+
+ anfc_config_dma(nfc, 0);
+
+ if (pktsize == 0)
+ pktsize = len;
+
+ anfc_setpktszcnt(nfc, pktsize, pktcount);
+
+ if (!achip->strength)
+ intr = MBIT_ERROR;
+
+ if (do_read)
+ intr |= READ_READY;
+ else
+ intr |= WRITE_READY;
+ anfc_enable_intrs(nfc, intr);
+ writel(prog, nfc->base + PROG_OFST);
+ while (cnt < pktcount) {
+ anfc_wait_for_event(nfc);
+ cnt++;
+ if (cnt == pktcount)
+ anfc_enable_intrs(nfc, XFER_COMPLETE);
+ if (do_read)
+ ioread32_rep(nfc->base + DATA_PORT_OFST, bufptr,
+ pktsize / 4);
+ else
+ iowrite32_rep(nfc->base + DATA_PORT_OFST, bufptr,
+ pktsize / 4);
+ bufptr += (pktsize / 4);
+ if (cnt < pktcount)
+ anfc_enable_intrs(nfc, intr);
+ }
+ anfc_wait_for_event(nfc);
+}
+
+static void anfc_read_data_op(struct nand_chip *chip, u8 *buf, int len,
+ int pktcount, int pktsize)
+{
+ struct mtd_info *mtd = nand_to_mtd(chip);
+
+ if (virt_addr_valid(buf))
+ anfc_rw_dma_op(mtd, buf, len, 1, PROG_PGRD, pktcount, pktsize);
+ else
+ anfc_rw_pio_op(mtd, buf, len, 1, PROG_PGRD, pktcount, pktsize);
+}
+
+static void anfc_write_data_op(struct nand_chip *chip, const u8 *buf,
+ int len, int pktcount, int pktsize)
+{
+ struct mtd_info *mtd = nand_to_mtd(chip);
+
+ if (virt_addr_valid(buf))
+ anfc_rw_dma_op(mtd, (char *)buf, len, 0, PROG_PGPROG, pktcount,
+ pktsize);
+ else
+ anfc_rw_pio_op(mtd, (char *)buf, len, 0, PROG_PGPROG, pktcount,
+ pktsize);
+}
+
+static int anfc_read_page_hwecc(struct nand_chip *chip, u8 *buf,
+ int oob_required, int page)
+{
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ struct anfc_nand_chip *achip = to_anfc_nand(chip);
+ struct mtd_info *mtd = nand_to_mtd(chip);
+ u8 *ecc_code = chip->ecc.code_buf;
+ u8 *p;
+ int eccsize = chip->ecc.size;
+ int eccbytes = chip->ecc.bytes;
+ int stat = 0, i;
+ u32 ret;
+ unsigned int max_bitflips = 0;
+ u32 eccsteps = chip->ecc.steps;
+ u32 one_bit_err = 0, multi_bit_err = 0;
+
+ anfc_set_eccsparecmd(nfc, achip, NAND_CMD_RNDOUT, NAND_CMD_RNDOUTSTART);
+ anfc_config_ecc(nfc, true);
+
+ ret = nand_read_page_op(chip, page, 0, buf, mtd->writesize);
+ if (ret)
+ return ret;
+
+ anfc_config_ecc(nfc, false);
+ if (achip->strength) {
+ /*
+ * In BCH mode Arasan NAND controller can correct ECC upto
+ * 24-bit Beyond that, it can't even detect errors.
+ */
+ multi_bit_err = readl(nfc->base + ECC_ERR_CNT_OFST);
+ multi_bit_err = ((multi_bit_err & PAGE_ERR_CNT_MASK) >> 8);
+ } else {
+ /*
+ * In Hamming mode Arasan NAND controller can correct ECC upto
+ * 1-bit and can detect upto 2-bit errors.
+ */
+ one_bit_err = readl(nfc->base + ECC_ERR_CNT_1BIT_OFST);
+ multi_bit_err = readl(nfc->base + ECC_ERR_CNT_2BIT_OFST);
+ /* Clear ecc error count register 1Bit, 2Bit */
+ writel(0x0, nfc->base + ECC_ERR_CNT_1BIT_OFST);
+ writel(0x0, nfc->base + ECC_ERR_CNT_2BIT_OFST);
+ }
+
+ if (oob_required)
+ chip->ecc.read_oob(chip, page);
+
+ if (multi_bit_err >= MULTI_BIT_ERR_CNT) {
+ if (!oob_required)
+ chip->ecc.read_oob(chip, page);
+
+ mtd_ooblayout_get_eccbytes(mtd, ecc_code, chip->oob_poi, 0,
+ chip->ecc.total);
+ p = buf;
+ for (i = 0; eccsteps; eccsteps--, i += eccbytes,
+ p += eccsize) {
+ stat = nand_check_erased_ecc_chunk(p,
+ chip->ecc.size,
+ &ecc_code[i],
+ eccbytes,
+ NULL, 0,
+ chip->ecc.strength);
+ if (stat < 0) {
+ mtd->ecc_stats.failed++;
+ } else {
+ mtd->ecc_stats.corrected += stat;
+ max_bitflips = max_t(unsigned int, max_bitflips,
+ stat);
+ }
+ }
+ }
+
+ return max_bitflips;
+}
+
+static int anfc_write_page_hwecc(struct nand_chip *chip, const u8 *buf,
+ int oob_required, int page)
+{
+ int ret;
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ struct anfc_nand_chip *achip = to_anfc_nand(chip);
+ struct mtd_info *mtd = nand_to_mtd(chip);
+ u8 *ecc_calc = chip->ecc.calc_buf;
+
+ anfc_config_ecc(nfc, true);
+ anfc_set_eccsparecmd(nfc, achip, NAND_CMD_RNDIN, 0);
+ ret = nand_prog_page_op(chip, page, 0, buf, mtd->writesize);
+ if (ret)
+ return ret;
+
+ anfc_config_ecc(nfc, false);
+ if (oob_required) {
+ nand_read_oob_op(chip, page, 0, ecc_calc, mtd->oobsize);
+ ret = mtd_ooblayout_set_eccbytes(mtd, ecc_calc, chip->oob_poi,
+ 0, chip->ecc.total);
+ chip->ecc.write_oob(chip, page);
+ }
+
+ return 0;
+}
+
+static int anfc_ecc_init(struct mtd_info *mtd,
+ struct nand_ecc_ctrl *ecc, int ecc_mode)
+{
+ u32 ecc_addr;
+ unsigned int ecc_strength, steps;
+ struct nand_chip *chip = mtd_to_nand(mtd);
+ struct anfc_nand_chip *achip = to_anfc_nand(chip);
+
+ ecc->mode = NAND_ECC_HW;
+ ecc->read_page = anfc_read_page_hwecc;
+ ecc->write_page = anfc_write_page_hwecc;
+
+ mtd_set_ooblayout(mtd, &anfc_ooblayout_ops);
+ steps = mtd->writesize / chip->ecc_step_ds;
+
+ switch (chip->ecc_strength_ds) {
+ case 12:
+ ecc_strength = 0x1;
+ break;
+ case 8:
+ ecc_strength = 0x2;
+ break;
+ case 4:
+ ecc_strength = 0x3;
+ break;
+ case 24:
+ ecc_strength = 0x4;
+ break;
+ default:
+ ecc_strength = 0x0;
+ }
+ if (!ecc_strength)
+ ecc->total = 3 * steps;
+ else
+ ecc->total =
+ DIV_ROUND_UP(fls(8 * chip->ecc_step_ds) *
+ chip->ecc_strength_ds * steps, 8);
+ ecc->strength = chip->ecc_strength_ds;
+ ecc->size = chip->ecc_step_ds;
+ ecc->bytes = ecc->total / steps;
+ ecc->steps = steps;
+ achip->ecc_strength = ecc_strength;
+ achip->strength = achip->ecc_strength;
+ ecc_addr = mtd->writesize + (mtd->oobsize - ecc->total);
+ achip->eccval = ecc_addr | (ecc->total << ECC_SIZE_SHIFT) |
+ (achip->strength << BCH_EN_SHIFT);
+
+ if (chip->ecc_step_ds >= 1024)
+ achip->pktsize = 1024;
+ else
+ achip->pktsize = 512;
+
+ return 0;
+}
+
+/* NAND framework ->exec_op() hooks and related helpers */
+static void anfc_parse_instructions(struct nand_chip *chip,
+ const struct nand_subop *subop,
+ struct anfc_op *nfc_op)
+{
+ const struct nand_op_instr *instr = NULL;
+ unsigned int op_id;
+ int i = 0;
+
+ memset(nfc_op, 0, sizeof(struct anfc_op));
+ for (op_id = 0; op_id < subop->ninstrs; op_id++) {
+ instr = &subop->instrs[op_id];
+ switch (instr->type) {
+ case NAND_OP_CMD_INSTR:
+ if (op_id)
+ nfc_op->cmds[1] = instr->ctx.cmd.opcode;
+ else
+ nfc_op->cmds[0] = instr->ctx.cmd.opcode;
+ break;
+
+ case NAND_OP_ADDR_INSTR:
+ i = nand_subop_get_addr_start_off(subop, op_id);
+ nfc_op->naddrcycles = nand_subop_get_num_addr_cyc(subop,
+ op_id
+ );
+ for (; i < nfc_op->naddrcycles; i++) {
+ u8 val = instr->ctx.addr.addrs[i];
+
+ if (i < 2)
+ nfc_op->col |= COL_ROW_ADDR(i,
+ val);
+ else
+ nfc_op->row |= COL_ROW_ADDR(i -
+ 2, val);
+ }
+ break;
+ case NAND_OP_DATA_IN_INSTR:
+ nfc_op->data_instr = instr;
+ nfc_op->data_instr_idx = op_id;
+ break;
+ case NAND_OP_DATA_OUT_INSTR:
+ nfc_op->data_instr = instr;
+ nfc_op->data_instr_idx = op_id;
+ break;
+ case NAND_OP_WAITRDY_INSTR:
+ break;
+ }
+ }
+}
+
+static int anfc_reset_cmd_type_exec(struct nand_chip *chip,
+ const struct nand_subop *subop)
+{
+ struct anfc_op nfc_op = {};
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+
+ anfc_parse_instructions(chip, subop, &nfc_op);
+ anfc_prepare_cmd(chip, nfc_op.cmds[0], 0, 0, 0, 0);
+ anfc_enable_intrs(nfc, XFER_COMPLETE);
+ writel(PROG_RST, nfc->base + PROG_OFST);
+ anfc_wait_for_event(nfc);
+
+ return 0;
+}
+
+static int anfc_read_id_type_exec(struct nand_chip *chip,
+ const struct nand_subop *subop)
+{
+ const struct nand_op_instr *instr;
+ struct anfc_op nfc_op = {};
+ unsigned int op_id, len;
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ struct mtd_info *mtd = nand_to_mtd(chip);
+
+ anfc_parse_instructions(chip, subop, &nfc_op);
+ instr = nfc_op.data_instr;
+ op_id = nfc_op.data_instr_idx;
+ len = nand_subop_get_data_len(subop, op_id);
+ anfc_prepare_cmd(chip, nfc_op.cmds[0], 0, 0, 0, 1);
+ anfc_setpagecoladdr(nfc, nfc_op.row, nfc_op.col);
+ anfc_rw_pio_op(mtd, nfc->buf, roundup(len, 4), 1, PROG_RDID, 1, 0);
+ memcpy(instr->ctx.data.buf.in, nfc->buf, len);
+
+ return 0;
+}
+
+static int anfc_read_status_exec(struct nand_chip *chip,
+ const struct nand_subop *subop)
+{
+ const struct nand_op_instr *instr;
+ struct anfc_op nfc_op = {};
+ unsigned int op_id, len;
+ struct anfc_nand_chip *achip = to_anfc_nand(chip);
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+
+ anfc_parse_instructions(chip, subop, &nfc_op);
+ instr = nfc_op.data_instr;
+ op_id = nfc_op.data_instr_idx;
+
+ anfc_prepare_cmd(chip, nfc_op.cmds[0], 0, 0, 0, 0);
+ anfc_setpktszcnt(nfc, achip->spktsize / 4, 1);
+ anfc_setpagecoladdr(nfc, nfc_op.row, nfc_op.col);
+
+ anfc_enable_intrs(nfc, XFER_COMPLETE);
+ writel(PROG_STATUS, nfc->base + PROG_OFST);
+ anfc_wait_for_event(nfc);
+
+ if (!nfc_op.data_instr)
+ return 0;
+
+ len = nand_subop_get_data_len(subop, op_id);
+
+ /*
+ * The Arasan NAND controller will update the status value
+ * returned by the flash device in FLASH_STS register.
+ */
+ nfc->status = readl(nfc->base + FLASH_STS_OFST);
+ memcpy(instr->ctx.data.buf.in, &nfc->status, len);
+
+ return 0;
+}
+
+static int anfc_erase_type_exec(struct nand_chip *chip,
+ const struct nand_subop *subop)
+{
+ const struct nand_op_instr *instr;
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ struct anfc_op nfc_op = {};
+ u32 op_id;
+
+ anfc_parse_instructions(chip, subop, &nfc_op);
+ instr = nfc_op.data_instr;
+ op_id = nfc_op.data_instr_idx;
+
+ anfc_prepare_cmd(chip, nfc_op.cmds[0], nfc_op.cmds[1], 0,
+ 0, nfc_op.naddrcycles);
+ anfc_setpagecoladdr(nfc, nfc_op.row, nfc_op.col);
+ anfc_enable_intrs(nfc, XFER_COMPLETE);
+ writel(PROG_ERASE, nfc->base + PROG_OFST);
+ anfc_wait_for_event(nfc);
+
+ return 0;
+}
+
+static int anfc_read_param_get_feature_sp_read_type_exec(struct nand_chip *chip,
+ const struct nand_subop
+ *subop)
+{
+ const struct nand_op_instr *instr;
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ unsigned int op_id, len;
+ struct anfc_op nfc_op = {};
+ struct mtd_info *mtd = nand_to_mtd(chip);
+
+ anfc_parse_instructions(chip, subop, &nfc_op);
+ instr = nfc_op.data_instr;
+ op_id = nfc_op.data_instr_idx;
+
+ anfc_prepare_cmd(chip, nfc_op.cmds[0], 0, 1, mtd->writesize,
+ nfc_op.naddrcycles);
+ anfc_setpagecoladdr(nfc, nfc_op.row, nfc_op.col);
+
+ if (!nfc_op.data_instr)
+ return 0;
+
+ len = nand_subop_get_data_len(subop, op_id);
+ anfc_read_data_op(chip, nfc->buf, roundup(len, 4),
+ 1, 0);
+ memcpy(instr->ctx.data.buf.in, nfc->buf, len);
+
+ return 0;
+}
+
+static int anfc_random_datain_type_exec(struct nand_chip *chip,
+ const struct nand_subop *subop)
+{
+ const struct nand_op_instr *instr;
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ unsigned int op_id, len;
+ struct anfc_op nfc_op = {};
+ struct mtd_info *mtd = nand_to_mtd(chip);
+
+ anfc_parse_instructions(chip, subop, &nfc_op);
+ instr = nfc_op.data_instr;
+ op_id = nfc_op.data_instr_idx;
+
+ len = nand_subop_get_data_len(subop, op_id);
+ anfc_rw_pio_op(mtd, nfc->buf, roundup(len, 4), 1, PROG_PGRD, 1, 0);
+ memcpy(instr->ctx.data.buf.in, nfc->buf, len);
+
+ return 0;
+}
+
+static int anfc_setfeature_type_exec(struct nand_chip *chip,
+ const struct nand_subop *subop)
+{
+ const struct nand_op_instr *instr;
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ unsigned int op_id, len;
+ struct anfc_op nfc_op = {};
+ struct mtd_info *mtd = nand_to_mtd(chip);
+
+ anfc_parse_instructions(chip, subop, &nfc_op);
+ instr = nfc_op.data_instr;
+ op_id = nfc_op.data_instr_idx;
+
+ anfc_prepare_cmd(chip, nfc_op.cmds[0], 0, 0, 0, nfc_op.naddrcycles);
+ anfc_setpagecoladdr(nfc, nfc_op.row, nfc_op.col);
+
+ if (!nfc_op.data_instr)
+ return 0;
+
+ len = nand_subop_get_data_len(subop, op_id);
+ memcpy(nfc->buf, (char *)instr->ctx.data.buf.out, len);
+ anfc_rw_pio_op(mtd, nfc->buf, roundup(len, 4), 0, PROG_SET_FEATURE, 1,
+ 0);
+
+ return 0;
+}
+
+static int anfc_change_read_column_type_exec(struct nand_chip *chip,
+ const struct nand_subop *subop)
+{
+ const struct nand_op_instr *instr;
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ unsigned int op_id, len;
+ struct anfc_op nfc_op = {};
+ struct mtd_info *mtd = nand_to_mtd(chip);
+
+ anfc_parse_instructions(chip, subop, &nfc_op);
+ instr = nfc_op.data_instr;
+ op_id = nfc_op.data_instr_idx;
+
+ anfc_prepare_cmd(chip, nfc_op.cmds[0], nfc_op.cmds[1], 1,
+ mtd->writesize, nfc_op.naddrcycles);
+ anfc_setpagecoladdr(nfc, nfc_op.row, nfc_op.col);
+
+ if (!nfc_op.data_instr)
+ return 0;
+
+ len = nand_subop_get_data_len(subop, op_id);
+ anfc_rw_pio_op(mtd, nfc->buf, roundup(len, 4), 1, PROG_PGRD, 1, 0);
+ memcpy(instr->ctx.data.buf.in, nfc->buf, len);
+
+ return 0;
+}
+
+static int anfc_page_read_type_exec(struct nand_chip *chip,
+ const struct nand_subop *subop)
+{
+ const struct nand_op_instr *instr;
+ struct anfc_nand_chip *achip = to_anfc_nand(chip);
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ unsigned int op_id, len;
+ struct anfc_op nfc_op = {};
+ struct mtd_info *mtd = nand_to_mtd(chip);
+
+ anfc_parse_instructions(chip, subop, &nfc_op);
+ instr = nfc_op.data_instr;
+ op_id = nfc_op.data_instr_idx;
+ anfc_prepare_cmd(chip, nfc_op.cmds[0], nfc_op.cmds[1], 1,
+ mtd->writesize, nfc_op.naddrcycles);
+ anfc_setpagecoladdr(nfc, nfc_op.row, nfc_op.col);
+
+ if (!nfc_op.data_instr)
+ return 0;
+
+ len = nand_subop_get_data_len(subop, op_id);
+ anfc_read_data_op(chip, instr->ctx.data.buf.in, mtd->writesize,
+ DIV_ROUND_UP(mtd->writesize, achip->pktsize),
+ achip->pktsize);
+
+ return 0;
+}
+
+static int anfc_zero_len_page_write_type_exec(struct nand_chip *chip,
+ const struct nand_subop *subop)
+{
+ const struct nand_op_instr *instr;
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ unsigned int op_id;
+ struct anfc_op nfc_op = {};
+ struct mtd_info *mtd = nand_to_mtd(chip);
+
+ anfc_parse_instructions(chip, subop, &nfc_op);
+ instr = nfc_op.data_instr;
+ op_id = nfc_op.data_instr_idx;
+
+ anfc_prepare_cmd(chip, nfc_op.cmds[0], nfc_op.cmds[1], 1,
+ mtd->writesize, nfc_op.naddrcycles);
+ anfc_setpagecoladdr(nfc, nfc_op.row, nfc_op.col);
+
+ return 0;
+}
+
+static int anfc_page_write_type_exec(struct nand_chip *chip,
+ const struct nand_subop *subop)
+{
+ const struct nand_op_instr *instr;
+ struct anfc_nand_chip *achip = to_anfc_nand(chip);
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ unsigned int op_id, len;
+ struct anfc_op nfc_op = {};
+ struct mtd_info *mtd = nand_to_mtd(chip);
+
+ anfc_parse_instructions(chip, subop, &nfc_op);
+ instr = nfc_op.data_instr;
+ op_id = nfc_op.data_instr_idx;
+ anfc_prepare_cmd(chip, nfc_op.cmds[0], nfc_op.cmds[1], 1,
+ mtd->writesize, nfc_op.naddrcycles);
+ anfc_setpagecoladdr(nfc, nfc_op.row, nfc_op.col);
+ if (!nfc_op.data_instr)
+ return 0;
+
+ len = nand_subop_get_data_len(subop, op_id);
+ anfc_write_data_op(chip, (char *)instr->ctx.data.buf.out,
+ mtd->writesize,
+ DIV_ROUND_UP(mtd->writesize, achip->pktsize),
+ achip->pktsize);
+
+ return 0;
+}
+
+static const struct nand_op_parser anfc_op_parser = NAND_OP_PARSER(
+ /* Use a separate function for each pattern */
+ NAND_OP_PARSER_PATTERN(
+ anfc_random_datain_type_exec,
+ NAND_OP_PARSER_PAT_DATA_IN_ELEM(false, ANFC_MAX_CHUNK_SIZE)),
+ NAND_OP_PARSER_PATTERN(
+ anfc_change_read_column_type_exec,
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_ADDR_ELEM(false, ANFC_MAX_ADDR_CYCLES),
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_DATA_IN_ELEM(false, ANFC_MAX_CHUNK_SIZE)),
+ NAND_OP_PARSER_PATTERN(
+ anfc_page_read_type_exec,
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_ADDR_ELEM(false, ANFC_MAX_ADDR_CYCLES),
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_WAITRDY_ELEM(false),
+ NAND_OP_PARSER_PAT_DATA_IN_ELEM(false, ANFC_MAX_CHUNK_SIZE)),
+ NAND_OP_PARSER_PATTERN(
+ anfc_page_write_type_exec,
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_ADDR_ELEM(false, ANFC_MAX_ADDR_CYCLES),
+ NAND_OP_PARSER_PAT_DATA_OUT_ELEM(false, ANFC_MAX_CHUNK_SIZE),
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_WAITRDY_ELEM(true)),
+ NAND_OP_PARSER_PATTERN(
+ anfc_read_id_type_exec,
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_ADDR_ELEM(false, ANFC_MAX_ADDR_CYCLES),
+ NAND_OP_PARSER_PAT_DATA_IN_ELEM(false, ANFC_MAX_CHUNK_SIZE)),
+ NAND_OP_PARSER_PATTERN(
+ anfc_erase_type_exec,
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_ADDR_ELEM(false, ANFC_MAX_ADDR_CYCLES),
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_WAITRDY_ELEM(false)),
+ NAND_OP_PARSER_PATTERN(
+ anfc_read_status_exec,
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_DATA_IN_ELEM(false, 1)),
+ NAND_OP_PARSER_PATTERN(
+ anfc_reset_cmd_type_exec,
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_WAITRDY_ELEM(false)),
+ NAND_OP_PARSER_PATTERN(
+ anfc_setfeature_type_exec,
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_ADDR_ELEM(false, ANFC_MAX_ADDR_CYCLES),
+ NAND_OP_PARSER_PAT_DATA_OUT_ELEM(false, ANFC_MAX_CHUNK_SIZE),
+ NAND_OP_PARSER_PAT_WAITRDY_ELEM(true)),
+ NAND_OP_PARSER_PATTERN(
+ anfc_read_param_get_feature_sp_read_type_exec,
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_ADDR_ELEM(false, ANFC_MAX_ADDR_CYCLES),
+ NAND_OP_PARSER_PAT_WAITRDY_ELEM(false),
+ NAND_OP_PARSER_PAT_DATA_IN_ELEM(true, ANFC_MAX_CHUNK_SIZE)),
+ NAND_OP_PARSER_PATTERN(
+ anfc_zero_len_page_write_type_exec,
+ NAND_OP_PARSER_PAT_CMD_ELEM(false),
+ NAND_OP_PARSER_PAT_ADDR_ELEM(false, ANFC_MAX_ADDR_CYCLES)),
+ );
+
+static int anfc_exec_op(struct nand_chip *chip,
+ const struct nand_operation *op,
+ bool check_only)
+{
+ return nand_op_parser_exec_op(chip, &anfc_op_parser,
+ op, check_only);
+}
+
+static void anfc_select_chip(struct nand_chip *chip, int num)
+{
+ u32 val;
+ struct anfc_nand_chip *achip = to_anfc_nand(chip);
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+
+ if (num < 0)
+ return;
+
+ val = readl(nfc->base + MEM_ADDR2_OFST);
+ val &= (val & ~(CS_MASK | BCH_MODE_MASK));
+ val |= (achip->csnum << CS_SHIFT) |
+ (achip->ecc_strength << BCH_MODE_SHIFT);
+ writel(val, nfc->base + MEM_ADDR2_OFST);
+ nfc->csnum = achip->csnum;
+ writel(achip->eccval, nfc->base + ECC_OFST);
+ writel(achip->inftimeval, nfc->base + DATA_INTERFACE_OFST);
+}
+
+static irqreturn_t anfc_irq_handler(int irq, void *ptr)
+{
+ struct anfc_nand_controller *nfc = ptr;
+ u32 status;
+
+ status = readl(nfc->base + INTR_STS_OFST);
+ if (status & EVENT_MASK) {
+ complete(&nfc->event);
+ writel(status & EVENT_MASK, nfc->base + INTR_STS_OFST);
+ writel(0, nfc->base + INTR_STS_EN_OFST);
+ writel(0, nfc->base + INTR_SIG_EN_OFST);
+ return IRQ_HANDLED;
+ }
+
+ return IRQ_NONE;
+}
+
+static int anfc_setup_data_interface(struct nand_chip *chip, int csline,
+ const struct nand_data_interface *conf)
+{
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);
+ struct anfc_nand_chip *achip = to_anfc_nand(chip);
+ int err;
+ const struct nand_sdr_timings *sdr;
+ u32 inftimeval;
+ bool change_sdr_clk = false;
+
+ if (csline == NAND_DATA_IFACE_CHECK_ONLY)
+ return 0;
+
+ /*
+ * If the controller is already in the same mode as flash device
+ * then no need to change the timing mode again.
+ */
+ sdr = nand_get_sdr_timings(conf);
+ if (IS_ERR(sdr))
+ return PTR_ERR(sdr);
+
+ if (sdr->mode < 0)
+ return -ENOTSUPP;
+
+ inftimeval = sdr->mode & 7;
+ if (sdr->mode >= 2 && sdr->mode <= 5)
+ change_sdr_clk = true;
+ /*
+ * SDR timing modes 2-5 will not work for the arasan nand when
+ * freq > 90 MHz, so reduce the freq in SDR modes 2-5 to < 90Mhz
+ */
+ if (change_sdr_clk) {
+ clk_disable_unprepare(nfc->clk_sys);
+ err = clk_set_rate(nfc->clk_sys, SDR_MODE_DEFLT_FREQ);
+ if (err) {
+ dev_err(nfc->dev, "Can't set the clock rate\n");
+ return err;
+ }
+ err = clk_prepare_enable(nfc->clk_sys);
+ if (err) {
+ dev_err(nfc->dev, "Unable to enable sys clock.\n");
+ clk_disable_unprepare(nfc->clk_sys);
+ return err;
+ }
+ }
+ achip->inftimeval = inftimeval;
+
+ return 0;
+}
+
+static int anfc_nand_attach_chip(struct nand_chip *chip)
+{
+ struct mtd_info *mtd = nand_to_mtd(chip);
+ struct anfc_nand_chip *achip = to_anfc_nand(chip);
+ u32 ret;
+
+ if (mtd->writesize <= SZ_512)
+ achip->spare_caddr_cycles = 1;
+ else
+ achip->spare_caddr_cycles = 2;
+
+ chip->ecc.calc_buf = kmalloc(mtd->oobsize, GFP_KERNEL);
+ chip->ecc.code_buf = kmalloc(mtd->oobsize, GFP_KERNEL);
+ ret = anfc_ecc_init(mtd, &chip->ecc, chip->ecc.mode);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static const struct nand_controller_ops anfc_nand_controller_ops = {
+ .attach_chip = anfc_nand_attach_chip,
+};
+
+static int anfc_nand_chip_init(struct anfc_nand_controller *nfc,
+ struct anfc_nand_chip *anand_chip,
+ struct device_node *np)
+{
+ struct nand_chip *chip = &anand_chip->chip;
+ struct mtd_info *mtd = nand_to_mtd(chip);
+ int ret;
+
+ ret = of_property_read_u32(np, "reg", &anand_chip->csnum);
+ if (ret) {
+ dev_err(nfc->dev, "can't get chip-select\n");
+ return -ENXIO;
+ }
+ mtd->name = devm_kasprintf(nfc->dev, GFP_KERNEL, "arasan_nand.%d",
+ anand_chip->csnum);
+ mtd->dev.parent = nfc->dev;
+ chip->controller = &nfc->controller;
+ chip->options = NAND_BUSWIDTH_AUTO | NAND_NO_SUBPAGE_WRITE;
+ chip->bbt_options = NAND_BBT_USE_FLASH;
+ chip->select_chip = anfc_select_chip;
+ chip->setup_data_interface = anfc_setup_data_interface;
+ chip->exec_op = anfc_exec_op;
+ nand_set_flash_node(chip, np);
+
+ anand_chip->spktsize = SDR_MODE_PACKET_SIZE;
+
+ ret = nand_scan(chip, 1);
+ if (ret) {
+ dev_err(nfc->dev, "nand_scan_tail for NAND failed\n");
+ return ret;
+ }
+
+ return mtd_device_register(mtd, NULL, 0);
+}
+
+static int anfc_probe(struct platform_device *pdev)
+{
+ struct anfc_nand_controller *nfc;
+ struct anfc_nand_chip *anand_chip;
+ struct device_node *np = pdev->dev.of_node, *child;
+ struct resource *res;
+ int err;
+
+ nfc = devm_kzalloc(&pdev->dev, sizeof(*nfc), GFP_KERNEL);
+ if (!nfc)
+ return -ENOMEM;
+
+ nand_controller_init(&nfc->controller);
+ INIT_LIST_HEAD(&nfc->chips);
+ init_completion(&nfc->event);
+ nfc->dev = &pdev->dev;
+ platform_set_drvdata(pdev, nfc);
+ nfc->csnum = -1;
+ nfc->controller.ops = &anfc_nand_controller_ops;
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ nfc->base = devm_ioremap_resource(&pdev->dev, res);
+ if (IS_ERR(nfc->base))
+ return PTR_ERR(nfc->base);
+ nfc->irq = platform_get_irq(pdev, 0);
+ if (nfc->irq < 0) {
+ dev_err(&pdev->dev, "platform_get_irq failed\n");
+ return -ENXIO;
+ }
+ dma_set_mask(&pdev->dev, DMA_BIT_MASK(64));
+ err = devm_request_irq(&pdev->dev, nfc->irq, anfc_irq_handler,
+ 0, "arasannfc", nfc);
+ if (err)
+ return err;
+ nfc->clk_sys = devm_clk_get(&pdev->dev, "clk_sys");
+ if (IS_ERR(nfc->clk_sys)) {
+ dev_err(&pdev->dev, "sys clock not found.\n");
+ return PTR_ERR(nfc->clk_sys);
+ }
+
+ nfc->clk_flash = devm_clk_get(&pdev->dev, "clk_flash");
+ if (IS_ERR(nfc->clk_flash)) {
+ dev_err(&pdev->dev, "flash clock not found.\n");
+ return PTR_ERR(nfc->clk_flash);
+ }
+
+ err = clk_prepare_enable(nfc->clk_sys);
+ if (err) {
+ dev_err(&pdev->dev, "Unable to enable sys clock.\n");
+ return err;
+ }
+
+ err = clk_prepare_enable(nfc->clk_flash);
+ if (err) {
+ dev_err(&pdev->dev, "Unable to enable flash clock.\n");
+ goto clk_dis_sys;
+ }
+
+ for_each_available_child_of_node(np, child) {
+ anand_chip = devm_kzalloc(&pdev->dev, sizeof(*anand_chip),
+ GFP_KERNEL);
+ if (!anand_chip) {
+ of_node_put(child);
+ err = -ENOMEM;
+ goto nandchip_clean_up;
+ }
+ err = anfc_nand_chip_init(nfc, anand_chip, child);
+ if (err) {
+ devm_kfree(&pdev->dev, anand_chip);
+ continue;
+ }
+
+ list_add_tail(&anand_chip->node, &nfc->chips);
+ }
+ return 0;
+
+nandchip_clean_up:
+ list_for_each_entry(anand_chip, &nfc->chips, node)
+ nand_release(&anand_chip->chip);
+ clk_disable_unprepare(nfc->clk_flash);
+clk_dis_sys:
+ clk_disable_unprepare(nfc->clk_sys);
+
+ return err;
+}
+
+static int anfc_remove(struct platform_device *pdev)
+{
+ struct anfc_nand_controller *nfc = platform_get_drvdata(pdev);
+ struct anfc_nand_chip *anand_chip;
+
+ list_for_each_entry(anand_chip, &nfc->chips, node)
+ nand_release(&anand_chip->chip);
+
+ clk_disable_unprepare(nfc->clk_sys);
+ clk_disable_unprepare(nfc->clk_flash);
+
+ return 0;
+}
+
+static const struct of_device_id anfc_ids[] = {
+ { .compatible = "arasan,nfc-v3p10" },
+ { .compatible = "xlnx,zynqmp-nand" },
+ { }
+};
+MODULE_DEVICE_TABLE(of, anfc_ids);
+
+static struct platform_driver anfc_driver = {
+ .driver = {
+ .name = "arasan-nand-controller",
+ .of_match_table = anfc_ids,
+ },
+ .probe = anfc_probe,
+ .remove = anfc_remove,
+};
+module_platform_driver(anfc_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Xilinx, Inc");
+MODULE_DESCRIPTION("Arasan NAND Flash Controller Driver");
--
2.7.4
Hi Naga,
Just a preliminary review. I still think we have problems with how you
execute NAND operations. You seem to assume that read/write operation
are always page write/read operation with a size aligned on a page
size. This is wrong, either the controller is able to execute the exact
operation that has been requested or it returns -ENOTSUPP.
On Fri, 9 Nov 2018 10:30:41 +0530
Naga Sureshkumar Relli <[email protected]> wrote:
> +
> +/**
> + * struct anfc_nand_chip - Defines the nand chip related information
> + * @node: Used to store NAND chips into a list.
> + * @chip: NAND chip information structure.
> + * @strength: Bch or Hamming mode enable/disable.
The name is still confusing. BTW, can't you deduce Hamming vs BCH from
the strength? Hamming strength is 1, while BCH strengths seem to start
at 4.
> + * @ecc_strength: Ecc strength 4.8/12/16.
^/
> + * @eccval: Ecc config value.
> + * @spare_caddr_cycles: Column address cycle information for spare area.
> + * @pktsize: Packet size for read / write operation.
> + * @csnum: chipselect number to be used.
> + * @spktsize: Packet size in ddr mode for status operation.
> + * @inftimeval: Data interface and timing mode information
> + */
> +struct anfc_nand_chip {
> + struct list_head node;
> + struct nand_chip chip;
> + bool strength;
> + u32 ecc_strength;
> + u32 eccval;
> + u16 spare_caddr_cycles;
> + u32 pktsize;
> + int csnum;
> + u32 spktsize;
> + u32 inftimeval;
> +};
> +
> +/**
> + * struct anfc_nand_controller - Defines the Arasan NAND flash controller
> + * driver instance
> + * @controller: base controller structure.
> + * @chips: list of all nand chips attached to the ctrler.
> + * @dev: Pointer to the device structure.
> + * @base: Virtual address of the NAND flash device.
> + * @clk_sys: Pointer to the system clock.
> + * @clk_flash: Pointer to the flash clock.
> + * @dma: Dma enable/disable.
> + * @buf: Buffer used for read/write byte operations.
> + * @irq: irq number
You should need this field. Just get the irq num in you probe path an
register an irq handler with devm_request_irq().
> + * @bufshift: Variable used for indexing buffer operation
> + * @csnum: Chip select number currently inuse.
^ in use
> + * @event: Completion event for nand status events.
> + * @status: Status of the flash device.
> + * @prog: Used to initiate controller operations.
> + */
> +struct anfc_nand_controller {
> + struct nand_controller controller;
> + struct list_head chips;
> + struct device *dev;
> + void __iomem *base;
> + struct clk *clk_sys;
> + struct clk *clk_flash;
> + int irq;
> + int csnum;
> + struct completion event;
> + int status;
> + u8 buf[TEMP_BUF_SIZE];
Allocate this buf dynamically.
> + u32 eccval;
> +};
> +static int anfc_page_write_type_exec(struct nand_chip *chip,
> + const struct nand_subop *subop)
> +{
> + const struct nand_op_instr *instr;
> + struct anfc_nand_chip *achip = to_anfc_nand(chip);
> + struct anfc_nand_controller *nfc = to_anfc(chip->controller);
> + unsigned int op_id, len;
> + struct anfc_op nfc_op = {};
> + struct mtd_info *mtd = nand_to_mtd(chip);
> +
> + anfc_parse_instructions(chip, subop, &nfc_op);
> + instr = nfc_op.data_instr;
> + op_id = nfc_op.data_instr_idx;
> + anfc_prepare_cmd(chip, nfc_op.cmds[0], nfc_op.cmds[1], 1,
> + mtd->writesize, nfc_op.naddrcycles);
> + anfc_setpagecoladdr(nfc, nfc_op.row, nfc_op.col);
> + if (!nfc_op.data_instr)
> + return 0;
> +
> + len = nand_subop_get_data_len(subop, op_id);
> + anfc_write_data_op(chip, (char *)instr->ctx.data.buf.out,
^ extra white space here
and please drop the cast.
Can you please run checkpatch --strict prior to submitting patches?
> + mtd->writesize,
> + DIV_ROUND_UP(mtd->writesize, achip->pktsize),
No, that's wrong. You should use instr->ctx.data.len here, and the
DIV_ROUND_UP() thing is a bit scary. That means you might be writing
more data than requested.
> + achip->pktsize);
> +
> + return 0;
> +}
> +
> +
> +static int anfc_probe(struct platform_device *pdev)
> +{
> + struct anfc_nand_controller *nfc;
> + struct anfc_nand_chip *anand_chip;
> + struct device_node *np = pdev->dev.of_node, *child;
> + struct resource *res;
> + int err;
> +
> + nfc = devm_kzalloc(&pdev->dev, sizeof(*nfc), GFP_KERNEL);
> + if (!nfc)
> + return -ENOMEM;
> +
> + nand_controller_init(&nfc->controller);
> + INIT_LIST_HEAD(&nfc->chips);
> + init_completion(&nfc->event);
> + nfc->dev = &pdev->dev;
> + platform_set_drvdata(pdev, nfc);
> + nfc->csnum = -1;
> + nfc->controller.ops = &anfc_nand_controller_ops;
Add a blank line here please
> + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> + nfc->base = devm_ioremap_resource(&pdev->dev, res);
> + if (IS_ERR(nfc->base))
> + return PTR_ERR(nfc->base);
and here
> + nfc->irq = platform_get_irq(pdev, 0);
> + if (nfc->irq < 0) {
> + dev_err(&pdev->dev, "platform_get_irq failed\n");
> + return -ENXIO;
> + }
and here
> + dma_set_mask(&pdev->dev, DMA_BIT_MASK(64));
Is this really needed?
> + err = devm_request_irq(&pdev->dev, nfc->irq, anfc_irq_handler,
> + 0, "arasannfc", nfc);
> + if (err)
> + return err;
Add a blank line here too.
> + nfc->clk_sys = devm_clk_get(&pdev->dev, "clk_sys");
> + if (IS_ERR(nfc->clk_sys)) {
> + dev_err(&pdev->dev, "sys clock not found.\n");
> + return PTR_ERR(nfc->clk_sys);
> + }
> +
> + nfc->clk_flash = devm_clk_get(&pdev->dev, "clk_flash");
> + if (IS_ERR(nfc->clk_flash)) {
> + dev_err(&pdev->dev, "flash clock not found.\n");
> + return PTR_ERR(nfc->clk_flash);
> + }
> +
> + err = clk_prepare_enable(nfc->clk_sys);
> + if (err) {
> + dev_err(&pdev->dev, "Unable to enable sys clock.\n");
> + return err;
> + }
> +
> + err = clk_prepare_enable(nfc->clk_flash);
> + if (err) {
> + dev_err(&pdev->dev, "Unable to enable flash clock.\n");
> + goto clk_dis_sys;
> + }
> +
> + for_each_available_child_of_node(np, child) {
> + anand_chip = devm_kzalloc(&pdev->dev, sizeof(*anand_chip),
> + GFP_KERNEL);
> + if (!anand_chip) {
> + of_node_put(child);
> + err = -ENOMEM;
> + goto nandchip_clean_up;
> + }
and here.
> + err = anfc_nand_chip_init(nfc, anand_chip, child);
> + if (err) {
> + devm_kfree(&pdev->dev, anand_chip);
We usually try to avoid calling devm_kfree(). I guess you do it to
avoid keeping the chip obj around when init failed, but this should
be rare enough so we can actually ignore it and let the mem allocated
for the NFC lifetime.
> + continue;
> + }
> +
> + list_add_tail(&anand_chip->node, &nfc->chips);
> + }
> + return 0;
> +
> +nandchip_clean_up:
> + list_for_each_entry(anand_chip, &nfc->chips, node)
> + nand_release(&anand_chip->chip);
Blank line here.
> + clk_disable_unprepare(nfc->clk_flash);
Blank line here.
> +clk_dis_sys:
> + clk_disable_unprepare(nfc->clk_sys);
> +
> + return err;
> +}
Regards,
Boris
Hi Boris,
> -----Original Message-----
> From: Boris Brezillon [mailto:[email protected]]
> Sent: Friday, November 9, 2018 1:38 PM
> To: Naga Sureshkumar Relli <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; Michal Simek
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan NAND
> Flash Controller
>
> Hi Naga,
>
> Just a preliminary review. I still think we have problems with how you execute NAND
> operations. You seem to assume that read/write operation are always page write/read operation
> with a size aligned on a page size. This is wrong, either the controller is able to execute the
> exact operation that has been requested or it returns -ENOTSUPP.
Are you pointing the anfc_read_param_get_feature_sp_read_type_exec()?
Where I am reading a length of page size even for get_features op.
Is that you are pointing?
>
> On Fri, 9 Nov 2018 10:30:41 +0530
> Naga Sureshkumar Relli <[email protected]> wrote:
>
> > +
> > +/**
> > + * struct anfc_nand_chip - Defines the nand chip related information
> > + * @node: Used to store NAND chips into a list.
> > + * @chip: NAND chip information structure.
> > + * @strength: Bch or Hamming mode enable/disable.
>
> The name is still confusing. BTW, can't you deduce Hamming vs BCH from the strength?
> Hamming strength is 1, while BCH strengths seem to start at 4.
Yes we can deduce. I will try that.
.
>
> > + * @ecc_strength: Ecc strength 4.8/12/16.
>
> ^/
>
> > + * @eccval: Ecc config value.
> > + * @spare_caddr_cycles: Column address cycle information for spare area.
> > + * @pktsize: Packet size for read / write operation.
> > + * @csnum: chipselect number to be used.
> > + * @spktsize: Packet size in ddr mode for status operation.
> > + * @inftimeval: Data interface and timing mode information
> > + */
> > +struct anfc_nand_chip {
> > + struct list_head node;
> > + struct nand_chip chip;
> > + bool strength;
> > + u32 ecc_strength;
> > + u32 eccval;
> > + u16 spare_caddr_cycles;
> > + u32 pktsize;
> > + int csnum;
> > + u32 spktsize;
> > + u32 inftimeval;
> > +};
> > +
> > +/**
> > + * struct anfc_nand_controller - Defines the Arasan NAND flash controller
> > + * driver instance
> > + * @controller: base controller structure.
> > + * @chips: list of all nand chips attached to the ctrler.
> > + * @dev: Pointer to the device structure.
> > + * @base: Virtual address of the NAND flash device.
> > + * @clk_sys: Pointer to the system clock.
> > + * @clk_flash: Pointer to the flash clock.
> > + * @dma: Dma enable/disable.
> > + * @buf: Buffer used for read/write byte operations.
> > + * @irq: irq number
>
> You should need this field. Just get the irq num in you probe path an register an irq handler
> with devm_request_irq().
You mean to say, instead of putting irq in anfc_nand_controller structure, update the code like below snippet, right?
irq = platform_get_irq(pdev, 0);
if (irq < 0) {
dev_err(dev, "failed to retrieve irq\n");
return irq;
}
devm_request_irq(&pdev->dev, irq, anfc_irq_handler, 0, "arasannfc", nfc);
>
> > + * @bufshift: Variable used for indexing buffer operation
> > + * @csnum: Chip select number currently inuse.
>
> ^ in use
>
> > + * @event: Completion event for nand status events.
> > + * @status: Status of the flash device.
> > + * @prog: Used to initiate controller operations.
> > + */
> > +struct anfc_nand_controller {
> > + struct nand_controller controller;
> > + struct list_head chips;
> > + struct device *dev;
> > + void __iomem *base;
> > + struct clk *clk_sys;
> > + struct clk *clk_flash;
> > + int irq;
> > + int csnum;
> > + struct completion event;
> > + int status;
> > + u8 buf[TEMP_BUF_SIZE];
>
> Allocate this buf dynamically.
Ok
>
> > + u32 eccval;
> > +};
>
> > +static int anfc_page_write_type_exec(struct nand_chip *chip,
> > + const struct nand_subop *subop) {
> > + const struct nand_op_instr *instr;
> > + struct anfc_nand_chip *achip = to_anfc_nand(chip);
> > + struct anfc_nand_controller *nfc = to_anfc(chip->controller);
> > + unsigned int op_id, len;
> > + struct anfc_op nfc_op = {};
> > + struct mtd_info *mtd = nand_to_mtd(chip);
> > +
> > + anfc_parse_instructions(chip, subop, &nfc_op);
> > + instr = nfc_op.data_instr;
> > + op_id = nfc_op.data_instr_idx;
> > + anfc_prepare_cmd(chip, nfc_op.cmds[0], nfc_op.cmds[1], 1,
> > + mtd->writesize, nfc_op.naddrcycles);
> > + anfc_setpagecoladdr(nfc, nfc_op.row, nfc_op.col);
> > + if (!nfc_op.data_instr)
> > + return 0;
> > +
> > + len = nand_subop_get_data_len(subop, op_id);
> > + anfc_write_data_op(chip, (char *)instr->ctx.data.buf.out,
>
> ^ extra white space here
> and please drop the cast.
Ok
>
> Can you please run checkpatch --strict prior to submitting patches?
I ran it with --strict while submitting the patch, but I didn't find any warning.
Anyway I will correct it.
>
> > + mtd->writesize,
> > + DIV_ROUND_UP(mtd->writesize, achip->pktsize),
>
> No, that's wrong. You should use instr->ctx.data.len here, and the
> DIV_ROUND_UP() thing is a bit scary. That means you might be writing more data than
> requested.
Ok. Got it.
>
> > + achip->pktsize);
> > +
> > + return 0;
> > +}
> > +
>
> > +
> > +static int anfc_probe(struct platform_device *pdev) {
> > + struct anfc_nand_controller *nfc;
> > + struct anfc_nand_chip *anand_chip;
> > + struct device_node *np = pdev->dev.of_node, *child;
> > + struct resource *res;
> > + int err;
> > +
> > + nfc = devm_kzalloc(&pdev->dev, sizeof(*nfc), GFP_KERNEL);
> > + if (!nfc)
> > + return -ENOMEM;
> > +
> > + nand_controller_init(&nfc->controller);
> > + INIT_LIST_HEAD(&nfc->chips);
> > + init_completion(&nfc->event);
> > + nfc->dev = &pdev->dev;
> > + platform_set_drvdata(pdev, nfc);
> > + nfc->csnum = -1;
> > + nfc->controller.ops = &anfc_nand_controller_ops;
>
> Add a blank line here please
Ok, I will update
>
> > + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > + nfc->base = devm_ioremap_resource(&pdev->dev, res);
> > + if (IS_ERR(nfc->base))
> > + return PTR_ERR(nfc->base);
>
> and here
Ok, I will update
>
> > + nfc->irq = platform_get_irq(pdev, 0);
> > + if (nfc->irq < 0) {
> > + dev_err(&pdev->dev, "platform_get_irq failed\n");
> > + return -ENXIO;
> > + }
>
> and here
Ok, I will update
>
> > + dma_set_mask(&pdev->dev, DMA_BIT_MASK(64));
>
> Is this really needed?
Yes, As our DMA supports 64bit addressing. It is needed
>
> > + err = devm_request_irq(&pdev->dev, nfc->irq, anfc_irq_handler,
> > + 0, "arasannfc", nfc);
> > + if (err)
> > + return err;
>
> Add a blank line here too.
Ok, I will update
>
> > + nfc->clk_sys = devm_clk_get(&pdev->dev, "clk_sys");
> > + if (IS_ERR(nfc->clk_sys)) {
> > + dev_err(&pdev->dev, "sys clock not found.\n");
> > + return PTR_ERR(nfc->clk_sys);
> > + }
> > +
> > + nfc->clk_flash = devm_clk_get(&pdev->dev, "clk_flash");
> > + if (IS_ERR(nfc->clk_flash)) {
> > + dev_err(&pdev->dev, "flash clock not found.\n");
> > + return PTR_ERR(nfc->clk_flash);
> > + }
> > +
> > + err = clk_prepare_enable(nfc->clk_sys);
> > + if (err) {
> > + dev_err(&pdev->dev, "Unable to enable sys clock.\n");
> > + return err;
> > + }
> > +
> > + err = clk_prepare_enable(nfc->clk_flash);
> > + if (err) {
> > + dev_err(&pdev->dev, "Unable to enable flash clock.\n");
> > + goto clk_dis_sys;
> > + }
> > +
> > + for_each_available_child_of_node(np, child) {
> > + anand_chip = devm_kzalloc(&pdev->dev, sizeof(*anand_chip),
> > + GFP_KERNEL);
> > + if (!anand_chip) {
> > + of_node_put(child);
> > + err = -ENOMEM;
> > + goto nandchip_clean_up;
> > + }
>
> and here.
Ok, I will update
>
> > + err = anfc_nand_chip_init(nfc, anand_chip, child);
> > + if (err) {
> > + devm_kfree(&pdev->dev, anand_chip);
>
> We usually try to avoid calling devm_kfree(). I guess you do it to avoid keeping the chip obj
> around when init failed, but this should be rare enough so we can actually ignore it and let the
> mem allocated for the NFC lifetime.
Ok. I understood.
>
> > + continue;
> > + }
> > +
> > + list_add_tail(&anand_chip->node, &nfc->chips);
> > + }
> > + return 0;
> > +
> > +nandchip_clean_up:
> > + list_for_each_entry(anand_chip, &nfc->chips, node)
> > + nand_release(&anand_chip->chip);
>
> Blank line here.
Ok, I will update
>
> > + clk_disable_unprepare(nfc->clk_flash);
>
> Blank line here.
Ok, I will update
>
> > +clk_dis_sys:
> > + clk_disable_unprepare(nfc->clk_sys);
> > +
> > + return err;
> > +}
>
> Regards,
>
> Boris
Boris/Miquel, could you please review the remaining code as well, if you feel
There is still something to improve.
One concern I have is especially anfc_read_page_hwecc(), there I am checking erased pages bit flips.
Since Arasan NAND controller doesn't have multibit error detection beyond 24-bit( it can correct upto 24 bit errors),
I took some error count as default value(16, I put this based on the error count that I got while reading erased page on Micron device).
I will just read the error count register and compare this value with the default error count(16) and if it is more
Than default then I am checking for erased page bit flips.
I am doubting that this will not work in all cases.
In my case it is just working because the error count that it got on an erased page is 16.
Could you please suggest a way to do detect erased_page bit flips.
Thanks,
Naga Sureshkumar Relli
Hi Naga,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on mtd/nand/next]
[also build test WARNING on v4.20-rc1 next-20181109]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Naga-Sureshkumar-Relli/dt-bindings-mtd-arasan-Add-device-tree-binding-documentation/20181110-034106
base: git://git.infradead.org/linux-mtd.git nand/next
config: sh-allyesconfig (attached as .config)
compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=sh
All warnings (new ones prefixed by >>):
In file included from include/linux/scatterlist.h:9:0,
from include/linux/dma-mapping.h:11,
from drivers/mtd//nand/raw/arasan_nand.c:12:
drivers/mtd//nand/raw/arasan_nand.c: In function 'anfc_rw_dma_op':
>> drivers/mtd//nand/raw/arasan_nand.c:353:16: warning: right shift count >= width of type [-Wshift-count-overflow]
writel((paddr >> 32), nfc->base + DMA_ADDR1_OFST);
^
arch/sh/include/asm/io.h:31:77: note: in definition of macro '__raw_writel'
#define __raw_writel(v,a) (__chk_io_ptr(a), *(volatile u32 __force *)(a) = (v))
^
arch/sh/include/asm/io.h:46:62: note: in expansion of macro 'ioswabl'
#define writel_relaxed(v,c) ((void)__raw_writel((__force u32)ioswabl(v),c))
^~~~~~~
arch/sh/include/asm/io.h:56:32: note: in expansion of macro 'writel_relaxed'
#define writel(v,a) ({ wmb(); writel_relaxed((v),(a)); })
^~~~~~~~~~~~~~
>> drivers/mtd//nand/raw/arasan_nand.c:353:2: note: in expansion of macro 'writel'
writel((paddr >> 32), nfc->base + DMA_ADDR1_OFST);
^~~~~~
vim +353 drivers/mtd//nand/raw/arasan_nand.c
325
326 static void anfc_rw_dma_op(struct mtd_info *mtd, u8 *buf, int len,
327 bool do_read, u32 prog, int pktcount, int pktsize)
328 {
329 dma_addr_t paddr;
330 struct nand_chip *chip = mtd_to_nand(mtd);
331 struct anfc_nand_controller *nfc = to_anfc(chip->controller);
332 struct anfc_nand_chip *achip = to_anfc_nand(chip);
333 u32 eccintr = 0, dir;
334
335 if (pktsize == 0)
336 pktsize = len;
337
338 anfc_setpktszcnt(nfc, pktsize, pktcount);
339
340 if (!achip->strength)
341 eccintr = MBIT_ERROR;
342
343 if (do_read)
344 dir = DMA_FROM_DEVICE;
345 else
346 dir = DMA_TO_DEVICE;
347 paddr = dma_map_single(nfc->dev, buf, len, dir);
348 if (dma_mapping_error(nfc->dev, paddr)) {
349 dev_err(nfc->dev, "Read buffer mapping error");
350 return;
351 }
352 writel(paddr, nfc->base + DMA_ADDR0_OFST);
> 353 writel((paddr >> 32), nfc->base + DMA_ADDR1_OFST);
354 anfc_enable_intrs(nfc, (XFER_COMPLETE | eccintr));
355 writel(prog, nfc->base + PROG_OFST);
356 anfc_wait_for_event(nfc);
357 dma_unmap_single(nfc->dev, paddr, len, dir);
358 }
359
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
On Mon, 12 Nov 2018 11:55:36 +0100
Martin Lund <[email protected]> wrote:
> Hi Naga,
>
> Just a few review comments for the v12 version.
>
> On Fri, Nov 9, 2018 at 6:00 AM Naga Sureshkumar Relli
> <[email protected]> wrote:
> > +#define PKT_OFST 0x00
> > +#define PKT_CNT_SHIFT 12
> > +
> > +#define MEM_ADDR1_OFST 0x04
> > +#define MEM_ADDR2_OFST 0x08
>
> For the sake of readability I think *_OFFSET is preferred, especially
> since the driver already includes short macro names. I think this is
> similar to the EVNT vs EVENT point.
> The majority of drivers in the Linux kernel do not shorten OFFSET -> OFST.
>
>
> > +static void anfc_rw_pio_op(struct mtd_info *mtd, u8 *buf, int len,
> > + bool do_read, int prog, int pktcount, int pktsize)
> > +{
> > + struct nand_chip *chip = mtd_to_nand(mtd);
> > + struct anfc_nand_controller *nfc = to_anfc(chip->controller);
> > + struct anfc_nand_chip *achip = to_anfc_nand(chip);
> > + u32 *bufptr = (u32 *)buf;
> > + u32 cnt = 0, intr = 0;
> > +
> > + anfc_config_dma(nfc, 0);
> > +
> > + if (pktsize == 0)
> > + pktsize = len;
> > +
> > + anfc_setpktszcnt(nfc, pktsize, pktcount);
> > +
> > + if (!achip->strength)
> > + intr = MBIT_ERROR;
> > +
> > + if (do_read)
> > + intr |= READ_READY;
> > + else
> > + intr |= WRITE_READY;
> > + anfc_enable_intrs(nfc, intr);
> > + writel(prog, nfc->base + PROG_OFST);
> > + while (cnt < pktcount) {
> > + anfc_wait_for_event(nfc);
> > + cnt++;
> > + if (cnt == pktcount)
> > + anfc_enable_intrs(nfc, XFER_COMPLETE);
> > + if (do_read)
> > + ioread32_rep(nfc->base + DATA_PORT_OFST, bufptr,
> > + pktsize / 4);
> > + else
> > + iowrite32_rep(nfc->base + DATA_PORT_OFST, bufptr,
> > + pktsize / 4);
> > + bufptr += (pktsize / 4);
> > + if (cnt < pktcount)
> > + anfc_enable_intrs(nfc, intr);
> > + }
> > + anfc_wait_for_event(nfc);
> > +}
>
> Throughout the driver all calls to anfc_wait_for_event() ignores the
> timeout return value. It would be nice to see some error handling in
> case it times out - at minimum consider printing out an error message
> since timeout on NAND operations are fairly critical and should
> generally not occur. Perhaps even an argument can be made for
> returning -ETIMEDOUT in case of timeout.
Yes please, check anfc_wait_for_event() return code and propagate the
error to the upper layer.
Hi Naga,
Just a few review comments for the v12 version.
On Fri, Nov 9, 2018 at 6:00 AM Naga Sureshkumar Relli
<[email protected]> wrote:
> +#define PKT_OFST 0x00
> +#define PKT_CNT_SHIFT 12
> +
> +#define MEM_ADDR1_OFST 0x04
> +#define MEM_ADDR2_OFST 0x08
For the sake of readability I think *_OFFSET is preferred, especially
since the driver already includes short macro names. I think this is
similar to the EVNT vs EVENT point.
The majority of drivers in the Linux kernel do not shorten OFFSET -> OFST.
> +static void anfc_rw_pio_op(struct mtd_info *mtd, u8 *buf, int len,
> + bool do_read, int prog, int pktcount, int pktsize)
> +{
> + struct nand_chip *chip = mtd_to_nand(mtd);
> + struct anfc_nand_controller *nfc = to_anfc(chip->controller);
> + struct anfc_nand_chip *achip = to_anfc_nand(chip);
> + u32 *bufptr = (u32 *)buf;
> + u32 cnt = 0, intr = 0;
> +
> + anfc_config_dma(nfc, 0);
> +
> + if (pktsize == 0)
> + pktsize = len;
> +
> + anfc_setpktszcnt(nfc, pktsize, pktcount);
> +
> + if (!achip->strength)
> + intr = MBIT_ERROR;
> +
> + if (do_read)
> + intr |= READ_READY;
> + else
> + intr |= WRITE_READY;
> + anfc_enable_intrs(nfc, intr);
> + writel(prog, nfc->base + PROG_OFST);
> + while (cnt < pktcount) {
> + anfc_wait_for_event(nfc);
> + cnt++;
> + if (cnt == pktcount)
> + anfc_enable_intrs(nfc, XFER_COMPLETE);
> + if (do_read)
> + ioread32_rep(nfc->base + DATA_PORT_OFST, bufptr,
> + pktsize / 4);
> + else
> + iowrite32_rep(nfc->base + DATA_PORT_OFST, bufptr,
> + pktsize / 4);
> + bufptr += (pktsize / 4);
> + if (cnt < pktcount)
> + anfc_enable_intrs(nfc, intr);
> + }
> + anfc_wait_for_event(nfc);
> +}
Throughout the driver all calls to anfc_wait_for_event() ignores the
timeout return value. It would be nice to see some error handling in
case it times out - at minimum consider printing out an error message
since timeout on NAND operations are fairly critical and should
generally not occur. Perhaps even an argument can be made for
returning -ETIMEDOUT in case of timeout.
I'm putting a focus on this because I see the original non-upstream
Arasan driver sometimes timeout on NAND operations when I stress test
it via the UBI stress test. Not sure what the cause for the timeout is
yet but either way an error message would have been helpful.
Br, Martin
Hi Boris & Martin,
> -----Original Message-----
> From: Boris Brezillon [mailto:[email protected]]
> Sent: Monday, November 12, 2018 4:28 PM
> To: Martin Lund <[email protected]>
> Cc: Naga Sureshkumar Relli <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Michal Simek <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected]
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan NAND
> Flash Controller
>
> On Mon, 12 Nov 2018 11:55:36 +0100
> Martin Lund <[email protected]> wrote:
>
> > Hi Naga,
> >
> > Just a few review comments for the v12 version.
> >
> > On Fri, Nov 9, 2018 at 6:00 AM Naga Sureshkumar Relli
> > <[email protected]> wrote:
> > > +#define PKT_OFST 0x00
> > > +#define PKT_CNT_SHIFT 12
> > > +
> > > +#define MEM_ADDR1_OFST 0x04
> > > +#define MEM_ADDR2_OFST 0x08
> >
> > For the sake of readability I think *_OFFSET is preferred, especially
> > since the driver already includes short macro names. I think this is
> > similar to the EVNT vs EVENT point.
> > The majority of drivers in the Linux kernel do not shorten OFFSET -> OFST.
> >
> >
> > > +static void anfc_rw_pio_op(struct mtd_info *mtd, u8 *buf, int len,
> > > + bool do_read, int prog, int pktcount, int
> > > +pktsize) {
> > > + struct nand_chip *chip = mtd_to_nand(mtd);
> > > + struct anfc_nand_controller *nfc = to_anfc(chip->controller);
> > > + struct anfc_nand_chip *achip = to_anfc_nand(chip);
> > > + u32 *bufptr = (u32 *)buf;
> > > + u32 cnt = 0, intr = 0;
> > > +
> > > + anfc_config_dma(nfc, 0);
> > > +
> > > + if (pktsize == 0)
> > > + pktsize = len;
> > > +
> > > + anfc_setpktszcnt(nfc, pktsize, pktcount);
> > > +
> > > + if (!achip->strength)
> > > + intr = MBIT_ERROR;
> > > +
> > > + if (do_read)
> > > + intr |= READ_READY;
> > > + else
> > > + intr |= WRITE_READY;
> > > + anfc_enable_intrs(nfc, intr);
> > > + writel(prog, nfc->base + PROG_OFST);
> > > + while (cnt < pktcount) {
> > > + anfc_wait_for_event(nfc);
> > > + cnt++;
> > > + if (cnt == pktcount)
> > > + anfc_enable_intrs(nfc, XFER_COMPLETE);
> > > + if (do_read)
> > > + ioread32_rep(nfc->base + DATA_PORT_OFST, bufptr,
> > > + pktsize / 4);
> > > + else
> > > + iowrite32_rep(nfc->base + DATA_PORT_OFST, bufptr,
> > > + pktsize / 4);
> > > + bufptr += (pktsize / 4);
> > > + if (cnt < pktcount)
> > > + anfc_enable_intrs(nfc, intr);
> > > + }
> > > + anfc_wait_for_event(nfc);
> > > +}
> >
> > Throughout the driver all calls to anfc_wait_for_event() ignores the
> > timeout return value. It would be nice to see some error handling in
> > case it times out - at minimum consider printing out an error message
> > since timeout on NAND operations are fairly critical and should
> > generally not occur. Perhaps even an argument can be made for
> > returning -ETIMEDOUT in case of timeout.
>
> Yes please, check anfc_wait_for_event() return code and propagate the error to the upper layer.
Ok, I will update
Thanks,
Naga Sureshkumar Relli
Hi Boris & Miquel,
I am updating the driver by addressing your comments, and I have one concern, especially in anfc_read_page_hwecc(),
there I am checking for erased pages bit flips.
Since Arasan NAND controller doesn't have multibit error detection beyond 24-bit( it can correct up to 24 bit),
i.e. there is no indication from controller to detect uncorrectable error beyond 24bit.
So I took some error count as default value(MULTI_BIT_ERR_CNT 16, I put this based on the error count that
I got while reading erased page on Micron device).
And during a page read, will just read the error count register and compare this value with the default error count(16) and if it is more
Than default then I am checking for erased page bit flips.
I am doubting that this will not work in all cases.
In my case it is just working because the error count that it got on an erased page is 16.
Could you please suggest a way to do detect erased_page bit flips when reading a page with HW-ECC?.
> +
> +/*
> + * Arasan NAND controller can't detect errors beyond 24-bit in BCH
> + * For an erased page we observed that multibit error count as 16
> + * with 24-bit ECC. so if the count is equal to or greater than 16
> + * then we can say that its an uncorrectable ECC error.
> + */
> +#define MULTI_BIT_ERR_CNT 16
> +
>
> +}
> +
> +static int anfc_read_page_hwecc(struct nand_chip *chip, u8 *buf,
> + int oob_required, int page)
> +{
> + struct anfc_nand_controller *nfc = to_anfc(chip->controller);
> + struct anfc_nand_chip *achip = to_anfc_nand(chip);
> + struct mtd_info *mtd = nand_to_mtd(chip);
> + u8 *ecc_code = chip->ecc.code_buf;
> + u8 *p;
> + int eccsize = chip->ecc.size;
> + int eccbytes = chip->ecc.bytes;
> + int stat = 0, i;
> + u32 ret;
> + unsigned int max_bitflips = 0;
> + u32 eccsteps = chip->ecc.steps;
> + u32 one_bit_err = 0, multi_bit_err = 0;
> +
> + anfc_set_eccsparecmd(nfc, achip, NAND_CMD_RNDOUT,
> NAND_CMD_RNDOUTSTART);
> + anfc_config_ecc(nfc, true);
> +
> + ret = nand_read_page_op(chip, page, 0, buf, mtd->writesize);
> + if (ret)
> + return ret;
> +
> + anfc_config_ecc(nfc, false);
> + if (achip->strength) {
> + /*
> + * In BCH mode Arasan NAND controller can correct ECC upto
> + * 24-bit Beyond that, it can't even detect errors.
> + */
> + multi_bit_err = readl(nfc->base + ECC_ERR_CNT_OFST);
> + multi_bit_err = ((multi_bit_err & PAGE_ERR_CNT_MASK) >> 8);
> + } else {
> + /*
> + * In Hamming mode Arasan NAND controller can correct ECC upto
> + * 1-bit and can detect upto 2-bit errors.
> + */
> + one_bit_err = readl(nfc->base + ECC_ERR_CNT_1BIT_OFST);
> + multi_bit_err = readl(nfc->base + ECC_ERR_CNT_2BIT_OFST);
> + /* Clear ecc error count register 1Bit, 2Bit */
> + writel(0x0, nfc->base + ECC_ERR_CNT_1BIT_OFST);
> + writel(0x0, nfc->base + ECC_ERR_CNT_2BIT_OFST);
> + }
> +
> + if (oob_required)
> + chip->ecc.read_oob(chip, page);
> +
> + if (multi_bit_err >= MULTI_BIT_ERR_CNT) {
> + if (!oob_required)
> + chip->ecc.read_oob(chip, page);
> +
> + mtd_ooblayout_get_eccbytes(mtd, ecc_code, chip->oob_poi, 0,
> + chip->ecc.total);
> + p = buf;
> + for (i = 0; eccsteps; eccsteps--, i += eccbytes,
> + p += eccsize) {
> + stat = nand_check_erased_ecc_chunk(p,
> + chip->ecc.size,
> + &ecc_code[i],
> + eccbytes,
> + NULL, 0,
> + chip->ecc.strength);
> + if (stat < 0) {
> + mtd->ecc_stats.failed++;
> + } else {
> + mtd->ecc_stats.corrected += stat;
> + max_bitflips = max_t(unsigned int, max_bitflips,
> + stat);
> + }
> + }
> + }
> +
> + return max_bitflips;
> +}
> +
> +
Thanks,
Naga Sureshkumar Relli.
Hi Naga,
Thank you for the patch! Perhaps something to improve:
url: https://github.com/0day-ci/linux/commits/Naga-Sureshkumar-Relli/dt-bindings-mtd-arasan-Add-device-tree-binding-documentation/20181110-034106
base: git://git.infradead.org/linux-mtd.git nand/next
smatch warnings:
drivers/mtd/nand/raw/arasan_nand.c:353 anfc_rw_dma_op() warn: right shifting more than type allows 32 vs 32
drivers/mtd/nand/raw/arasan_nand.c:1032 anfc_setup_data_interface() warn: unsigned 'sdr->mode' is never less than zero.
# https://github.com/0day-ci/linux/commit/8db68ae6047a392d3e4092cbd6d3051eec1edd47
git remote add linux-review https://github.com/0day-ci/linux
git remote update linux-review
git checkout 8db68ae6047a392d3e4092cbd6d3051eec1edd47
vim +353 drivers/mtd/nand/raw/arasan_nand.c
8db68ae6 Naga Sureshkumar Relli 2018-11-09 325
8db68ae6 Naga Sureshkumar Relli 2018-11-09 326 static void anfc_rw_dma_op(struct mtd_info *mtd, u8 *buf, int len,
8db68ae6 Naga Sureshkumar Relli 2018-11-09 327 bool do_read, u32 prog, int pktcount, int pktsize)
8db68ae6 Naga Sureshkumar Relli 2018-11-09 328 {
8db68ae6 Naga Sureshkumar Relli 2018-11-09 329 dma_addr_t paddr;
8db68ae6 Naga Sureshkumar Relli 2018-11-09 330 struct nand_chip *chip = mtd_to_nand(mtd);
8db68ae6 Naga Sureshkumar Relli 2018-11-09 331 struct anfc_nand_controller *nfc = to_anfc(chip->controller);
8db68ae6 Naga Sureshkumar Relli 2018-11-09 332 struct anfc_nand_chip *achip = to_anfc_nand(chip);
8db68ae6 Naga Sureshkumar Relli 2018-11-09 333 u32 eccintr = 0, dir;
8db68ae6 Naga Sureshkumar Relli 2018-11-09 334
8db68ae6 Naga Sureshkumar Relli 2018-11-09 335 if (pktsize == 0)
8db68ae6 Naga Sureshkumar Relli 2018-11-09 336 pktsize = len;
8db68ae6 Naga Sureshkumar Relli 2018-11-09 337
8db68ae6 Naga Sureshkumar Relli 2018-11-09 338 anfc_setpktszcnt(nfc, pktsize, pktcount);
8db68ae6 Naga Sureshkumar Relli 2018-11-09 339
8db68ae6 Naga Sureshkumar Relli 2018-11-09 340 if (!achip->strength)
8db68ae6 Naga Sureshkumar Relli 2018-11-09 341 eccintr = MBIT_ERROR;
8db68ae6 Naga Sureshkumar Relli 2018-11-09 342
8db68ae6 Naga Sureshkumar Relli 2018-11-09 343 if (do_read)
8db68ae6 Naga Sureshkumar Relli 2018-11-09 344 dir = DMA_FROM_DEVICE;
8db68ae6 Naga Sureshkumar Relli 2018-11-09 345 else
8db68ae6 Naga Sureshkumar Relli 2018-11-09 346 dir = DMA_TO_DEVICE;
8db68ae6 Naga Sureshkumar Relli 2018-11-09 347 paddr = dma_map_single(nfc->dev, buf, len, dir);
8db68ae6 Naga Sureshkumar Relli 2018-11-09 348 if (dma_mapping_error(nfc->dev, paddr)) {
8db68ae6 Naga Sureshkumar Relli 2018-11-09 349 dev_err(nfc->dev, "Read buffer mapping error");
8db68ae6 Naga Sureshkumar Relli 2018-11-09 350 return;
8db68ae6 Naga Sureshkumar Relli 2018-11-09 351 }
8db68ae6 Naga Sureshkumar Relli 2018-11-09 352 writel(paddr, nfc->base + DMA_ADDR0_OFST);
8db68ae6 Naga Sureshkumar Relli 2018-11-09 @353 writel((paddr >> 32), nfc->base + DMA_ADDR1_OFST);
^^^^^^^^^^^^^
This is zero. Which is probably intended and fine... I was hoping it
would have the other line 1032 warning in the email...
8db68ae6 Naga Sureshkumar Relli 2018-11-09 354 anfc_enable_intrs(nfc, (XFER_COMPLETE | eccintr));
8db68ae6 Naga Sureshkumar Relli 2018-11-09 355 writel(prog, nfc->base + PROG_OFST);
8db68ae6 Naga Sureshkumar Relli 2018-11-09 356 anfc_wait_for_event(nfc);
8db68ae6 Naga Sureshkumar Relli 2018-11-09 357 dma_unmap_single(nfc->dev, paddr, len, dir);
8db68ae6 Naga Sureshkumar Relli 2018-11-09 358 }
8db68ae6 Naga Sureshkumar Relli 2018-11-09 359
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
Hi Naga,
Naga Sureshkumar Relli <[email protected]> wrote on Thu, 15 Nov 2018
09:34:16 +0000:
> Hi Boris & Miquel,
>
> I am updating the driver by addressing your comments, and I have one concern, especially in anfc_read_page_hwecc(),
> there I am checking for erased pages bit flips.
> Since Arasan NAND controller doesn't have multibit error detection beyond 24-bit( it can correct up to 24 bit),
> i.e. there is no indication from controller to detect uncorrectable error beyond 24bit.
> So I took some error count as default value(MULTI_BIT_ERR_CNT 16, I put this based on the error count that
> I got while reading erased page on Micron device).
> And during a page read, will just read the error count register and compare this value with the default error count(16) and if it is more
> Than default then I am checking for erased page bit flips.
> I am doubting that this will not work in all cases.
> In my case it is just working because the error count that it got on an erased page is 16.
> Could you please suggest a way to do detect erased_page bit flips when reading a page with HW-ECC?.
So the ECC engine is broken by design.
I think you should determine a number of bitflips (16 looks nice to me)
over which you declare the page bad anyway.
Now, this is generic logic: anytime a page is declared bad, you should
re-read the page in raw mode and check for the number of bitflips
manually (thanks to the helpers in the core). Again, if the number of BF
is above 16, we can assume the page is bad and increment ->ecc.failed
accordingly.
Thanks,
Miquèl
On Thu, 15 Nov 2018 09:34:16 +0000
Naga Sureshkumar Relli <[email protected]> wrote:
> Hi Boris & Miquel,
>
> I am updating the driver by addressing your comments, and I have one concern, especially in anfc_read_page_hwecc(),
> there I am checking for erased pages bit flips.
> Since Arasan NAND controller doesn't have multibit error detection beyond 24-bit( it can correct up to 24 bit),
> i.e. there is no indication from controller to detect uncorrectable error beyond 24bit.
Do you mean that you can't detect uncorrectable errors, or just that
it's not 100% sure to detect errors above max_strength?
> So I took some error count as default value(MULTI_BIT_ERR_CNT 16, I put this based on the error count that
> I got while reading erased page on Micron device).
> And during a page read, will just read the error count register and compare this value with the default error count(16) and if it is more
> Than default then I am checking for erased page bit flips.
Hm, that's wrong, especially if you set ecc_strength to something > 16.
> I am doubting that this will not work in all cases.
It definitely doesn't.
> In my case it is just working because the error count that it got on an erased page is 16.
> Could you please suggest a way to do detect erased_page bit flips when reading a page with HW-ECC?.
I'm a bit lost. Is the problem only about bitflips in erase pages, or
is it also impacting reads of written pages that lead to uncorrectable
errors.
Don't you have a bit (or several bits) reporting when the ECC engine
was not able to correct data? I you do, you should base the "detect
bitflips in erase pages" logic on this information.
H Boris,
> -----Original Message-----
> From: Boris Brezillon [mailto:[email protected]]
> Sent: Monday, November 19, 2018 1:13 AM
> To: Naga Sureshkumar Relli <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected]; Michal Simek
> <[email protected]>
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan NAND
> Flash Controller
>
> On Thu, 15 Nov 2018 09:34:16 +0000
> Naga Sureshkumar Relli <[email protected]> wrote:
>
> > Hi Boris & Miquel,
> >
> > I am updating the driver by addressing your comments, and I have one
> > concern, especially in anfc_read_page_hwecc(), there I am checking for erased pages bit flips.
> > Since Arasan NAND controller doesn't have multibit error detection
> > beyond 24-bit( it can correct up to 24 bit), i.e. there is no indication from controller to detect
> uncorrectable error beyond 24bit.
>
> Do you mean that you can't detect uncorrectable errors, or just that it's not 100% sure to detect
> errors above max_strength?
Yes, in Arasan NAND controller there is no way to detect uncorrectable errors beyond 24-bit.
>
> > So I took some error count as default value(MULTI_BIT_ERR_CNT 16, I
> > put this based on the error count that I got while reading erased page on Micron device).
> > And during a page read, will just read the error count register and
> > compare this value with the default error count(16) and if it is more Than default then I am
> checking for erased page bit flips.
>
> Hm, that's wrong, especially if you set ecc_strength to something > 16.
Ok
>
> > I am doubting that this will not work in all cases.
>
> It definitely doesn't.
Ok
>
> > In my case it is just working because the error count that it got on an erased page is 16.
> > Could you please suggest a way to do detect erased_page bit flips when reading a page with
> HW-ECC?.
>
> I'm a bit lost. Is the problem only about bitflips in erase pages, or is it also impacting reads of
> written pages that lead to uncorrectable errors.
Yes, it is for both. But in case of read errors that we can't detect beyond 24-bit, then the answer from HW design team
Is that the flash part is bad.
Unfortunately till now we haven't ran into that situation(read errors of written pages beyond 24-bit).
But we are hitting this because of erased page reading(needed in case of ubifs).
>
> Don't you have a bit (or several bits) reporting when the ECC engine was not able to correct
> data? I you do, you should base the "detect bitflips in erase pages" logic on this information.
Bit reporting for several bit errors is there only for Hamming(1bit correction and 2bit detection) but not in BCH.
Thanks,
Naga Sureshkumar Relli
On Mon, 19 Nov 2018 06:20:28 +0000
Naga Sureshkumar Relli <[email protected]> wrote:
> H Boris,
>
> > -----Original Message-----
> > From: Boris Brezillon [mailto:[email protected]]
> > Sent: Monday, November 19, 2018 1:13 AM
> > To: Naga Sureshkumar Relli <[email protected]>
> > Cc: [email protected]; [email protected]; [email protected];
> > [email protected]; [email protected]; [email protected]; linux-
> > [email protected]; [email protected]; [email protected]; Michal Simek
> > <[email protected]>
> > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan NAND
> > Flash Controller
> >
> > On Thu, 15 Nov 2018 09:34:16 +0000
> > Naga Sureshkumar Relli <[email protected]> wrote:
> >
> > > Hi Boris & Miquel,
> > >
> > > I am updating the driver by addressing your comments, and I have one
> > > concern, especially in anfc_read_page_hwecc(), there I am checking for erased pages bit flips.
> > > Since Arasan NAND controller doesn't have multibit error detection
> > > beyond 24-bit( it can correct up to 24 bit), i.e. there is no indication from controller to detect
> > uncorrectable error beyond 24bit.
> >
> > Do you mean that you can't detect uncorrectable errors, or just that it's not 100% sure to detect
> > errors above max_strength?
> Yes, in Arasan NAND controller there is no way to detect uncorrectable errors beyond 24-bit.
So how do you detect uncorrectable errors when the strength is less than
24bits?
> >
> > > So I took some error count as default value(MULTI_BIT_ERR_CNT 16, I
> > > put this based on the error count that I got while reading erased page on Micron device).
> > > And during a page read, will just read the error count register and
> > > compare this value with the default error count(16) and if it is more Than default then I am
> > checking for erased page bit flips.
> >
> > Hm, that's wrong, especially if you set ecc_strength to something > 16.
> Ok
> >
> > > I am doubting that this will not work in all cases.
> >
> > It definitely doesn't.
> Ok
> >
> > > In my case it is just working because the error count that it got on an erased page is 16.
> > > Could you please suggest a way to do detect erased_page bit flips when reading a page with
> > HW-ECC?.
> >
> > I'm a bit lost. Is the problem only about bitflips in erase pages, or is it also impacting reads of
> > written pages that lead to uncorrectable errors.
> Yes, it is for both. But in case of read errors that we can't detect beyond 24-bit, then the answer from HW design team
> Is that the flash part is bad.
> Unfortunately till now we haven't ran into that situation(read errors of written pages beyond 24-bit).
Can you please run nandbiterrs (availaible in mtd-utils). I fear your
device won't pass the test.
> But we are hitting this because of erased page reading(needed in case of ubifs).
>
> >
> > Don't you have a bit (or several bits) reporting when the ECC engine was not able to correct
> > data? I you do, you should base the "detect bitflips in erase pages" logic on this information.
> Bit reporting for several bit errors is there only for Hamming(1bit correction and 2bit detection) but not in BCH.
>
Then I tend to agree with Miquel: your ECC engine is broken, and I'm
not even sure how to deal with that yet.
Hi Boris,
> -----Original Message-----
> From: Boris Brezillon [mailto:[email protected]]
> Sent: Monday, November 19, 2018 1:33 PM
> To: Naga Sureshkumar Relli <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected]; Michal Simek
> <[email protected]>
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan NAND
> Flash Controller
>
> On Mon, 19 Nov 2018 06:20:28 +0000
> Naga Sureshkumar Relli <[email protected]> wrote:
>
> > H Boris,
> >
> > > -----Original Message-----
> > > From: Boris Brezillon [mailto:[email protected]]
> > > Sent: Monday, November 19, 2018 1:13 AM
> > > To: Naga Sureshkumar Relli <[email protected]>
> > > Cc: [email protected]; [email protected]; [email protected];
> > > [email protected]; [email protected];
> > > [email protected]; linux- [email protected];
> > > [email protected]; [email protected]; Michal Simek
> > > <[email protected]>
> > > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support
> > > for Arasan NAND Flash Controller
> > >
> > > On Thu, 15 Nov 2018 09:34:16 +0000
> > > Naga Sureshkumar Relli <[email protected]> wrote:
> > >
> > > > Hi Boris & Miquel,
> > > >
> > > > I am updating the driver by addressing your comments, and I have
> > > > one concern, especially in anfc_read_page_hwecc(), there I am checking for erased pages
> bit flips.
> > > > Since Arasan NAND controller doesn't have multibit error detection
> > > > beyond 24-bit( it can correct up to 24 bit), i.e. there is no
> > > > indication from controller to detect
> > > uncorrectable error beyond 24bit.
> > >
> > > Do you mean that you can't detect uncorrectable errors, or just that
> > > it's not 100% sure to detect errors above max_strength?
> > Yes, in Arasan NAND controller there is no way to detect uncorrectable errors beyond 24-
> bit.
>
> So how do you detect uncorrectable errors when the strength is less than
> 24bits?
Below or equal to the level of ECC strength, controller will definitely correct.
But beyond the level of ECC strength, it won't even detect the errors.
>
> > >
> > > > So I took some error count as default value(MULTI_BIT_ERR_CNT 16, I
> > > > put this based on the error count that I got while reading erased page on Micron device).
> > > > And during a page read, will just read the error count register and
> > > > compare this value with the default error count(16) and if it is more Than default then I
> am
> > > checking for erased page bit flips.
> > >
> > > Hm, that's wrong, especially if you set ecc_strength to something > 16.
> > Ok
> > >
> > > > I am doubting that this will not work in all cases.
> > >
> > > It definitely doesn't.
> > Ok
> > >
> > > > In my case it is just working because the error count that it got on an erased page is 16.
> > > > Could you please suggest a way to do detect erased_page bit flips when reading a page
> with
> > > HW-ECC?.
> > >
> > > I'm a bit lost. Is the problem only about bitflips in erase pages, or is it also impacting reads
> of
> > > written pages that lead to uncorrectable errors.
> > Yes, it is for both. But in case of read errors that we can't detect beyond 24-bit, then the
> answer from HW design team
> > Is that the flash part is bad.
> > Unfortunately till now we haven't ran into that situation(read errors of written pages beyond
> 24-bit).
>
> Can you please run nandbiterrs (availaible in mtd-utils). I fear your
> device won't pass the test.
Yes, nandbiterror test is passing till 24bit, after that it is failing.
>
> > But we are hitting this because of erased page reading(needed in case of ubifs).
> >
> > >
> > > Don't you have a bit (or several bits) reporting when the ECC engine was not able to
> correct
> > > data? I you do, you should base the "detect bitflips in erase pages" logic on this information.
> > Bit reporting for several bit errors is there only for Hamming(1bit correction and 2bit
> detection) but not in BCH.
> >
>
> Then I tend to agree with Miquel: your ECC engine is broken, and I'm
> not even sure how to deal with that yet.
So as per the Miquel's suggestion, can I proceed to add the below one?
"you should re-read the page in raw mode and check for the number of bitflips manually (thanks to the helpers in the core). Again, if the number of BF is above 16, we can assume the page is bad and increment ->ecc.failed accordingly."
Thanks,
Naga Sureshkumar Relli
Hi Naga,
Boris Brezillon <[email protected]> wrote on Tue, 20 Nov 2018
12:02:44 +0100:
> On Tue, 20 Nov 2018 07:02:08 +0000
> Naga Sureshkumar Relli <[email protected]> wrote:
>
>
> > >
> > > Can you please run nandbiterrs (availaible in mtd-utils). I fear your
> > > device won't pass the test.
> > Yes, nandbiterror test is passing till 24bit, after that it is failing.
>
> Can you paste the output of nandbiterrs please?
Apparently 'nandbiterrs -i 'just crashes the kernel because of a
segmentation fault. Please run this test (from the mtd-utils package)
and fix this issue. Then we would like to see the output.
>
> > >
> > > > But we are hitting this because of erased page reading(needed in case of ubifs).
> > > >
> > > > >
> > > > > Don't you have a bit (or several bits) reporting when the ECC engine was not able to
> > > correct
> > > > > data? I you do, you should base the "detect bitflips in erase pages" logic on this information.
> > > > Bit reporting for several bit errors is there only for Hamming(1bit correction and 2bit
> > > detection) but not in BCH.
> > > >
> > >
> > > Then I tend to agree with Miquel: your ECC engine is broken, and I'm
> > > not even sure how to deal with that yet.
> > So as per the Miquel's suggestion, can I proceed to add the below one?
> > "you should re-read the page in raw mode and check for the number of bitflips manually (thanks to the helpers in the core). Again, if the number of BF is above 16, we can assume the page is bad and increment ->ecc.failed accordingly."
>
> But that's just partially fixing the problem. And you didn't answer my
> previous question: what happens when you configure the ECC engine in,
> say 12bit/1024 and you end up with uncorrectable errors (more than 12
> bitflips in a 1k block). What's the number reported ECC_ERR_CNT? Is it
> set to 13?
Please dump this register, and eventually what's the value of the
Packet_bound_Err_count field ([0:7]) for each iteration of nandbiterrs -i.
If there is no way, when the status bit is set, to discriminate if the
data is reliable or was not corrected at all, it is gonna be a real
issue and I don't think we want to support such engine.
Thanks,
Miquèl
On Tue, 20 Nov 2018 07:02:08 +0000
Naga Sureshkumar Relli <[email protected]> wrote:
> >
> > Can you please run nandbiterrs (availaible in mtd-utils). I fear your
> > device won't pass the test.
> Yes, nandbiterror test is passing till 24bit, after that it is failing.
Can you paste the output of nandbiterrs please?
> >
> > > But we are hitting this because of erased page reading(needed in case of ubifs).
> > >
> > > >
> > > > Don't you have a bit (or several bits) reporting when the ECC engine was not able to
> > correct
> > > > data? I you do, you should base the "detect bitflips in erase pages" logic on this information.
> > > Bit reporting for several bit errors is there only for Hamming(1bit correction and 2bit
> > detection) but not in BCH.
> > >
> >
> > Then I tend to agree with Miquel: your ECC engine is broken, and I'm
> > not even sure how to deal with that yet.
> So as per the Miquel's suggestion, can I proceed to add the below one?
> "you should re-read the page in raw mode and check for the number of bitflips manually (thanks to the helpers in the core). Again, if the number of BF is above 16, we can assume the page is bad and increment ->ecc.failed accordingly."
But that's just partially fixing the problem. And you didn't answer my
previous question: what happens when you configure the ECC engine in,
say 12bit/1024 and you end up with uncorrectable errors (more than 12
bitflips in a 1k block). What's the number reported ECC_ERR_CNT? Is it
set to 13?
On Fri, 9 Nov 2018 10:30:41 +0530
Naga Sureshkumar Relli <[email protected]> wrote:
> +static int anfc_setup_data_interface(struct nand_chip *chip, int csline,
> + const struct nand_data_interface *conf)
> +{
> + struct anfc_nand_controller *nfc = to_anfc(chip->controller);
> + struct anfc_nand_chip *achip = to_anfc_nand(chip);
> + int err;
> + const struct nand_sdr_timings *sdr;
> + u32 inftimeval;
> + bool change_sdr_clk = false;
> +
> + if (csline == NAND_DATA_IFACE_CHECK_ONLY)
> + return 0;
> +
> + /*
> + * If the controller is already in the same mode as flash device
> + * then no need to change the timing mode again.
> + */
> + sdr = nand_get_sdr_timings(conf);
> + if (IS_ERR(sdr))
> + return PTR_ERR(sdr);
> +
> + if (sdr->mode < 0)
> + return -ENOTSUPP;
> +
> + inftimeval = sdr->mode & 7;
> + if (sdr->mode >= 2 && sdr->mode <= 5)
> + change_sdr_clk = true;
> + /*
> + * SDR timing modes 2-5 will not work for the arasan nand when
> + * freq > 90 MHz, so reduce the freq in SDR modes 2-5 to < 90Mhz
What's the freq for mode 0 and 1?
> + */
> + if (change_sdr_clk) {
> + clk_disable_unprepare(nfc->clk_sys);
> + err = clk_set_rate(nfc->clk_sys, SDR_MODE_DEFLT_FREQ);
You should not change the clk rate here. It should be done when the
chip is selected, so that, if you ever have 2 different chips connected
to the same controller, you can adapt the rate when they are accessed.
> + if (err) {
> + dev_err(nfc->dev, "Can't set the clock rate\n");
> + return err;
> + }
> + err = clk_prepare_enable(nfc->clk_sys);
> + if (err) {
> + dev_err(nfc->dev, "Unable to enable sys clock.\n");
> + clk_disable_unprepare(nfc->clk_sys);
> + return err;
> + }
> + }
> + achip->inftimeval = inftimeval;
> +
> + return 0;
> +}
> +
> +static int anfc_nand_attach_chip(struct nand_chip *chip)
> +{
> + struct mtd_info *mtd = nand_to_mtd(chip);
> + struct anfc_nand_chip *achip = to_anfc_nand(chip);
> + u32 ret;
> +
> + if (mtd->writesize <= SZ_512)
> + achip->spare_caddr_cycles = 1;
> + else
> + achip->spare_caddr_cycles = 2;
> +
> + chip->ecc.calc_buf = kmalloc(mtd->oobsize, GFP_KERNEL);
> + chip->ecc.code_buf = kmalloc(mtd->oobsize, GFP_KERNEL);
Those bufs are allocated but never freed (memleak). Also, are you sure
you really need them.
Hi Boris & Miquel,
> -----Original Message-----
> From: Miquel Raynal [mailto:[email protected]]
> Sent: Tuesday, November 20, 2018 6:06 PM
> To: Boris Brezillon <[email protected]>
> Cc: Naga Sureshkumar Relli <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]; Michal Simek <[email protected]>
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> NAND Flash Controller
>
> Hi Naga,
>
> Boris Brezillon <[email protected]> wrote on Tue, 20 Nov 2018
> 12:02:44 +0100:
>
> > On Tue, 20 Nov 2018 07:02:08 +0000
> > Naga Sureshkumar Relli <[email protected]> wrote:
> >
> >
> > > >
> > > > Can you please run nandbiterrs (availaible in mtd-utils). I fear your
> > > > device won't pass the test.
> > > Yes, nandbiterror test is passing till 24bit, after that it is failing.
> >
> > Can you paste the output of nandbiterrs please?
>
> Apparently 'nandbiterrs -i 'just crashes the kernel because of a segmentation fault. Please run
> this test (from the mtd-utils package) and fix this issue. Then we would like to see the output.
Here is the output of mtd_nandbiterrs,
[ 1830.546807] mtd_nandbiterrs: verify_page
[ 1830.551924] mtd_nandbiterrs: Successfully corrected 8 bit errors per subpage
[ 1830.558961] mtd_nandbiterrs: Inserted biterror @ 2/5
[ 1830.563917] mtd_nandbiterrs: rewrite page
[ 1830.568216] mtd_nandbiterrs: read_page
[ 1830.572155] mtd_nandbiterrs: verify_page
[ 1830.576531] mtd_nandbiterrs: Successfully corrected 9 bit errors per subpage
[ 1830.583568] mtd_nandbiterrs: Inserted biterror @ 2/2
[ 1830.588527] mtd_nandbiterrs: rewrite page
[ 1830.592881] mtd_nandbiterrs: read_page
[ 1830.596825] mtd_nandbiterrs: verify_page
[ 1830.601197] mtd_nandbiterrs: Successfully corrected 10 bit errors per subpage
[ 1830.608326] mtd_nandbiterrs: Inserted biterror @ 2/0
[ 1830.613279] mtd_nandbiterrs: rewrite page
[ 1830.617585] mtd_nandbiterrs: read_page
[ 1830.621524] mtd_nandbiterrs: verify_page
[ 1830.625900] mtd_nandbiterrs: Successfully corrected 11 bit errors per subpage
[ 1830.633027] mtd_nandbiterrs: Inserted biterror @ 3/7
[ 1830.637984] mtd_nandbiterrs: rewrite page
[ 1830.642281] mtd_nandbiterrs: read_page
[ 1830.646223] mtd_nandbiterrs: verify_page
[ 1830.650595] mtd_nandbiterrs: Successfully corrected 12 bit errors per subpage
[ 1830.657724] mtd_nandbiterrs: Inserted biterror @ 3/6
[ 1830.662677] mtd_nandbiterrs: rewrite page
[ 1830.666983] mtd_nandbiterrs: read_page
[ 1830.670922] mtd_nandbiterrs: verify_page
[ 1830.675296] mtd_nandbiterrs: Successfully corrected 13 bit errors per subpage
[ 1830.682417] mtd_nandbiterrs: Inserted biterror @ 3/5
[ 1830.687373] mtd_nandbiterrs: rewrite page
[ 1830.691671] mtd_nandbiterrs: read_page
[ 1830.695610] mtd_nandbiterrs: verify_page
[ 1830.699983] mtd_nandbiterrs: Successfully corrected 14 bit errors per subpage
[ 1830.707113] mtd_nandbiterrs: Inserted biterror @ 3/2
[ 1830.712067] mtd_nandbiterrs: rewrite page
[ 1830.716494] mtd_nandbiterrs: read_page
[ 1830.720459] mtd_nandbiterrs: verify_page
[ 1830.724841] mtd_nandbiterrs: Successfully corrected 15 bit errors per subpage
[ 1830.731963] mtd_nandbiterrs: Inserted biterror @ 3/0
[ 1830.736920] mtd_nandbiterrs: rewrite page
[ 1830.741161] mtd_nandbiterrs: read_page
[ 1830.745107] mtd_nandbiterrs: verify_page
[ 1830.749478] mtd_nandbiterrs: Successfully corrected 16 bit errors per subpage
[ 1830.756607] mtd_nandbiterrs: Inserted biterror @ 4/2
[ 1830.761564] mtd_nandbiterrs: rewrite page
[ 1830.765924] mtd_nandbiterrs: read_page
[ 1830.769858] mtd_nandbiterrs: verify_page
[ 1830.774232] mtd_nandbiterrs: Successfully corrected 17 bit errors per subpage
[ 1830.781362] mtd_nandbiterrs: Inserted biterror @ 4/0
[ 1830.786318] mtd_nandbiterrs: rewrite page
[ 1830.790558] mtd_nandbiterrs: read_page
[ 1830.794496] mtd_nandbiterrs: verify_page
[ 1830.798867] mtd_nandbiterrs: Successfully corrected 18 bit errors per subpage
[ 1830.805997] mtd_nandbiterrs: Inserted biterror @ 5/7
[ 1830.810949] mtd_nandbiterrs: rewrite page
[ 1830.815249] mtd_nandbiterrs: read_page
[ 1830.819189] mtd_nandbiterrs: verify_page
[ 1830.823561] mtd_nandbiterrs: Successfully corrected 19 bit errors per subpage
[ 1830.830690] mtd_nandbiterrs: Inserted biterror @ 5/2
[ 1830.835646] mtd_nandbiterrs: rewrite page
[ 1830.839943] mtd_nandbiterrs: read_page
[ 1830.843886] mtd_nandbiterrs: verify_page
[ 1830.848252] mtd_nandbiterrs: Successfully corrected 20 bit errors per subpage
[ 1830.855378] mtd_nandbiterrs: Inserted biterror @ 5/0
[ 1830.860331] mtd_nandbiterrs: rewrite page
[ 1830.864580] mtd_nandbiterrs: read_page
[ 1830.868522] mtd_nandbiterrs: verify_page
[ 1830.872890] mtd_nandbiterrs: Successfully corrected 21 bit errors per subpage
[ 1830.880023] mtd_nandbiterrs: Inserted biterror @ 6/6
[ 1830.884975] mtd_nandbiterrs: rewrite page
[ 1830.889224] mtd_nandbiterrs: read_page
[ 1830.893158] mtd_nandbiterrs: verify_page
[ 1830.897536] mtd_nandbiterrs: Successfully corrected 22 bit errors per subpage
[ 1830.904663] mtd_nandbiterrs: Inserted biterror @ 6/2
[ 1830.909619] mtd_nandbiterrs: rewrite page
[ 1830.913950] mtd_nandbiterrs: read_page
[ 1830.917893] mtd_nandbiterrs: verify_page
[ 1830.922261] mtd_nandbiterrs: Successfully corrected 23 bit errors per subpage
[ 1830.929384] mtd_nandbiterrs: Inserted biterror @ 6/0
[ 1830.934340] mtd_nandbiterrs: rewrite page
[ 1830.938579] mtd_nandbiterrs: read_page
[ 1830.942519] mtd_nandbiterrs: verify_page
[ 1830.946884] mtd_nandbiterrs: Successfully corrected 24 bit errors per subpage
[ 1830.954010] mtd_nandbiterrs: Inserted biterror @ 7/7
[ 1830.958963] mtd_nandbiterrs: rewrite page
[ 1830.963264] mtd_nandbiterrs: read_page
[ 1830.967143] mtd_nandbiterrs: verify_page
[ 1830.971061] mtd_nandbiterrs: Error: page offset 0, expected 25, got 00
[ 1830.977584] mtd_nandbiterrs: Error: page offset 1, expected a5, got 00
[ 1830.984103] mtd_nandbiterrs: Error: page offset 2, expected 65, got 00
[ 1830.990621] mtd_nandbiterrs: Error: page offset 3, expected e5, got 00
[ 1830.997141] mtd_nandbiterrs: Error: page offset 4, expected 05, got 00
[ 1831.003659] mtd_nandbiterrs: Error: page offset 5, expected 85, got 00
[ 1831.010179] mtd_nandbiterrs: Error: page offset 6, expected 45, got 00
[ 1831.016695] mtd_nandbiterrs: Error: page offset 7, expected c5, got 45
[ 1831.023665] mtd_nandbiterrs: ECC failure, read data is incorrect despite read success
modprobe: can't load module mtd_nandbiterrs (kernel/drivers/mtd/tests/mtd_nandbiterrs.ko): Input/output error
---> Test fail, unable to start nand_mtd_nandbiterrs client on the target
I ran this on v12 series, but it didn't work straight away. I changed the code to make it work for this test.
I found one problem that, the driver will work always if the page programming sequence 0x80 followed by 0x10.
i.e.
[1]:nand_prog_page_op(chip, page, 0, buf, mtd->writesize)-> this op sequence is working with this driver.
[2]: nand_prog_page_begin_op(chip, page, 0, NULL, 0) -> this op sequence is not working with this driver.
The Arasan NAND controller is expecting 0x80 as first opcode and 0x10 as second opcode in the command register (off: 0xFF10000C).
Hence in v11 series, I have added a check such that if the command is 0x080, then hardcode the second command as 0x10.
But as per the Boris comments, I removed that hardcoding and it is working only if the write sequence is [1] as mentioned above.
>
> >
> > > >
> > > > > But we are hitting this because of erased page reading(needed in case of ubifs).
> > > > >
> > > > > >
> > > > > > Don't you have a bit (or several bits) reporting when the ECC engine was not
> able to
> > > > correct
> > > > > > data? I you do, you should base the "detect bitflips in erase pages" logic on this
> information.
> > > > > Bit reporting for several bit errors is there only for Hamming(1bit correction and
> 2bit
> > > > detection) but not in BCH.
> > > > >
> > > >
> > > > Then I tend to agree with Miquel: your ECC engine is broken, and I'm
> > > > not even sure how to deal with that yet.
> > > So as per the Miquel's suggestion, can I proceed to add the below one?
> > > "you should re-read the page in raw mode and check for the number of bitflips manually
> (thanks to the helpers in the core). Again, if the number of BF is above 16, we can assume the
> page is bad and increment ->ecc.failed accordingly."
> >
> > But that's just partially fixing the problem. And you didn't answer my
> > previous question: what happens when you configure the ECC engine in,
> > say 12bit/1024 and you end up with uncorrectable errors (more than 12
> > bitflips in a 1k block). What's the number reported ECC_ERR_CNT? Is it
> > set to 13?
>
> Please dump this register, and eventually what's the value of the Packet_bound_Err_count
> field ([0:7]) for each iteration of nandbiterrs -i.
> If there is no way, when the status bit is set, to discriminate if the data is reliable or was not
> corrected at all, it is gonna be a real issue and I don't think we want to support such engine.
On each iteration the error count value that I got during this test, is equal to the number of error bits introduced
i.e. for 1-bit error, the error count is 1
.......
24-bit errors, the error count is 24
But after that the error count is 0.
Thanks,
Naga Sureshkumar Relli
>
>
> Thanks,
> Miquèl
Hi Boris,
> -----Original Message-----
> From: Boris Brezillon [mailto:[email protected]]
> Sent: Tuesday, November 20, 2018 9:55 PM
> To: Naga Sureshkumar Relli <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; Michal Simek
> <[email protected]>; [email protected]; [email protected]; linux-
> [email protected]; [email protected]
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> NAND Flash Controller
>
> On Fri, 9 Nov 2018 10:30:41 +0530
> Naga Sureshkumar Relli <[email protected]> wrote:
>
> > +static int anfc_setup_data_interface(struct nand_chip *chip, int csline,
> > + const struct nand_data_interface *conf) {
> > + struct anfc_nand_controller *nfc = to_anfc(chip->controller);
> > + struct anfc_nand_chip *achip = to_anfc_nand(chip);
> > + int err;
> > + const struct nand_sdr_timings *sdr;
> > + u32 inftimeval;
> > + bool change_sdr_clk = false;
> > +
> > + if (csline == NAND_DATA_IFACE_CHECK_ONLY)
> > + return 0;
> > +
> > + /*
> > + * If the controller is already in the same mode as flash device
> > + * then no need to change the timing mode again.
> > + */
> > + sdr = nand_get_sdr_timings(conf);
> > + if (IS_ERR(sdr))
> > + return PTR_ERR(sdr);
> > +
> > + if (sdr->mode < 0)
> > + return -ENOTSUPP;
> > +
> > + inftimeval = sdr->mode & 7;
> > + if (sdr->mode >= 2 && sdr->mode <= 5)
> > + change_sdr_clk = true;
> > + /*
> > + * SDR timing modes 2-5 will not work for the arasan nand when
> > + * freq > 90 MHz, so reduce the freq in SDR modes 2-5 to < 90Mhz
>
> What's the freq for mode 0 and 1?
It is 100MHz in SDR mode 0 and 1.
>
> > + */
> > + if (change_sdr_clk) {
> > + clk_disable_unprepare(nfc->clk_sys);
> > + err = clk_set_rate(nfc->clk_sys, SDR_MODE_DEFLT_FREQ);
>
> You should not change the clk rate here. It should be done when the chip is selected, so that,
> if you ever have 2 different chips connected to the same controller, you can adapt the rate
> when they are accessed.
Ok, got it. I will update.
>
> > + if (err) {
> > + dev_err(nfc->dev, "Can't set the clock rate\n");
> > + return err;
> > + }
> > + err = clk_prepare_enable(nfc->clk_sys);
> > + if (err) {
> > + dev_err(nfc->dev, "Unable to enable sys clock.\n");
> > + clk_disable_unprepare(nfc->clk_sys);
> > + return err;
> > + }
> > + }
> > + achip->inftimeval = inftimeval;
> > +
> > + return 0;
> > +}
> > +
> > +static int anfc_nand_attach_chip(struct nand_chip *chip) {
> > + struct mtd_info *mtd = nand_to_mtd(chip);
> > + struct anfc_nand_chip *achip = to_anfc_nand(chip);
> > + u32 ret;
> > +
> > + if (mtd->writesize <= SZ_512)
> > + achip->spare_caddr_cycles = 1;
> > + else
> > + achip->spare_caddr_cycles = 2;
> > +
> > + chip->ecc.calc_buf = kmalloc(mtd->oobsize, GFP_KERNEL);
> > + chip->ecc.code_buf = kmalloc(mtd->oobsize, GFP_KERNEL);
>
> Those bufs are allocated but never freed (memleak). Also, are you sure you really need them.
These bufs are freed in nand_release(), which is calling from anfc_remove().
And chip->ecc.code_buf, is used in anfc_read_page_hwecc().
What we are doing here is, extract ECC code from OOB and place it in ecv.code_buf, and passing this info to nand_check_ecc_chunk(buf, chip->ecc.size, &ecc_code[i], eccbytes, NULL, 0,chip->ecc.strength).
i.e. just to store ECC code from OOB area.
And chip->ecc.calc_buf is no where used in the driver, I will remove it.
Thanks,
Naga Sureshkumar Relli.
Hi Boris & Miquel,
An update to my comments on thread https://lkml.org/lkml/2018/11/15/656.
In this I said, will take a default error count value as 16 and during page read, will check the error count
Register value with this and if it is equal to or greater than the default count(16) then I am checking for
Erased pages.
But bit[7:0] in ECC_Error_Count_Register(0x38) will update for each error occurred.
Link: https://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-ultrascale-registers.html and
check for NAND module, ECC_Error_Count_Register.
I mean previously I dependent on Total error count value (bit[16:8]), but we can simply check for bit[7:0]
To see the error occurred or not.
I tried with this approach and I don't see any issues with that.
I ran ubifs with this and I am able to see the bit[7:0] count is updated for erased page read and then will
Use nand_chech_erased_ecc_chunk() to see the bitflips.
If it is ok, I will update the driver and will send new patch, but do you have any other comments on v12?
Thanks,
Naga Sureshkumar Relli
> -----Original Message-----
> From: linux-mtd [mailto:[email protected]] On Behalf Of Naga
> Sureshkumar Relli
> Sent: Friday, November 23, 2018 7:24 PM
> To: Miquel Raynal <[email protected]>; Boris Brezillon
> <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; Michal Simek
> <[email protected]>; [email protected]; [email protected]
> Subject: RE: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> NAND Flash Controller
>
> Hi Boris & Miquel,
>
> > -----Original Message-----
> > From: Miquel Raynal [mailto:[email protected]]
> > Sent: Tuesday, November 20, 2018 6:06 PM
> > To: Boris Brezillon <[email protected]>
> > Cc: Naga Sureshkumar Relli <[email protected]>; [email protected];
> > [email protected]; [email protected]; [email protected]; linux-
> > [email protected]; [email protected]; [email protected];
> > [email protected]; Michal Simek <[email protected]>
> > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> > NAND Flash Controller
> >
> > Hi Naga,
> >
> > Boris Brezillon <[email protected]> wrote on Tue, 20 Nov 2018
> > 12:02:44 +0100:
> >
> > > On Tue, 20 Nov 2018 07:02:08 +0000
> > > Naga Sureshkumar Relli <[email protected]> wrote:
> > >
> > >
> > > > >
> > > > > Can you please run nandbiterrs (availaible in mtd-utils). I fear your
> > > > > device won't pass the test.
> > > > Yes, nandbiterror test is passing till 24bit, after that it is failing.
> > >
> > > Can you paste the output of nandbiterrs please?
> >
> > Apparently 'nandbiterrs -i 'just crashes the kernel because of a segmentation fault. Please
> run
> > this test (from the mtd-utils package) and fix this issue. Then we would like to see the
> output.
> Here is the output of mtd_nandbiterrs,
> [ 1830.546807] mtd_nandbiterrs: verify_page
> [ 1830.551924] mtd_nandbiterrs: Successfully corrected 8 bit errors per subpage
> [ 1830.558961] mtd_nandbiterrs: Inserted biterror @ 2/5
> [ 1830.563917] mtd_nandbiterrs: rewrite page
> [ 1830.568216] mtd_nandbiterrs: read_page
> [ 1830.572155] mtd_nandbiterrs: verify_page
> [ 1830.576531] mtd_nandbiterrs: Successfully corrected 9 bit errors per subpage
> [ 1830.583568] mtd_nandbiterrs: Inserted biterror @ 2/2
> [ 1830.588527] mtd_nandbiterrs: rewrite page
> [ 1830.592881] mtd_nandbiterrs: read_page
> [ 1830.596825] mtd_nandbiterrs: verify_page
> [ 1830.601197] mtd_nandbiterrs: Successfully corrected 10 bit errors per subpage
> [ 1830.608326] mtd_nandbiterrs: Inserted biterror @ 2/0
> [ 1830.613279] mtd_nandbiterrs: rewrite page
> [ 1830.617585] mtd_nandbiterrs: read_page
> [ 1830.621524] mtd_nandbiterrs: verify_page
> [ 1830.625900] mtd_nandbiterrs: Successfully corrected 11 bit errors per subpage
> [ 1830.633027] mtd_nandbiterrs: Inserted biterror @ 3/7
> [ 1830.637984] mtd_nandbiterrs: rewrite page
> [ 1830.642281] mtd_nandbiterrs: read_page
> [ 1830.646223] mtd_nandbiterrs: verify_page
> [ 1830.650595] mtd_nandbiterrs: Successfully corrected 12 bit errors per subpage
> [ 1830.657724] mtd_nandbiterrs: Inserted biterror @ 3/6
> [ 1830.662677] mtd_nandbiterrs: rewrite page
> [ 1830.666983] mtd_nandbiterrs: read_page
> [ 1830.670922] mtd_nandbiterrs: verify_page
> [ 1830.675296] mtd_nandbiterrs: Successfully corrected 13 bit errors per subpage
> [ 1830.682417] mtd_nandbiterrs: Inserted biterror @ 3/5
> [ 1830.687373] mtd_nandbiterrs: rewrite page
> [ 1830.691671] mtd_nandbiterrs: read_page
> [ 1830.695610] mtd_nandbiterrs: verify_page
> [ 1830.699983] mtd_nandbiterrs: Successfully corrected 14 bit errors per subpage
> [ 1830.707113] mtd_nandbiterrs: Inserted biterror @ 3/2
> [ 1830.712067] mtd_nandbiterrs: rewrite page
> [ 1830.716494] mtd_nandbiterrs: read_page
> [ 1830.720459] mtd_nandbiterrs: verify_page
> [ 1830.724841] mtd_nandbiterrs: Successfully corrected 15 bit errors per subpage
> [ 1830.731963] mtd_nandbiterrs: Inserted biterror @ 3/0
> [ 1830.736920] mtd_nandbiterrs: rewrite page
> [ 1830.741161] mtd_nandbiterrs: read_page
> [ 1830.745107] mtd_nandbiterrs: verify_page
> [ 1830.749478] mtd_nandbiterrs: Successfully corrected 16 bit errors per subpage
> [ 1830.756607] mtd_nandbiterrs: Inserted biterror @ 4/2
> [ 1830.761564] mtd_nandbiterrs: rewrite page
> [ 1830.765924] mtd_nandbiterrs: read_page
> [ 1830.769858] mtd_nandbiterrs: verify_page
> [ 1830.774232] mtd_nandbiterrs: Successfully corrected 17 bit errors per subpage
> [ 1830.781362] mtd_nandbiterrs: Inserted biterror @ 4/0
> [ 1830.786318] mtd_nandbiterrs: rewrite page
> [ 1830.790558] mtd_nandbiterrs: read_page
> [ 1830.794496] mtd_nandbiterrs: verify_page
> [ 1830.798867] mtd_nandbiterrs: Successfully corrected 18 bit errors per subpage
> [ 1830.805997] mtd_nandbiterrs: Inserted biterror @ 5/7
> [ 1830.810949] mtd_nandbiterrs: rewrite page
> [ 1830.815249] mtd_nandbiterrs: read_page
> [ 1830.819189] mtd_nandbiterrs: verify_page
> [ 1830.823561] mtd_nandbiterrs: Successfully corrected 19 bit errors per subpage
> [ 1830.830690] mtd_nandbiterrs: Inserted biterror @ 5/2
> [ 1830.835646] mtd_nandbiterrs: rewrite page
> [ 1830.839943] mtd_nandbiterrs: read_page
> [ 1830.843886] mtd_nandbiterrs: verify_page
> [ 1830.848252] mtd_nandbiterrs: Successfully corrected 20 bit errors per subpage
> [ 1830.855378] mtd_nandbiterrs: Inserted biterror @ 5/0
> [ 1830.860331] mtd_nandbiterrs: rewrite page
> [ 1830.864580] mtd_nandbiterrs: read_page
> [ 1830.868522] mtd_nandbiterrs: verify_page
> [ 1830.872890] mtd_nandbiterrs: Successfully corrected 21 bit errors per subpage
> [ 1830.880023] mtd_nandbiterrs: Inserted biterror @ 6/6
> [ 1830.884975] mtd_nandbiterrs: rewrite page
> [ 1830.889224] mtd_nandbiterrs: read_page
> [ 1830.893158] mtd_nandbiterrs: verify_page
> [ 1830.897536] mtd_nandbiterrs: Successfully corrected 22 bit errors per subpage
> [ 1830.904663] mtd_nandbiterrs: Inserted biterror @ 6/2
> [ 1830.909619] mtd_nandbiterrs: rewrite page
> [ 1830.913950] mtd_nandbiterrs: read_page
> [ 1830.917893] mtd_nandbiterrs: verify_page
> [ 1830.922261] mtd_nandbiterrs: Successfully corrected 23 bit errors per subpage
> [ 1830.929384] mtd_nandbiterrs: Inserted biterror @ 6/0
> [ 1830.934340] mtd_nandbiterrs: rewrite page
> [ 1830.938579] mtd_nandbiterrs: read_page
> [ 1830.942519] mtd_nandbiterrs: verify_page
> [ 1830.946884] mtd_nandbiterrs: Successfully corrected 24 bit errors per subpage
> [ 1830.954010] mtd_nandbiterrs: Inserted biterror @ 7/7
> [ 1830.958963] mtd_nandbiterrs: rewrite page
> [ 1830.963264] mtd_nandbiterrs: read_page
> [ 1830.967143] mtd_nandbiterrs: verify_page
> [ 1830.971061] mtd_nandbiterrs: Error: page offset 0, expected 25, got 00
> [ 1830.977584] mtd_nandbiterrs: Error: page offset 1, expected a5, got 00
> [ 1830.984103] mtd_nandbiterrs: Error: page offset 2, expected 65, got 00
> [ 1830.990621] mtd_nandbiterrs: Error: page offset 3, expected e5, got 00
> [ 1830.997141] mtd_nandbiterrs: Error: page offset 4, expected 05, got 00
> [ 1831.003659] mtd_nandbiterrs: Error: page offset 5, expected 85, got 00
> [ 1831.010179] mtd_nandbiterrs: Error: page offset 6, expected 45, got 00
> [ 1831.016695] mtd_nandbiterrs: Error: page offset 7, expected c5, got 45
> [ 1831.023665] mtd_nandbiterrs: ECC failure, read data is incorrect despite read success
> modprobe: can't load module mtd_nandbiterrs
> (kernel/drivers/mtd/tests/mtd_nandbiterrs.ko): Input/output error
> ---> Test fail, unable to start nand_mtd_nandbiterrs client on the target
> I ran this on v12 series, but it didn't work straight away. I changed the code to make it work
> for this test.
> I found one problem that, the driver will work always if the page programming sequence 0x80
> followed by 0x10.
> i.e.
> [1]:nand_prog_page_op(chip, page, 0, buf, mtd->writesize)-> this op sequence is working
> with this driver.
> [2]: nand_prog_page_begin_op(chip, page, 0, NULL, 0) -> this op sequence is not working
> with this driver.
> The Arasan NAND controller is expecting 0x80 as first opcode and 0x10 as second opcode in
> the command register (off: 0xFF10000C).
> Hence in v11 series, I have added a check such that if the command is 0x080, then hardcode
> the second command as 0x10.
> But as per the Boris comments, I removed that hardcoding and it is working only if the write
> sequence is [1] as mentioned above.
>
> >
> > >
> > > > >
> > > > > > But we are hitting this because of erased page reading(needed in case of ubifs).
> > > > > >
> > > > > > >
> > > > > > > Don't you have a bit (or several bits) reporting when the ECC engine was not
> > able to
> > > > > correct
> > > > > > > data? I you do, you should base the "detect bitflips in erase pages" logic on this
> > information.
> > > > > > Bit reporting for several bit errors is there only for Hamming(1bit correction and
> > 2bit
> > > > > detection) but not in BCH.
> > > > > >
> > > > >
> > > > > Then I tend to agree with Miquel: your ECC engine is broken, and I'm
> > > > > not even sure how to deal with that yet.
> > > > So as per the Miquel's suggestion, can I proceed to add the below one?
> > > > "you should re-read the page in raw mode and check for the number of bitflips manually
> > (thanks to the helpers in the core). Again, if the number of BF is above 16, we can assume
> the
> > page is bad and increment ->ecc.failed accordingly."
> > >
> > > But that's just partially fixing the problem. And you didn't answer my
> > > previous question: what happens when you configure the ECC engine in,
> > > say 12bit/1024 and you end up with uncorrectable errors (more than 12
> > > bitflips in a 1k block). What's the number reported ECC_ERR_CNT? Is it
> > > set to 13?
> >
> > Please dump this register, and eventually what's the value of the Packet_bound_Err_count
> > field ([0:7]) for each iteration of nandbiterrs -i.
> > If there is no way, when the status bit is set, to discriminate if the data is reliable or was not
> > corrected at all, it is gonna be a real issue and I don't think we want to support such engine.
> On each iteration the error count value that I got during this test, is equal to the number of
> error bits introduced
> i.e. for 1-bit error, the error count is 1
> .......
> 24-bit errors, the error count is 24
> But after that the error count is 0.
>
> Thanks,
> Naga Sureshkumar Relli
> >
> >
> > Thanks,
> > Miquèl
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
Hi Naga,
Naga Sureshkumar Relli <[email protected]> wrote on Wed, 12 Dec 2018
05:27:03 +0000:
> Hi Boris & Miquel,
>
> An update to my comments on thread https://lkml.org/lkml/2018/11/15/656.
> In this I said, will take a default error count value as 16 and during page read, will check the error count
> Register value with this and if it is equal to or greater than the default count(16) then I am checking for
> Erased pages.
> But bit[7:0] in ECC_Error_Count_Register(0x38) will update for each error occurred.
> Link: https://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-ultrascale-registers.html and
> check for NAND module, ECC_Error_Count_Register.
>
> I mean previously I dependent on Total error count value (bit[16:8]), but we can simply check for bit[7:0]
> To see the error occurred or not.
> I tried with this approach and I don't see any issues with that.
> I ran ubifs with this and I am able to see the bit[7:0] count is updated for erased page read and then will
> Use nand_chech_erased_ecc_chunk() to see the bitflips.
>
> If it is ok, I will update the driver and will send new patch, but do you have any other comments on v12?
Is 'nandbiterrs -i' running correctly now?
Thanks,
Miquèl
Hi Miquel,
> -----Original Message-----
> From: Miquel Raynal [mailto:[email protected]]
> Sent: Wednesday, December 12, 2018 1:42 PM
> To: Naga Sureshkumar Relli <[email protected]>
> Cc: Boris Brezillon <[email protected]>; [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]; Michal Simek <[email protected]>;
> [email protected]; [email protected]
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> NAND Flash Controller
>
> Hi Naga,
>
> Naga Sureshkumar Relli <[email protected]> wrote on Wed, 12 Dec 2018
> 05:27:03 +0000:
>
> > Hi Boris & Miquel,
> >
> > An update to my comments on thread https://lkml.org/lkml/2018/11/15/656.
> > In this I said, will take a default error count value as 16 and during
> > page read, will check the error count Register value with this and if
> > it is equal to or greater than the default count(16) then I am checking for Erased pages.
> > But bit[7:0] in ECC_Error_Count_Register(0x38) will update for each error occurred.
> > Link:
> > https://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-ultrascale-
> registers.html and check for NAND module, ECC_Error_Count_Register.
> >
> > I mean previously I dependent on Total error count value (bit[16:8]),
> > but we can simply check for bit[7:0] To see the error occurred or not.
> > I tried with this approach and I don't see any issues with that.
> > I ran ubifs with this and I am able to see the bit[7:0] count is
> > updated for erased page read and then will Use nand_chech_erased_ecc_chunk() to see the
> bitflips.
> >
> > If it is ok, I will update the driver and will send new patch, but do you have any other
> comments on v12?
>
> Is 'nandbiterrs -i' running correctly now?
Yes, but with some changes in driver.
I have added the log and changes done in https://lkml.org/lkml/2018/11/23/705.
Do you see any issues with that approach?
Thanks,
Naga Sureshkumar Relli.
Hi Naga,
Naga Sureshkumar Relli <[email protected]> wrote on Wed, 12 Dec 2018
09:04:16 +0000:
> Hi Miquel,
>
> > -----Original Message-----
> > From: Miquel Raynal [mailto:[email protected]]
> > Sent: Wednesday, December 12, 2018 1:42 PM
> > To: Naga Sureshkumar Relli <[email protected]>
> > Cc: Boris Brezillon <[email protected]>; [email protected]; [email protected]; linux-
> > [email protected]; [email protected]; [email protected];
> > [email protected]; Michal Simek <[email protected]>;
> > [email protected]; [email protected]
> > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> > NAND Flash Controller
> >
> > Hi Naga,
> >
> > Naga Sureshkumar Relli <[email protected]> wrote on Wed, 12 Dec 2018
> > 05:27:03 +0000:
> >
> > > Hi Boris & Miquel,
> > >
> > > An update to my comments on thread https://lkml.org/lkml/2018/11/15/656.
> > > In this I said, will take a default error count value as 16 and during
> > > page read, will check the error count Register value with this and if
> > > it is equal to or greater than the default count(16) then I am checking for Erased pages.
> > > But bit[7:0] in ECC_Error_Count_Register(0x38) will update for each error occurred.
> > > Link:
> > > https://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-ultrascale-
> > registers.html and check for NAND module, ECC_Error_Count_Register.
> > >
> > > I mean previously I dependent on Total error count value (bit[16:8]),
> > > but we can simply check for bit[7:0] To see the error occurred or not.
> > > I tried with this approach and I don't see any issues with that.
> > > I ran ubifs with this and I am able to see the bit[7:0] count is
> > > updated for erased page read and then will Use nand_chech_erased_ecc_chunk() to see the
> > bitflips.
> > >
> > > If it is ok, I will update the driver and will send new patch, but do you have any other
> > comments on v12?
> >
> > Is 'nandbiterrs -i' running correctly now?
> Yes, but with some changes in driver.
> I have added the log and changes done in https://lkml.org/lkml/2018/11/23/705.
No, I don't see a working nandbiterrs there, sorry.
Thanks,
Miquèl
Hi Miquel,
> -----Original Message-----
> From: Miquel Raynal [mailto:[email protected]]
> Sent: Wednesday, December 12, 2018 2:40 PM
> To: Naga Sureshkumar Relli <[email protected]>
> Cc: Boris Brezillon <[email protected]>; [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]; Michal Simek <[email protected]>;
> [email protected]; [email protected]
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> NAND Flash Controller
>
> Hi Naga,
>
> Naga Sureshkumar Relli <[email protected]> wrote on Wed, 12 Dec 2018
> 09:04:16 +0000:
>
> > Hi Miquel,
> >
> > > -----Original Message-----
> > > From: Miquel Raynal [mailto:[email protected]]
> > > Sent: Wednesday, December 12, 2018 1:42 PM
> > > To: Naga Sureshkumar Relli <[email protected]>
> > > Cc: Boris Brezillon <[email protected]>; [email protected];
> > > [email protected]; linux- [email protected];
> > > [email protected]; [email protected];
> > > [email protected]; Michal Simek <[email protected]>;
> > > [email protected]; [email protected]
> > > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support
> > > for Arasan NAND Flash Controller
> > >
> > > Hi Naga,
> > >
> > > Naga Sureshkumar Relli <[email protected]> wrote on Wed, 12 Dec
> > > 2018
> > > 05:27:03 +0000:
> > >
> > > > Hi Boris & Miquel,
> > > >
> > > > An update to my comments on thread https://lkml.org/lkml/2018/11/15/656.
> > > > In this I said, will take a default error count value as 16 and
> > > > during page read, will check the error count Register value with
> > > > this and if it is equal to or greater than the default count(16) then I am checking for
> Erased pages.
> > > > But bit[7:0] in ECC_Error_Count_Register(0x38) will update for each error occurred.
> > > > Link:
> > > > https://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-ultr
> > > > ascale-
> > > registers.html and check for NAND module, ECC_Error_Count_Register.
> > > >
> > > > I mean previously I dependent on Total error count value
> > > > (bit[16:8]), but we can simply check for bit[7:0] To see the error occurred or not.
> > > > I tried with this approach and I don't see any issues with that.
> > > > I ran ubifs with this and I am able to see the bit[7:0] count is
> > > > updated for erased page read and then will Use
> > > > nand_chech_erased_ecc_chunk() to see the
> > > bitflips.
> > > >
> > > > If it is ok, I will update the driver and will send new patch, but
> > > > do you have any other
> > > comments on v12?
> > >
> > > Is 'nandbiterrs -i' running correctly now?
> > Yes, but with some changes in driver.
> > I have added the log and changes done in https://lkml.org/lkml/2018/11/23/705.
>
> No, I don't see a working nandbiterrs there, sorry.
The log that I have attached is from mtd_nandbiterrs test
So as per ARASAN controller ECC mechanism, it will correct upto 24-bit. After that the test will fail.
I am running mtd-utils nandbiterr test now. Will let you know once I completed that.
Thanks,
Naga Sureshkumar Relli
>
>
> Thanks,
> Miquèl
Hi Naga,
Naga Sureshkumar Relli <[email protected]> wrote on Wed, 12 Dec 2018
13:07:42 +0000:
> Hi Miquel,
>
> > -----Original Message-----
> > From: Miquel Raynal [mailto:[email protected]]
> > Sent: Wednesday, December 12, 2018 2:40 PM
> > To: Naga Sureshkumar Relli <[email protected]>
> > Cc: Boris Brezillon <[email protected]>; [email protected]; [email protected]; linux-
> > [email protected]; [email protected]; [email protected];
> > [email protected]; Michal Simek <[email protected]>;
> > [email protected]; [email protected]
> > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> > NAND Flash Controller
> >
> > Hi Naga,
> >
> > Naga Sureshkumar Relli <[email protected]> wrote on Wed, 12 Dec 2018
> > 09:04:16 +0000:
> >
> > > Hi Miquel,
> > >
> > > > -----Original Message-----
> > > > From: Miquel Raynal [mailto:[email protected]]
> > > > Sent: Wednesday, December 12, 2018 1:42 PM
> > > > To: Naga Sureshkumar Relli <[email protected]>
> > > > Cc: Boris Brezillon <[email protected]>; [email protected];
> > > > [email protected]; linux- [email protected];
> > > > [email protected]; [email protected];
> > > > [email protected]; Michal Simek <[email protected]>;
> > > > [email protected]; [email protected]
> > > > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support
> > > > for Arasan NAND Flash Controller
> > > >
> > > > Hi Naga,
> > > >
> > > > Naga Sureshkumar Relli <[email protected]> wrote on Wed, 12 Dec
> > > > 2018
> > > > 05:27:03 +0000:
> > > >
> > > > > Hi Boris & Miquel,
> > > > >
> > > > > An update to my comments on thread https://lkml.org/lkml/2018/11/15/656.
> > > > > In this I said, will take a default error count value as 16 and
> > > > > during page read, will check the error count Register value with
> > > > > this and if it is equal to or greater than the default count(16) then I am checking for
> > Erased pages.
> > > > > But bit[7:0] in ECC_Error_Count_Register(0x38) will update for each error occurred.
> > > > > Link:
> > > > > https://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-ultr
> > > > > ascale-
> > > > registers.html and check for NAND module, ECC_Error_Count_Register.
> > > > >
> > > > > I mean previously I dependent on Total error count value
> > > > > (bit[16:8]), but we can simply check for bit[7:0] To see the error occurred or not.
> > > > > I tried with this approach and I don't see any issues with that.
> > > > > I ran ubifs with this and I am able to see the bit[7:0] count is
> > > > > updated for erased page read and then will Use
> > > > > nand_chech_erased_ecc_chunk() to see the
> > > > bitflips.
> > > > >
> > > > > If it is ok, I will update the driver and will send new patch, but
> > > > > do you have any other
> > > > comments on v12?
> > > >
> > > > Is 'nandbiterrs -i' running correctly now?
> > > Yes, but with some changes in driver.
> > > I have added the log and changes done in https://lkml.org/lkml/2018/11/23/705.
> >
> > No, I don't see a working nandbiterrs there, sorry.
> The log that I have attached is from mtd_nandbiterrs test
> So as per ARASAN controller ECC mechanism, it will correct upto 24-bit. After that the test will fail.
There is a distinction between:
1/ The driver fails to correct more than 24-bit and advertise the
caller that the page read is somehow corrupted.
2/ The driver fails to correct more than 24-bit but does not complain
about it. In our case, the caller (the test tool) will compare the
page written and read: if it do not match it means the driver is
broken because the driver reported a successful operation despite
the fact that it returned a corrupted page.
You are in the second case, we expect the driver to behave like in 1/.
>
> I am running mtd-utils nandbiterr test now. Will let you know once I completed that.
Yes please, prefer using the userspace tools.
Thanks,
Miquèl
Hi Miquel,
> -----Original Message-----
> From: Miquel Raynal [mailto:[email protected]]
> Sent: Wednesday, December 12, 2018 6:48 PM
> To: Naga Sureshkumar Relli <[email protected]>
> Cc: Boris Brezillon <[email protected]>; [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]; Michal Simek <[email protected]>;
> [email protected]; [email protected]
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> NAND Flash Controller
>
> Hi Naga,
>
> Naga Sureshkumar Relli <[email protected]> wrote on Wed, 12 Dec 2018
> 13:07:42 +0000:
>
> > Hi Miquel,
> >
> > > -----Original Message-----
> > > From: Miquel Raynal [mailto:[email protected]]
> > > Sent: Wednesday, December 12, 2018 2:40 PM
> > > To: Naga Sureshkumar Relli <[email protected]>
> > > Cc: Boris Brezillon <[email protected]>; [email protected];
> > > [email protected]; linux- [email protected];
> > > [email protected]; [email protected];
> > > [email protected]; Michal Simek <[email protected]>;
> > > [email protected]; [email protected]
> > > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support
> > > for Arasan NAND Flash Controller
> > >
> > > Hi Naga,
> > >
> > > Naga Sureshkumar Relli <[email protected]> wrote on Wed, 12 Dec
> > > 2018
> > > 09:04:16 +0000:
> > >
> > > > Hi Miquel,
> > > >
> > > > > -----Original Message-----
> > > > > From: Miquel Raynal [mailto:[email protected]]
> > > > > Sent: Wednesday, December 12, 2018 1:42 PM
> > > > > To: Naga Sureshkumar Relli <[email protected]>
> > > > > Cc: Boris Brezillon <[email protected]>;
> > > > > [email protected]; [email protected]; linux- [email protected];
> > > > > [email protected]; [email protected];
> > > > > [email protected]; Michal Simek <[email protected]>;
> > > > > [email protected]; [email protected]
> > > > > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add
> > > > > support for Arasan NAND Flash Controller
> > > > >
> > > > > Hi Naga,
> > > > >
> > > > > Naga Sureshkumar Relli <[email protected]> wrote on Wed, 12
> > > > > Dec
> > > > > 2018
> > > > > 05:27:03 +0000:
> > > > >
> > > > > > Hi Boris & Miquel,
> > > > > >
> > > > > > An update to my comments on thread https://lkml.org/lkml/2018/11/15/656.
> > > > > > In this I said, will take a default error count value as 16
> > > > > > and during page read, will check the error count Register
> > > > > > value with this and if it is equal to or greater than the
> > > > > > default count(16) then I am checking for
> > > Erased pages.
> > > > > > But bit[7:0] in ECC_Error_Count_Register(0x38) will update for each error
> occurred.
> > > > > > Link:
> > > > > > https://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-
> > > > > > ultr
> > > > > > ascale-
> > > > > registers.html and check for NAND module, ECC_Error_Count_Register.
> > > > > >
> > > > > > I mean previously I dependent on Total error count value
> > > > > > (bit[16:8]), but we can simply check for bit[7:0] To see the error occurred or not.
> > > > > > I tried with this approach and I don't see any issues with that.
> > > > > > I ran ubifs with this and I am able to see the bit[7:0] count
> > > > > > is updated for erased page read and then will Use
> > > > > > nand_chech_erased_ecc_chunk() to see the
> > > > > bitflips.
> > > > > >
> > > > > > If it is ok, I will update the driver and will send new patch,
> > > > > > but do you have any other
> > > > > comments on v12?
> > > > >
> > > > > Is 'nandbiterrs -i' running correctly now?
> > > > Yes, but with some changes in driver.
> > > > I have added the log and changes done in https://lkml.org/lkml/2018/11/23/705.
> > >
> > > No, I don't see a working nandbiterrs there, sorry.
> > The log that I have attached is from mtd_nandbiterrs test So as per
> > ARASAN controller ECC mechanism, it will correct upto 24-bit. After that the test will fail.
>
> There is a distinction between:
> 1/ The driver fails to correct more than 24-bit and advertise the
> caller that the page read is somehow corrupted.
> 2/ The driver fails to correct more than 24-bit but does not complain
> about it. In our case, the caller (the test tool) will compare the
> page written and read: if it do not match it means the driver is
> broken because the driver reported a successful operation despite
> the fact that it returned a corrupted page.
>
> You are in the second case, we expect the driver to behave like in 1/.
I tried with mtd-utils(nandbiterr test) and here is the output of that
root@xilinx-zc1751-dc2-2018_1:~# nandbiterrs -i /dev/mtd0
incremental biterrors test
Successfully corrected 0 bit errors per subpage
Inserted biterror @ 1/7
Successfully corrected 1 bit errors per subpage
Inserted biterror @ 3/7
Successfully corrected 2 bit errors per subpage
Insertedbiterror @ 5/7
Successfully corrected 3 bit errors per subpage
Inserted biterror @ 7/7
Successfully corrected 4 bit errors per subpage
Inserted biterror @ 8/7
Successfully corrected 5 bit errors per subpage
Inserted biterror @ 10/7
Successfully corrected 6 bit errors per subpage
Inserted biterror @ 12/7
Successfully corrected 7 bit errors per subpage
Inserted biterror @ 14/7
Successfully corrected 8 bit errors per subpage
Inserted biterror @ 17/7
Successfully corrected 9 bit errors per subpage
Inserted biterror @ 19/7
Successfully corrected 10 bit errors per subpage
Inserted biterror @ 21/7
Successfully corrected 11 bit errors per subpage
Inserted biterror @ 23/7
Successfully corrected 12 bit errors per subpage
Inserted biterror @ 24/7
Successfully corrected 13 bit errors per subpage
Inserted biterror @ 26/7
Successfully corrected 14 bit errors per subpage
Inserted biterror @ 28/7
Successfully corrected 15 bit errors per subpage
Inserted biterror @ 30/7
Successfully corrected 16 bit errors per subpage
Inserted biterror @ 32/7
Successfully corrected 17 bit errors per subpage
Inserted biterror @ 34/7
Successfully corrected 18 bit errors per subpage
Inserted biterror @ 36/7
Successfully corrected 19 bit errors per subpage
Inserted biterror @ 38/7
Successfully corrected 20 bit errors per subpage
Inserted biterror @ 41/7
Successfully corrected 21 bit errors per subpage
Inserted biterror @ 43/7
Successfully corrected 22 bit errors per subpage
Inserted biterror @ 45/7
Successfully corrected 23 bit errors per subpage
Inserted biterror @ 47/7
Successfully corrected 24 bit errors per subpage
Inserted biterror @ 48/7
Successfully corrected 25 bit errors per subpage
Inserted biterror @ 50/7
ECC failure, invalid data despite read success
root@xilinx-zc1751-dc2-2018_1:~#
But even in this case also, driver is saying ECC failure but read success.
That means controller is able to detect errors on read page up to 24 bit only.
After that there is no way to say to the upper layers that the page is bad because of the limitation in the controller.
Could you please suggest any alternative to report the errors in that case?
Thanks,
Naga Sureshkumar Relli
>
> >
> > I am running mtd-utils nandbiterr test now. Will let you know once I completed that.
>
> Yes please, prefer using the userspace tools.
>
>
> Thanks,
> Miquèl
Hi Naga,
[...]
> Inserted biterror @ 48/7
> Successfully corrected 25 bit errors per subpage
> Inserted biterror @ 50/7
> ECC failure, invalid data despite read success
> root@xilinx-zc1751-dc2-2018_1:~#
>
> But even in this case also, driver is saying ECC failure but read success.
> That means controller is able to detect errors on read page up to 24 bit only.
> After that there is no way to say to the upper layers that the page is bad because of the limitation in the controller.
This is more than a "limitation", the design is broken. I am not sure
how to support such controller, and I am not sure if we even want to.
> Could you please suggest any alternative to report the errors in that case?
Shall we support the controller without the hw ECC engine? Boris, any
thoughts?
Thanks,
Miquèl
Hi Miquel,
> -----Original Message-----
> From: Miquel Raynal [mailto:[email protected]]
> Sent: Monday, December 17, 2018 10:11 PM
> To: Naga Sureshkumar Relli <[email protected]>
> Cc: Boris Brezillon <[email protected]>; [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]; Michal Simek <[email protected]>;
> [email protected]; [email protected]
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> NAND Flash Controller
>
> Hi Naga,
>
> [...]
>
> > Inserted biterror @ 48/7
> > Successfully corrected 25 bit errors per subpage Inserted biterror @
> > 50/7 ECC failure, invalid data despite read success
> > root@xilinx-zc1751-dc2-2018_1:~#
> >
> > But even in this case also, driver is saying ECC failure but read success.
> > That means controller is able to detect errors on read page up to 24 bit only.
> > After that there is no way to say to the upper layers that the page is bad because of the
> limitation in the controller.
>
> This is more than a "limitation", the design is broken. I am not sure how to support such
> controller, and I am not sure if we even want to.
The number of errors that are correctable is limited by a parameter 't'(total number of errors),
If there is a condition that the number of errors greater than 't', then the controller won't be able to detect that.
I guess this concept is same for other controllers as well.
In Arasan it is limited to 24-bit.
Even, in case of Hamming, it is 1-bit error correction and 2-bit error detection.
What will happen if there are multiple errors(greater than 2-bit)?
Thanks,
Naga Sureshkumar Relli
>
> > Could you please suggest any alternative to report the errors in that case?
>
> Shall we support the controller without the hw ECC engine? Boris, any thoughts?
>
>
> Thanks,
> Miquèl
Hi Naga,
+ Martin
Naga Sureshkumar Relli <[email protected]> wrote on Tue, 18 Dec 2018
05:33:53 +0000:
> Hi Miquel,
>
> > -----Original Message-----
> > From: Miquel Raynal [mailto:[email protected]]
> > Sent: Monday, December 17, 2018 10:11 PM
> > To: Naga Sureshkumar Relli <[email protected]>
> > Cc: Boris Brezillon <[email protected]>; [email protected]; [email protected]; linux-
> > [email protected]; [email protected]; [email protected];
> > [email protected]; Michal Simek <[email protected]>;
> > [email protected]; [email protected]
> > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> > NAND Flash Controller
> >
> > Hi Naga,
> >
> > [...]
> >
> > > Inserted biterror @ 48/7
> > > Successfully corrected 25 bit errors per subpage Inserted biterror @
> > > 50/7 ECC failure, invalid data despite read success
> > > root@xilinx-zc1751-dc2-2018_1:~#
> > >
> > > But even in this case also, driver is saying ECC failure but read success.
> > > That means controller is able to detect errors on read page up to 24 bit only.
> > > After that there is no way to say to the upper layers that the page is bad because of the
> > limitation in the controller.
> >
> > This is more than a "limitation", the design is broken. I am not sure how to support such
> > controller, and I am not sure if we even want to.
>
> The number of errors that are correctable is limited by a parameter 't'(total number of errors),
> If there is a condition that the number of errors greater than 't', then the controller won't be able to detect that.
> I guess this concept is same for other controllers as well.
> In Arasan it is limited to 24-bit.
>
> Even, in case of Hamming, it is 1-bit error correction and 2-bit error detection.
> What will happen if there are multiple errors(greater than 2-bit)?
Ok let's use the Hamming comparison in your ECC engine case.
-> hamming:
* 0 bf: everything is fine
* 1 bf: will be detected, corrected, signaled
* 2 bf: will be detected, not corrected, signaled
* 3+ bf: don't care
-> BCH:
* 0 bf: everything is fine
* 1-24 bf: will be detected, corrected, signaled
* 25 bf: everything is fine
* 26+ bf: don't care
Do you see the problem?
In the 25 bf case, the controller is reporting that everything went
fine while it should report that it detected an uncorrectable
situation.
Here are two leads to solve this issue, please investigate them both:
1/ Talk to your colleagues that developed the RTL, ask if there is a
hidden/reserved bit for that purpose that is not documented.
2/ Search for a status in the registers that might indicate that an
error occurred, for instance "0 bf corrected" and "bf have been
detected".
NB: I know that, with a BCH ECC engine, error detection at (strength +
1) is not 100% sure but statistically it will almost always be detected
and in this case we need the controller to warn the user!
Thanks,
Miquèl
Hi Miquel,
> -----Original Message-----
> From: Miquel Raynal [mailto:[email protected]]
> Sent: Wednesday, December 19, 2018 7:57 PM
> To: Naga Sureshkumar Relli <[email protected]>
> Cc: Boris Brezillon <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Michal Simek <[email protected]>;
> [email protected]; [email protected]; [email protected]
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> NAND Flash Controller
>
> Hi Naga,
>
> + Martin
>
> Naga Sureshkumar Relli <[email protected]> wrote on Tue, 18 Dec 2018
> 05:33:53 +0000:
>
> > Hi Miquel,
> >
> > > -----Original Message-----
> > > From: Miquel Raynal [mailto:[email protected]]
> > > Sent: Monday, December 17, 2018 10:11 PM
> > > To: Naga Sureshkumar Relli <[email protected]>
> > > Cc: Boris Brezillon <[email protected]>; [email protected];
> > > [email protected]; linux- [email protected];
> > > [email protected]; [email protected];
> > > [email protected]; Michal Simek <[email protected]>;
> > > [email protected]; [email protected]
> > > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support
> > > for Arasan NAND Flash Controller
> > >
> > > Hi Naga,
> > >
> > > [...]
> > >
> > > > Inserted biterror @ 48/7
> > > > Successfully corrected 25 bit errors per subpage Inserted biterror
> > > > @
> > > > 50/7 ECC failure, invalid data despite read success
> > > > root@xilinx-zc1751-dc2-2018_1:~#
> > > >
> > > > But even in this case also, driver is saying ECC failure but read success.
> > > > That means controller is able to detect errors on read page up to 24 bit only.
> > > > After that there is no way to say to the upper layers that the
> > > > page is bad because of the
> > > limitation in the controller.
> > >
> > > This is more than a "limitation", the design is broken. I am not
> > > sure how to support such controller, and I am not sure if we even want to.
> >
> > The number of errors that are correctable is limited by a parameter
> > 't'(total number of errors), If there is a condition that the number of errors greater than 't',
> then the controller won't be able to detect that.
> > I guess this concept is same for other controllers as well.
> > In Arasan it is limited to 24-bit.
> >
> > Even, in case of Hamming, it is 1-bit error correction and 2-bit error detection.
> > What will happen if there are multiple errors(greater than 2-bit)?
>
> Ok let's use the Hamming comparison in your ECC engine case.
>
> -> hamming:
> * 0 bf: everything is fine
> * 1 bf: will be detected, corrected, signaled
> * 2 bf: will be detected, not corrected, signaled
> * 3+ bf: don't care
>
> -> BCH:
> * 0 bf: everything is fine
> * 1-24 bf: will be detected, corrected, signaled
> * 25 bf: everything is fine
> * 26+ bf: don't care
>
> Do you see the problem?
No.
>
> In the 25 bf case, the controller is reporting that everything went fine while it should report
> that it detected an uncorrectable situation.
>
> Here are two leads to solve this issue, please investigate them both:
> 1/ Talk to your colleagues that developed the RTL, ask if there is a
> hidden/reserved bit for that purpose that is not documented.
I spoke to RTL guys, there is nothing hidden/reserved bit for this purpose.
I tried reading the status registers reserved bits, but they are raz(read as zero)
> 2/ Search for a status in the registers that might indicate that an
> error occurred, for instance "0 bf corrected" and "bf have been
> detected".
I tried reading status registers and other registers as well, but no luck.
>
> NB: I know that, with a BCH ECC engine, error detection at (strength +
> 1) is not 100% sure but statistically it will almost always be detected and in this case we
> need the controller to warn the user!
Ok, I understood now.
Thanks,
Naga Sureshkumar Relli
>
>
> Thanks,
> Miquèl
Hi Boris & Miquel,
Could you please provide your thoughts on this driver to support HW-ECC?
As I said previously, there is no way to detect errors beyond N bit.
I am ok to update the driver based on your inputs.
Thanks,
Naga Sureshkumar Relli
> -----Original Message-----
> From: linux-mtd [mailto:[email protected]] On Behalf Of Naga
> Sureshkumar Relli
> Sent: Friday, December 21, 2018 1:06 PM
> To: Miquel Raynal <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; martin.lund@keep-it-
> simple.com; [email protected]; Boris Brezillon <[email protected]>;
> [email protected]; [email protected]; Michal Simek
> <[email protected]>; [email protected]; [email protected]
> Subject: RE: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> NAND Flash Controller
>
> Hi Miquel,
>
> > -----Original Message-----
> > From: Miquel Raynal [mailto:[email protected]]
> > Sent: Wednesday, December 19, 2018 7:57 PM
> > To: Naga Sureshkumar Relli <[email protected]>
> > Cc: Boris Brezillon <[email protected]>; [email protected];
> > [email protected]; [email protected]; [email protected];
> > [email protected]; [email protected]; Michal Simek
> > <[email protected]>; [email protected];
> > [email protected]; [email protected]
> > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support
> > for Arasan NAND Flash Controller
> >
> > Hi Naga,
> >
> > + Martin
> >
> > Naga Sureshkumar Relli <[email protected]> wrote on Tue, 18 Dec 2018
> > 05:33:53 +0000:
> >
> > > Hi Miquel,
> > >
> > > > -----Original Message-----
> > > > From: Miquel Raynal [mailto:[email protected]]
> > > > Sent: Monday, December 17, 2018 10:11 PM
> > > > To: Naga Sureshkumar Relli <[email protected]>
> > > > Cc: Boris Brezillon <[email protected]>;
> > > > [email protected]; [email protected]; linux- [email protected];
> > > > [email protected]; [email protected];
> > > > [email protected]; Michal Simek <[email protected]>;
> > > > [email protected]; [email protected]
> > > > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add
> > > > support for Arasan NAND Flash Controller
> > > >
> > > > Hi Naga,
> > > >
> > > > [...]
> > > >
> > > > > Inserted biterror @ 48/7
> > > > > Successfully corrected 25 bit errors per subpage Inserted
> > > > > biterror @
> > > > > 50/7 ECC failure, invalid data despite read success
> > > > > root@xilinx-zc1751-dc2-2018_1:~#
> > > > >
> > > > > But even in this case also, driver is saying ECC failure but read success.
> > > > > That means controller is able to detect errors on read page up to 24 bit only.
> > > > > After that there is no way to say to the upper layers that the
> > > > > page is bad because of the
> > > > limitation in the controller.
> > > >
> > > > This is more than a "limitation", the design is broken. I am not
> > > > sure how to support such controller, and I am not sure if we even want to.
> > >
> > > The number of errors that are correctable is limited by a parameter
> > > 't'(total number of errors), If there is a condition that the number
> > > of errors greater than 't',
> > then the controller won't be able to detect that.
> > > I guess this concept is same for other controllers as well.
> > > In Arasan it is limited to 24-bit.
> > >
> > > Even, in case of Hamming, it is 1-bit error correction and 2-bit error detection.
> > > What will happen if there are multiple errors(greater than 2-bit)?
> >
> > Ok let's use the Hamming comparison in your ECC engine case.
> >
> > -> hamming:
> > * 0 bf: everything is fine
> > * 1 bf: will be detected, corrected, signaled
> > * 2 bf: will be detected, not corrected, signaled
> > * 3+ bf: don't care
> >
> > -> BCH:
> > * 0 bf: everything is fine
> > * 1-24 bf: will be detected, corrected, signaled
> > * 25 bf: everything is fine
> > * 26+ bf: don't care
> >
> > Do you see the problem?
> No.
> >
> > In the 25 bf case, the controller is reporting that everything went
> > fine while it should report that it detected an uncorrectable situation.
> >
> > Here are two leads to solve this issue, please investigate them both:
> > 1/ Talk to your colleagues that developed the RTL, ask if there is a
> > hidden/reserved bit for that purpose that is not documented.
> I spoke to RTL guys, there is nothing hidden/reserved bit for this purpose.
> I tried reading the status registers reserved bits, but they are raz(read as zero)
>
> > 2/ Search for a status in the registers that might indicate that an
> > error occurred, for instance "0 bf corrected" and "bf have been
> > detected".
> I tried reading status registers and other registers as well, but no luck.
> >
> > NB: I know that, with a BCH ECC engine, error detection at (strength +
> > 1) is not 100% sure but statistically it will almost always be
> > detected and in this case we need the controller to warn the user!
> Ok, I understood now.
>
> Thanks,
> Naga Sureshkumar Relli
> >
> >
> > Thanks,
> > Miquèl
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
Hi Naga,
Naga Sureshkumar Relli <[email protected]> wrote on Mon, 28 Jan 2019
06:04:53 +0000:
> Hi Boris & Miquel,
>
> Could you please provide your thoughts on this driver to support HW-ECC?
> As I said previously, there is no way to detect errors beyond N bit.
> I am ok to update the driver based on your inputs.
We won't support the ECC engine. It simply cannot be used reliably.
I am working on a generic ECC engine object. It's gonna take a few
months until it gets merged but after that you could update the
controller driver to drop any ECC-related function. Although the ECC
engine part is blocking, raw access still look wrong and the driver
still needs changes.
Thanks,
Miquèl
Hi Miquel,
> -----Original Message-----
> From: Miquel Raynal [mailto:[email protected]]
> Sent: Monday, January 28, 2019 2:58 PM
> To: Naga Sureshkumar Relli <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; martin.lund@keep-it-
> simple.com; [email protected]; Boris Brezillon <[email protected]>;
> [email protected]; [email protected]; Michal Simek
> <[email protected]>; [email protected]; [email protected]
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> NAND Flash Controller
>
> Hi Naga,
>
> Naga Sureshkumar Relli <[email protected]> wrote on Mon, 28 Jan 2019
> 06:04:53 +0000:
>
> > Hi Boris & Miquel,
> >
> > Could you please provide your thoughts on this driver to support HW-ECC?
> > As I said previously, there is no way to detect errors beyond N bit.
> > I am ok to update the driver based on your inputs.
>
> We won't support the ECC engine. It simply cannot be used reliably.
>
> I am working on a generic ECC engine object. It's gonna take a few months until it gets
> merged but after that you could update the controller driver to drop any ECC-related
> function. Although the ECC engine part is blocking, raw access still look wrong and the
> driver still needs changes.
Thank you for the update.
Yes, currently ECC engine part is blocking.
Also whenever you have time, please provide your comments to the driver v12 patch.
I will update those.
Thanks,
Naga Sureshkumar Relli
>
>
> Thanks,
> Miquèl
On Mon, Jan 28, 2019 at 10:27:39AM +0100, Miquel Raynal wrote:
Hi Miquel,
> Hi Naga,
>
> Naga Sureshkumar Relli <[email protected]> wrote on Mon, 28 Jan 2019
> 06:04:53 +0000:
>
> > Hi Boris & Miquel,
> >
> > Could you please provide your thoughts on this driver to support HW-ECC?
> > As I said previously, there is no way to detect errors beyond N bit.
> > I am ok to update the driver based on your inputs.
>
> We won't support the ECC engine. It simply cannot be used reliably.
>
> I am working on a generic ECC engine object. It's gonna take a few
> months until it gets merged but after that you could update the
> controller driver to drop any ECC-related function. Although the ECC
Could you please let me know that, when can we expect generic ECC engine
update in mtd NAND?
Based on that, i will plan to update the ARASAN NAND driver along with your
comments mentioned above under this update,
as you know there is a limiation in ARASAN NAND controller to detect
ECC errors.
i am following this series https://patchwork.kernel.org/patch/10838705/
Thanks,
Naga Sureshkumar Relli
> engine part is blocking, raw access still look wrong and the driver
> still needs changes.
>
>
> Thanks,
> Miqu?l
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
Hi Naga,
Naga Sureshkumar Relli <[email protected]> wrote on Tue,
18 Jun 2019 22:44:24 -0600:
> On Mon, Jan 28, 2019 at 10:27:39AM +0100, Miquel Raynal wrote:
> Hi Miquel,
>
> > Hi Naga,
> >
> > Naga Sureshkumar Relli <[email protected]> wrote on Mon, 28 Jan 2019
> > 06:04:53 +0000:
> >
> > > Hi Boris & Miquel,
> > >
> > > Could you please provide your thoughts on this driver to support HW-ECC?
> > > As I said previously, there is no way to detect errors beyond N bit.
> > > I am ok to update the driver based on your inputs.
> >
> > We won't support the ECC engine. It simply cannot be used reliably.
> >
> > I am working on a generic ECC engine object. It's gonna take a few
> > months until it gets merged but after that you could update the
> > controller driver to drop any ECC-related function. Although the ECC
>
> Could you please let me know that, when can we expect generic ECC engine
> update in mtd NAND?
> Based on that, i will plan to update the ARASAN NAND driver along with your
> comments mentioned above under this update,
> as you know there is a limiation in ARASAN NAND controller to detect
> ECC errors.
> i am following this series https://patchwork.kernel.org/patch/10838705/
It is gonna take more time than expected. You can stick to the software
engines for now.
Thanks,
Miquèl
Hi Miquel,
> -----Original Message-----
> From: Miquel Raynal <[email protected]>
> Sent: Thursday, June 27, 2019 9:58 PM
> To: Naga Sureshkumar Relli <[email protected]>
> Cc: Naga Sureshkumar Relli <[email protected]>; [email protected]; martin.lund@keep-it-
> simple.com; [email protected]; [email protected]; Boris Brezillon
> <[email protected]>; [email protected]; [email protected];
> Michal Simek <[email protected]>; [email protected]; [email protected];
> [email protected]
> Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan NAND
> Flash Controller
>
> Hi Naga,
>
> Naga Sureshkumar Relli <[email protected]> wrote on Tue,
> 18 Jun 2019 22:44:24 -0600:
>
> > On Mon, Jan 28, 2019 at 10:27:39AM +0100, Miquel Raynal wrote:
> > Hi Miquel,
> >
> > > Hi Naga,
> > >
> > > Naga Sureshkumar Relli <[email protected]> wrote on Mon, 28 Jan
> > > 2019
> > > 06:04:53 +0000:
> > >
> > > > Hi Boris & Miquel,
> > > >
> > > > Could you please provide your thoughts on this driver to support HW-ECC?
> > > > As I said previously, there is no way to detect errors beyond N bit.
> > > > I am ok to update the driver based on your inputs.
> > >
> > > We won't support the ECC engine. It simply cannot be used reliably.
> > >
> > > I am working on a generic ECC engine object. It's gonna take a few
> > > months until it gets merged but after that you could update the
> > > controller driver to drop any ECC-related function. Although the ECC
> >
> > Could you please let me know that, when can we expect generic ECC
> > engine update in mtd NAND?
> > Based on that, i will plan to update the ARASAN NAND driver along with
> > your comments mentioned above under this update, as you know there is
> > a limiation in ARASAN NAND controller to detect ECC errors.
> > i am following this series
> > https://patchwork.kernel.org/patch/10838705/
>
> It is gonna take more time than expected. You can stick to the software engines for now.
Ok. I will update the driver accordingly.
Thanks,
Naga Sureshkumar Relli
>
> Thanks,
> Miquèl