Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755066AbbLAKlq (ORCPT ); Tue, 1 Dec 2015 05:41:46 -0500 Received: from arrakis.dune.hu ([78.24.191.176]:42360 "EHLO arrakis.dune.hu" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751140AbbLAKlp (ORCPT ); Tue, 1 Dec 2015 05:41:45 -0500 MIME-Version: 1.0 In-Reply-To: <5659FF4A.7080203@simon.arlott.org.uk> References: <5659FF4A.7080203@simon.arlott.org.uk> From: Jonas Gorski Date: Tue, 1 Dec 2015 11:41:00 +0100 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH] mtd: brcmnand: Workaround false ECC uncorrectable errors To: Simon Arlott Cc: Brian Norris , Kamal Dasu , David Woodhouse , MTD Maling List , bcm-kernel-feedback-list , Linux Kernel Mailing List , Florian Fainelli Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4938 Lines: 126 Hi, On Sat, Nov 28, 2015 at 8:23 PM, Simon Arlott wrote: > Workaround false ECC uncorrectable errors by checking if the data > has been erased and the OOB data indicates that the data has been > erased. The v4.0 controller on the BCM63168 incorrectly handles > these as uncorrectable errors. > > I don't know which version of the controller handles this scenario > correctly. I'm only able to test this in Hamming mode, so the BCH > version needs to be checked. > > This code is quite complicated because the fact that either the data > buffer or the OOB data may not have been read from the controller, and > both of them are required to determine if the error is incorrect. > > Signed-off-by: Simon Arlott > --- > drivers/mtd/nand/brcmnand/brcmnand.c | 107 ++++++++++++++++++++++++++++++++++- > 1 file changed, 106 insertions(+), 1 deletion(-) > > diff --git a/drivers/mtd/nand/brcmnand/brcmnand.c b/drivers/mtd/nand/brcmnand/brcmnand.c > index 5f26b8a..0857af7 100644 > --- a/drivers/mtd/nand/brcmnand/brcmnand.c > +++ b/drivers/mtd/nand/brcmnand/brcmnand.c > @@ -1387,6 +1387,102 @@ static int brcmnand_dma_trans(struct brcmnand_host *host, u64 addr, u32 *buf, > } > > /* > + * Workaround false ECC uncorrectable errors by checking if the data > + * has been erased and the OOB data indicates that the data has been > + * erased. The controller incorrectly handles these as uncorrectable > + * errors. > + */ > +static void brcmnand_read_ecc_unc_err(struct mtd_info *mtd, > + struct nand_chip *chip, > + int idx, unsigned int trans, > + u32 *buf, u8 *oob_begin, u8 *oob_end, > + u64 *err_addr) > +{ > + struct brcmnand_host *host = chip->priv; > + struct brcmnand_controller *ctrl = host->ctrl; > + u32 *buf_tmp = NULL; > + u8 *oob_tmp = NULL; > + bool buf_erased = false; > + bool oob_erased = false; > + int j; > + > + /* Assume this is fixed in v5.0+ */ > + if (ctrl->nand_version >= 0x0500) > + return; > + > + /* Read OOB data if not already read */ > + if (!oob_begin) { > + oob_tmp = kmalloc(ctrl->max_oob, GFP_KERNEL); > + if (!oob_tmp) > + goto out_free; > + > + oob_begin = oob_tmp; > + oob_end = oob_tmp; > + > + oob_end += read_oob_from_regs(ctrl, idx, oob_tmp, > + mtd->oobsize / trans, > + host->hwcfg.sector_size_1k); > + } > + > + if (is_hamming_ecc(&host->hwcfg)) { > + u8 *oob_offset = oob_begin + 6; > + > + if (oob_offset + 3 < oob_end) > + /* Erased if ECC bytes are all 0xFF, or the data bytes > + * are all 0xFF which should have Hamming codes of 0x00 > + */ > + oob_erased = memchr_inv(oob_offset, 0xFF, 3) == NULL || > + memchr_inv(oob_offset, 0x00, 3) == NULL; > + } else { /* BCH */ > + u8 *oob_offset = oob_end - ctrl->max_oob; > + > + if (oob_offset >= oob_begin) > + /* Erased if ECC bytes are all 0xFF */ > + oob_erased = memchr_inv(oob_offset, 0xFF, > + oob_end - oob_offset) == NULL; > + } > + > + if (!oob_erased) > + goto out_free; > + > + /* Read data buffer if not already read */ > + if (!buf) { > + buf_tmp = kmalloc_array(FC_WORDS, sizeof(*buf_tmp), GFP_KERNEL); > + if (!buf_tmp) > + goto out_free; > + > + buf = buf_tmp; > + > + brcmnand_soc_data_bus_prepare(ctrl->soc); > + > + for (j = 0; j < FC_WORDS; j++, buf++) > + *buf = brcmnand_read_fc(ctrl, j); > + > + brcmnand_soc_data_bus_unprepare(ctrl->soc); > + } > + > + /* Go to start of buffer */ > + buf -= FC_WORDS; > + > + /* Erased if all data bytes are 0xFF */ > + buf_erased = memchr_inv(buf, 0xFF, FC_WORDS) == NULL; > + > + if (!buf_erased) > + goto out_free; We now have a function exactly for that use case in 4.4, nand_check_erased_buf [1], consider using that. This also has the benefit of treating bit flips as correctable as long as the ECC scheme is strong enough. Jonas [1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/mtd/nand/nand_base.c#n1110 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/