As of edbc454 [mtd: driver _read() returns max_bitflips; mtd_read()
returns -EUCLEAN], 'mtd->bitflip_threshold' must be set for mtd devices
having ECC, prior any 'mtd_read()' call.
Otherwise, 'mtd_read()' will falsely return -EUCLEAN.
Normally, 'mtd->bitflip_threshold' is initialized when the MTD is added.
However, this is too late for NAND MTDs, as 'scan_bbt()' is invoked
prior the existing initialization of 'mtd->bitflip_threshold'.
This is a problem since 'scan_bbt()' calls 'mtd_read()', in the case
of a flash-based bad block table.
It resulted in a falsely reported bitflips indication during BBT read,
which lead to constant scrubbing of the flash BBT blocks.
Initialize 'mtd->bitflip_threshold' to its default value (if not already
set by the driver), prior invocation of 'scan_bbt()'.
Reported-by: Sascha Hauer <[email protected]>
Signed-off-by: Shmulik Ladkani <[email protected]>
---
- The issue was introduced in 3.5-rc1 and needs to be merged before 3.5
- Sascha, I've credited you as the reporter, is that ok?
- Sascha, care to test and verify it works for your system?
diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 61805e7..0a8724e 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -3484,6 +3484,13 @@ int nand_scan_tail(struct mtd_info *mtd)
/* propagate ecc info to mtd_info */
mtd->ecclayout = chip->ecc.layout;
mtd->ecc_strength = chip->ecc.strength;
+ /*
+ * Initialize bitflip_threshold to its default prior scan_bbt() call.
+ * scan_bbt() might invoke mtd_read(), thus bitflip_threshold must be
+ * properly set.
+ */
+ if (!mtd->bitflip_threshold)
+ mtd->bitflip_threshold = mtd->ecc_strength;
/* Check, if we should skip the bad block table scan */
if (chip->options & NAND_SKIP_BBTSCAN)
--
1.7.5.4
On Fri, 2012-06-08 at 18:29 +0300, Shmulik Ladkani wrote:
> As of edbc454 [mtd: driver _read() returns max_bitflips; mtd_read()
> returns -EUCLEAN], 'mtd->bitflip_threshold' must be set for mtd devices
> having ECC, prior any 'mtd_read()' call.
> Otherwise, 'mtd_read()' will falsely return -EUCLEAN.
Pushed to l2-mtd.git, thanks! Sascha - does it fix the issue for you?
--
Best Regards,
Artem Bityutskiy
On Fri, Jun 08, 2012 at 06:29:06PM +0300, Shmulik Ladkani wrote:
> As of edbc454 [mtd: driver _read() returns max_bitflips; mtd_read()
> returns -EUCLEAN], 'mtd->bitflip_threshold' must be set for mtd devices
> having ECC, prior any 'mtd_read()' call.
> Otherwise, 'mtd_read()' will falsely return -EUCLEAN.
>
> Normally, 'mtd->bitflip_threshold' is initialized when the MTD is added.
>
> However, this is too late for NAND MTDs, as 'scan_bbt()' is invoked
> prior the existing initialization of 'mtd->bitflip_threshold'.
>
> This is a problem since 'scan_bbt()' calls 'mtd_read()', in the case
> of a flash-based bad block table.
> It resulted in a falsely reported bitflips indication during BBT read,
> which lead to constant scrubbing of the flash BBT blocks.
>
> Initialize 'mtd->bitflip_threshold' to its default value (if not already
> set by the driver), prior invocation of 'scan_bbt()'.
>
> Reported-by: Sascha Hauer <[email protected]>
> Signed-off-by: Shmulik Ladkani <[email protected]>
> ---
> - The issue was introduced in 3.5-rc1 and needs to be merged before 3.5
> - Sascha, I've credited you as the reporter, is that ok?
Sure, that's ok.
> - Sascha, care to test and verify it works for your system?
Yes, this puts my system back to work
Tested-by: Sascha Hauer <[email protected]>
>
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index 61805e7..0a8724e 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -3484,6 +3484,13 @@ int nand_scan_tail(struct mtd_info *mtd)
> /* propagate ecc info to mtd_info */
> mtd->ecclayout = chip->ecc.layout;
> mtd->ecc_strength = chip->ecc.strength;
> + /*
> + * Initialize bitflip_threshold to its default prior scan_bbt() call.
> + * scan_bbt() might invoke mtd_read(), thus bitflip_threshold must be
> + * properly set.
> + */
> + if (!mtd->bitflip_threshold)
> + mtd->bitflip_threshold = mtd->ecc_strength;
>
> /* Check, if we should skip the bad block table scan */
> if (chip->options & NAND_SKIP_BBTSCAN)
> --
> 1.7.5.4
>
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
On Sat, 2012-06-09 at 11:43 +0200, Sascha Hauer wrote:
>
> > - The issue was introduced in 3.5-rc1 and needs to be merged before
> 3.5
> > - Sascha, I've credited you as the reporter, is that ok?
>
> Sure, that's ok.
>
> > - Sascha, care to test and verify it works for your system?
>
> Yes, this puts my system back to work
>
> Tested-by: Sascha Hauer <[email protected]>
Thanks. I've pushed this to linux-mtd.git and will ask Linus to pull it
shortly.
--
dwmw2
On 06/08/2012 08:29 AM, Shmulik Ladkani wrote:
> As of edbc454 [mtd: driver _read() returns max_bitflips; mtd_read()
> returns -EUCLEAN], 'mtd->bitflip_threshold' must be set for mtd devices
> having ECC, prior any 'mtd_read()' call.
> Otherwise, 'mtd_read()' will falsely return -EUCLEAN.
>
> Normally, 'mtd->bitflip_threshold' is initialized when the MTD is added.
>
> However, this is too late for NAND MTDs, as 'scan_bbt()' is invoked
> prior the existing initialization of 'mtd->bitflip_threshold'.
>
> This is a problem since 'scan_bbt()' calls 'mtd_read()', in the case
> of a flash-based bad block table.
> It resulted in a falsely reported bitflips indication during BBT read,
> which lead to constant scrubbing of the flash BBT blocks.
>
> Initialize 'mtd->bitflip_threshold' to its default value (if not already
> set by the driver), prior invocation of 'scan_bbt()'.
>
Reviewed-by: Mike Dunn <[email protected]>
(FWIW).
Thanks again Shmulik!
I do think that the patch to the bad block management code that I sent yesterday
should be merged as well. That would also fix the problem, but it is really a
separate issue. Currently ecc errors are ignored during the BBT creation for
some configurations but not for others (like that of the mxc_nand controller).
It looks like an oversight that they are not ignored, and it makes sense to
ignore them, since there's not much you can do about them at that time, and the
functions higher up the call stack don't try to.
Thanks,
Mike