Date: Sat, 21 Nov 2015 08:46:04 +0100
From: Boris Brezillon
To: Brian Norris
Cc: "Bean Huo 霍斌斌 (beanhuo)", "linux-kernel@vger.kernel.org", "linux-mtd@lists.infradead.org"
Subject: Re: enhance ONFI table reliability/stable
Message-ID: <20151121084604.23fff6f9@bbrezillon>
In-Reply-To: <20151120235927.GD64635@google.com>
References: <20151118025010.GB140057@google.com> <20151120235927.GD64635@google.com>

On Fri, 20 Nov 2015 15:59:27 -0800
Brian Norris wrote:

> On Thu, Nov 19, 2015 at 04:21:01AM +0000, Bean Huo 霍斌斌 (beanhuo) wrote:
> > > On Tue, Jul 21, 2015 at 02:42:34PM +0000, Bean Huo 霍斌斌 (beanhuo)
> > > wrote:
> > > > Hi,
> > > >
> > > > Recently, I ran into some cases concerning ONFI table reliability, which
> > > > currently relies on a CRC. If there are bit flips in an ONFI parameter
> > > > page, a backup parameter page is used instead.
> > > > In the latest Linux, three copies are read by default.
> > > >
> > > > chip->cmdfunc(mtd, NAND_CMD_PARAM, 0, -1);
> > > > for (i = 0; i < 3; i++) {
> > > > 	for (j = 0; j < sizeof(*p); j++)
> > > > 		((uint8_t *)p)[j] = chip->read_byte(mtd);
> > > > 	if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) ==
> > > > 			le16_to_cpu(p->crc)) {
> > > > 		break;
> > > > 	}
> > > > }
> > > >
> > > > However, with technology improvements, for TLC and new-generation MLC, I
> > > > think three copies of
> > >
> > > Ha, "improvement" :)
> > >
> > > > the parameter table are not robust enough. My question is whether there
> > > > is a good method to protect and correct the parameter page. For example, we
> > > > could use the Linux software BCH ECC. Any suggestions and input are
> > > > welcome; if you have any concerns about this, feel free to tell me.
> > >
> > > I recall this being brought up at my old job, and all I can say is...
> > > (please pardon my censored language)
> > >
> >
> > Yes, you told me about this before. I am just following up.
> > Sorry for my rude follow-up.
> > I only want to share one suggestion: using software ECC to protect the
> > ONFI table that is read from NAND. I would like to hear every MTD expert's
> > valuable feedback on this. If it is OK, I can do it.
>
> Perhaps I'm misunderstanding you, but I don't understand how you could
> possibly "do it" if it is a circular dependency. You have nowhere to
> store ECC/parity data for a parameter page, because you can't actually
> read/write the NAND flash until after you know its geometry.

Well, while I agree with most of your answer (why the hell are NAND
vendors storing the ONFI parameter page and other sensitive information
in normal NAND pages, especially when we're talking about TLC/MLC
NANDs???), it's perfectly possible to have ECC in this case, as long as
the geometry is known in advance (at least this is true for BCH).

Say you have only 3 copies of the parameter page and the ECC bytes are
stored after that.
You can define the following layout:

|3 x parameter page size|3 x ECC bytes|

Of course this implies reserving the space after the 3 parameter pages
for the ECC bytes, which is not the case in the current ONFI spec
(you should have at least 3 copies, but you can have more).
And we would choose the ECC geometry with this logic:

ECC chunk size = sizeof(struct nand_onfi_params)
ECC strength = iteratively tested with different pre-defined values

This being said, I don't know how you would change the ONFI spec and
keep it compatible with the previous version. As I said, the current
version of the spec does not reserve any area after the mandatory
parameter pages...
You'll probably have to add a NAND_CMD_ALT_PARAM to support this kind
of thing.

>
> > > ...that is complete and utter bulls***. An ONFI standard that can't guarantee
> > > "reliable enough" parameter pages is no standard at all.
> > >
> > > To step back a bit: How would one expect to store and retrieve ECC parity
> > > data? ...on the NAND flash? But to do that, we have to know the geometry
> > > parameters of said NAND flash. How do we figure out the geometry? From the
> > > ONFI parameter pages! Nice Catch 22 you have there.
>
> I realize a non-native English speaker might not understand the "Catch
> 22" reference. Wikipedia has a nice summary:
>
> https://en.wikipedia.org/wiki/Catch-22_(logic)
>
> Essentially, it's a circular argument, or a contradiction. An
> impossibility.
>
> > > Please encourage your employer never to produce "ONFI-compliant" flash that
> > > are this bad.
>
> I still stand by the above statement.
>
> But now that I'm in a slightly more charitable mood, there are ways to
> improve our ability to recover from slightly corrupted parameter pages
> (ECC is not one of them).
>
> For one, you could do some kind of bit majority.
> e.g.:
>
> (1) try pages 1-3
> (2) if none pass the CRC check, then compute bit majority of all 3; if
>     the CRC of this combined page passes, then use it
> (3) ???

Should work too, but it's probably less reliable than BCH ECC (we only
have 3 copies :-/).

Best Regards,

Boris

--
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
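
For illustration only, steps (1) and (2) of the bit-majority fallback
quoted above could be sketched in plain userspace C roughly as follows.
This is not kernel code and makes several assumptions: the three
parameter-page copies have already been read back to back into one
buffer, each copy is 256 bytes with a little-endian CRC in its last two
bytes (consistent with the 254-byte CRC in the quoted loop), and
recover_param_page(), param_page_crc_ok() and bit_majority3() are
hypothetical names. onfi_crc16() here is a local reimplementation of a
bitwise CRC-16 with polynomial 0x8005, written to match the name used
in the quoted code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONFI_CRC_BASE	0x4F4E
#define PARAM_PAGE_LEN	256	/* assumed: 254 payload bytes + 2 CRC bytes */

/* Bitwise CRC-16, polynomial 0x8005, MSB first (local reimplementation). */
static uint16_t onfi_crc16(uint16_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= (uint16_t)(*p++ << 8);
		for (int i = 0; i < 8; i++)
			crc = (uint16_t)((crc << 1) ^ ((crc & 0x8000) ? 0x8005 : 0));
	}
	return crc;
}

/* Non-zero if the little-endian CRC stored in bytes 254-255 matches. */
static int param_page_crc_ok(const uint8_t *page)
{
	uint16_t stored = (uint16_t)(page[254] | (page[255] << 8));

	return onfi_crc16(ONFI_CRC_BASE, page, 254) == stored;
}

/* Each output bit is the value held by at least two of the three inputs. */
static void bit_majority3(const uint8_t *a, const uint8_t *b,
			  const uint8_t *c, uint8_t *out, size_t len)
{
	for (size_t i = 0; i < len; i++)
		out[i] = (a[i] & b[i]) | (a[i] & c[i]) | (b[i] & c[i]);
}

/*
 * Hypothetical helper: 'copies' holds three consecutive copies of
 * PARAM_PAGE_LEN bytes each. Returns 0 and fills 'out' on success,
 * -1 if neither a single copy nor the majority page passes the CRC.
 */
static int recover_param_page(const uint8_t *copies, uint8_t *out)
{
	assert(copies != NULL && out != NULL);

	/* (1) try each copy on its own */
	for (int i = 0; i < 3; i++) {
		const uint8_t *page = copies + (size_t)i * PARAM_PAGE_LEN;

		if (param_page_crc_ok(page)) {
			memcpy(out, page, PARAM_PAGE_LEN);
			return 0;
		}
	}

	/* (2) fall back to the bit majority of all three copies */
	bit_majority3(copies, copies + PARAM_PAGE_LEN,
		      copies + 2 * PARAM_PAGE_LEN, out, PARAM_PAGE_LEN);

	return param_page_crc_ok(out) ? 0 : -1;
}
```

The majority vote only helps when a given bit is corrupted in at most
one of the three copies. If the same bit is flipped in two copies, the
combined page is wrong as well and the final CRC check rejects it,
which is the limitation pointed out in the reply above.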