From: "liujian (CE)"
To: Tokunori Ikegami, "'Sobon, Przemyslaw'", 'Boris Brezillon'
CC: "keescook@chromium.org", "marek.vasut@gmail.com", "richard@nod.at",
    "linux-kernel@vger.kernel.org", "joakim.tjernlund@infinera.com",
    "linux-mtd@lists.infradead.org", "computersforpeace@gmail.com",
    "dwmw2@infradead.org", "ikegami_to@yahoo.co.jp"
Subject: RE: Re: [PATCH] cfi: fix deadloop in cfi_cmdset_0002.c do_write_buffer
Date: Thu, 14 Feb 2019 01:34:16 +0000
Message-ID: <4F88C5DDA1E80143B232E89585ACE27D026236F3@DGGEMM528-MBX.china.huawei.com>
In-Reply-To: <149101d4bfb9$fdc5a330$f950e990$@gmail.com>
References: <1548977439-318904-1-git-send-email-liujian56@huawei.com>
    <20190203092645.18d1495b@bbrezillon>
    <20190203093509.269bf1e1@bbrezillon>
    <20190207095635.0fc3b411@kernel.org>
    <193621849.44066.1549580387922.JavaMail.yahoo@mail.yahoo.co.jp>
    <632ed76bd3844ceab75066d1f30a7115@EX13D07UWA001.ant.amazon.com>
    <149101d4bfb9$fdc5a330$f950e990$@gmail.com>

Best Regards,
liujian

> -----Original Message-----
> From: Tokunori Ikegami [mailto:ikegami.t@gmail.com]
> Sent: Friday, February 08, 2019 10:24 PM
> To: 'Sobon, Przemyslaw'; 'Boris Brezillon'
> Cc: keescook@chromium.org; marek.vasut@gmail.com; richard@nod.at;
> linux-kernel@vger.kernel.org; joakim.tjernlund@infinera.com;
> linux-mtd@lists.infradead.org; computersforpeace@gmail.com;
> dwmw2@infradead.org; liujian (CE); ikegami_to@yahoo.co.jp
> Subject: RE: Re: [PATCH] cfi: fix deadloop in cfi_cmdset_0002.c
> do_write_buffer
>
> Hi Przemek-san,
>
> Thank you so much for your explanation.
>
> > I have seen a case myself where a value was written, chip changed
> > state to "ready" but when I was reading the value was incorrect.
>
> I also know the similar issues for the both buffer and word write.
> Both issues were able to reproduce the write error behavior.
> Note: The word write issue is able to reproduce now also.
>
> Those were resolved by using chip_good() instead to check the state.
>
> > This can happen as result of intermittent issue with flash. It is hard
> > to fall into scenario when testing on limited number of devices but
> > with large enough population you can see that.
>
> If possible I would like to know the issue detail and its cause also.
>
> > Another situation
> > is when a flash chip reaches its maximum number of writes. So for
> > example a chip is designed for 100k writes to a page. Once you reach
> > that number of writes you can have invalid data written to flash but
> > chip itself reports everything was good and switches to "ready" state.
>
> Yes I see.
>
> Regards,
> Ikegami
>
> > -----Original Message-----
> > From: linux-mtd [mailto:linux-mtd-bounces@lists.infradead.org] On
> > Behalf Of Sobon, Przemyslaw
> > Sent: Friday, February 8, 2019 8:51 AM
> > To: ikegami_to@yahoo.co.jp; Boris Brezillon
> > Cc: keescook@chromium.org; marek.vasut@gmail.com;
> > ikegami@allied-telesis.co.jp; richard@nod.at;
> > linux-kernel@vger.kernel.org; joakim.tjernlund@infinera.com;
> > linux-mtd@lists.infradead.org; computersforpeace@gmail.com;
> > dwmw2@infradead.org; Liu Jian
> > Subject: RE: Re: [PATCH] cfi: fix deadloop in cfi_cmdset_0002.c
> > do_write_buffer
> >
> > Hi Ikegami,
> >
> > I have seen a case myself where a value was written, chip changed
> > state to "ready" but when I was reading the value was incorrect.
> > This can happen as result of intermittent issue with flash. It is hard
> > to fall into scenario when testing on limited number of devices but
> > with large enough population you can see that. Another situation is
> > when a flash chip reaches its maximum number of writes. So for example
> > a chip is designed for 100k writes to a page. Once you reach that
> > number of writes you can have invalid data written to flash but chip
> > itself reports everything was good and switches to "ready" state.
> >
> > Hope this explanation is clear. Please let me know.
> >
> > Regards,
> > Przemek
> >
> > > -----Original Message-----
> > > From: ikegami_to@yahoo.co.jp
> > > Sent: Thursday, February 7, 2019 3:00 PM
> > >
> > > Hi Przemek-san,
> > >
> > > Could you please explain the case detail that the value is written
> > > incorrectly?
> > > I think that the value is only written correctly except a bug.
> > >
> > > Regards,
> > > Ikegami
> > >
> > > --- boris.brezillon@collabora.com wrote --- :
> > > > Hi Sobon,
> > > >
> > > > On Tue, 5 Feb 2019 22:28:44 +0000
> > > > "Sobon, Przemyslaw" wrote:
> > > >
> > > > > > From: Boris Brezillon
> > > > > > Sent: Sunday, February 3, 2019 12:35 AM
> > > > > > > +Przemyslaw
> > > > > > >
> > > > > > > On Fri, 1 Feb 2019 07:30:39 +0800 Liu Jian wrote:
> > > > > > >
> > > > > > > > In function do_write_buffer(), in the for loop, there is a case
> > > > > > > > chip_ready() returns 1 while chip_good() returns 0, so it
> > > > > > > > never break the loop.
> > > > > > > > To fix this, chip_good() is enough and it should timeout if it
> > > > > > > > stay bad for a while.
> > > > > > >
> > > > > > > Looks like Przemyslaw reported and fixed the same problem.
> > > > > > >
> > > > > > > >
> > > > > > > > Fixes: dfeae1073583(mtd: cfi_cmdset_0002: Change write
> > > > > > > > buffer to check correct value)
> > > > > > >
> > > > > > > Can you put the Fixes tag on a single, and the format is
> > > > > > >
> > > > > > > Fixes: ("message")
> > > > > > >
> > > > > > > > Signed-off-by: Yi Huaijie
> > > > > > > > Signed-off-by: Liu Jian
> > > > > > >
> > > > > > > [1]http://patchwork.ozlabs.org/patch/1025566/
> > > > > > >

So, do I need to send a v2 patch? Or use Przemyslaw's new patch
http://patchwork.ozlabs.org/patch/1038395/

> > > > > > > > ---
> > > > > > > >  drivers/mtd/chips/cfi_cmdset_0002.c | 6 +++---
> > > > > > > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c
> > > > > > > > index 72428b6..818e94b 100644
> > > > > > > > --- a/drivers/mtd/chips/cfi_cmdset_0002.c
> > > > > > > > +++ b/drivers/mtd/chips/cfi_cmdset_0002.c
> > > > > > > > @@ -1876,14 +1876,14 @@ static int __xipram do_write_buffer(struct map_info *map, struct flchip *chip,
> > > > > > > >  			continue;
> > > > > > > >  		}
> > > > > > > >
> > > > > > > > -		if (time_after(jiffies, timeo) && !chip_ready(map, adr))
> > > > > > > > -			break;
> > > > > > > > -
> > > > > > > >  		if (chip_good(map, adr, datum)) {
> > > > > > > >  			xip_enable(map, chip, adr);
> > > > > > > >  			goto op_done;
> > > > > > > >  		}
> > > > > > > >
> > > > > > > > +		if (time_after(jiffies, timeo))
> > > > > > > > +			break;
> > > > > > > > +
> > > > > > > >  		/* Latency issues. Drop the lock, wait a while and retry */
> > > > > > > >  		UDELAY(map, chip, adr, 1);
> > > > > > > >  	}
> > > > > > > >
> > > > > >
> > > > > > BTW, the patch itself looks good to me. Ikegami, can you
> > > > > > confirm it does the right thing?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Boris
> > > > > >
> > > > >
> > > > > One comment to this patch. If value is written incorrectly
> > > > > quickly we will be stuck in the loop even though nothing is going
> > > > > to change.
> > > > > For example a value was written incorrectly after 1us, the loop
> > > > > was set to 1ms, function will return after 1ms, this solution is
> > > > > not optimized for performance. I considered same when working on
> > > > > this change and decided to do it different way.
> > > >
> > > > Seems like you're right if we assume that checking for GOOD state
> > > > does not require a delay after the READY check, but if that's not
> > > > the case and an extra delay is actually required, you might end up
> > > > with a BAD status while it could have turned GOOD at some point
> > > > with the 'check only for GOOD state until we timeout' approach.
> > > >
> > > > TBH, I don't know how CFI flashes work, so I'll let you guys sort
> > > > this out.
> > > >
> > > > Regards,
> > > >
> > > > Boris
> > > >
> > > > ______________________________________________________
> > > > Linux MTD discussion mailing list
> > > > http://lists.infradead.org/mailman/listinfo/linux-mtd/