Received: by 2002:a25:683:0:0:0:0:0 with SMTP id 125csp256180ybg; Fri, 12 Jun 2020 00:11:43 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyEfUdV4DRPPzMW86/IZI0ORbKLDX6er05lBOjJVNZuCgEFd36YJ2eTFhE1fOL1gyDC0tTH X-Received: by 2002:a17:906:4b50:: with SMTP id j16mr12659608ejv.415.1591945903607; Fri, 12 Jun 2020 00:11:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1591945903; cv=none; d=google.com; s=arc-20160816; b=p7Pbb8O+dKZOoyU7CWOb7lUhML0923HwZe5hLOf16EhQcdy0HrJVAYiUQMbPTCQiIp JasV3ofrWB79b5hkLaDRPcTpHXwFp49T/gnFcfvLbDNA2M3QhAFYtRV1Z19qsA5nFnaZ 0WxQuuh+XEbPE6aAuMlaM6MPw2xObTOn37M3OpkdwWBb9ldOm0pW65IiTZQ8QDQ82Clw FtjB1WdQlBufvtNDnxw25v7Pf/7ooX7Glw+AQ4NkqZLIRnRzrAvrWZ5UF6bed7F94dRi kroZgQq+qjLh+jZduB/h6IVHGUMWRvtst/WvxW23dqgG5Dq6YNclL7WhL7h7O0bmkGmf hxMw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :organization:references:in-reply-to:message-id:subject:cc:to:from :date; bh=nqwHwRsCgyJuRqdekPcEukn73BHKs6BAG9lVcQ9nttU=; b=fhy6Nk9f7nbdXLb0UKqw86p6jtVZNPUZq/Wh8Rl0xW2HNc552UITOam9NUG60Dn6qd dhbveCQ+J9e9odSP4UStgnhVby8UkAQCSglI1lrh5Y4JHeynKLOUPpwvO1q2uxC0HPHG TqDq/5Sm1QnzORpwDHC0n0WPhlnXmHGbOMvWcjxtcJ7K9ImM4TyxJwQwV46gyMhCir8x JgDto9cF4SZgM8fICAkJxiKQlOFqclYT4ji/tvVCZBym9LecaPVWGnHivFz4evk3I8Bi HuZ4384c4LbZ2QVeDuyOzdT5/q2qagQ/o4LxLds3yZp/K6GBQEl+bEQRZa5elfDjdRgs vj9w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id e20si3509282ejx.262.2020.06.12.00.11.21; Fri, 12 Jun 2020 00:11:43 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726447AbgFLHHe convert rfc822-to-8bit (ORCPT + 99 others); Fri, 12 Jun 2020 03:07:34 -0400 Received: from relay1-d.mail.gandi.net ([217.70.183.193]:10175 "EHLO relay1-d.mail.gandi.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726332AbgFLHHd (ORCPT ); Fri, 12 Jun 2020 03:07:33 -0400 X-Originating-IP: 91.224.148.103 Received: from xps13 (unknown [91.224.148.103]) (Authenticated sender: miquel.raynal@bootlin.com) by relay1-d.mail.gandi.net (Postfix) with ESMTPSA id E578E240006; Fri, 12 Jun 2020 07:07:29 +0000 (UTC) Date: Fri, 12 Jun 2020 09:07:28 +0200 From: Miquel Raynal To: Kamal Dasu Cc: Brian Norris , Richard Weinberger , Vignesh Raghavendra , MTD Maling List , bcm-kernel-feedback-list , Linux Kernel Mailing List Subject: Re: [PATCH 2/2] mtd: rawnand: brcmnand: Ecc error handling on EDU transfers Message-ID: <20200612090728.043b6baf@xps13> In-Reply-To: References: <20200611054454.2547-1-kdasu.kdev@gmail.com> <20200611054454.2547-2-kdasu.kdev@gmail.com> <20200611092707.75da8c6a@xps13> Organization: Bootlin X-Mailer: Claws Mail 3.17.4 (GTK+ 2.24.32; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Kamal, Kamal Dasu wrote on Thu, 11 Jun 2020 12:04:29 -0400: > On Thu, Jun 11, 2020 at 3:27 AM Miquel Raynal wrote: > > > > Hi Kamal, > > > > Kamal Dasu wrote on Thu, 11 Jun 2020 01:44:54 > > -0400: > > > > > Implemented ECC correctable and uncorrectable error handling for EDU > > > > Implement? > > > > > reads. If ECC correctable bitflips are encountered on EDU transfer, > > > > extra space ^ > > > > > read page again using pio, This is needed due to a controller lmitation > > > > s/pio/PIO/ > > > > > where read and corrected data is not transferred to the DMA buffer on ECC > > > errors. This holds true for ECC correctable errors beyond set threshold. > > > > error. > > > > Not sure what the last sentence means? > > > > NAND controller allows for setting a correctable ECC threshold number > of bits beyond which it will actually report the error to the driver. > e.g. for BCH-4 the threshold is 3, so 3-bit and 4-bit errors will > generate correctable ECC interrupt however 1-bit and 2-bit errors will > be corrected silently. > From the above example EDU hardware will not transfer corrected data > to the DMA buffer for 3-bit and 4-bit errors that get reported. So > once we detect > the error duing EDU we read the page again using pio. Ok I see what you mean, can't you fake the threshold instead? The NAND controller in Linux is not supposed to handle this threshold, the NAND core is in charge. So what the controller driver should do is just: increase the number of bitflips + return the maximum number or bitflip or increase the failure counter. Is this already the case? > > > > > > > Fixes: a5d53ad26a8b ("mtd: rawnand: brcmnand: Add support for flash-edu for dma transfers") > > > Signed-off-by: Kamal Dasu > > > --- > > > > Minor nits below :) > > > > > drivers/mtd/nand/raw/brcmnand/brcmnand.c | 26 ++++++++++++++++++++++++ > > > 1 file changed, 26 insertions(+) > > > > > > diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > > > index 0c1d6e543586..d7daa83c8a58 100644 > > > --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c > > > +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > > > @@ -1855,6 +1855,22 @@ static int brcmnand_edu_trans(struct brcmnand_host *host, u64 addr, u32 *buf, > > > edu_writel(ctrl, EDU_STOP, 0); /* force stop */ > > > edu_readl(ctrl, EDU_STOP); > > > > > > + if (ret == 0 && edu_cmd == EDU_CMD_READ) { > > > > !ret > > > > > + u64 err_addr = 0; > > > + > > > + /* > > > + * check for ecc errors here, subpage ecc erros are > > > + * retained in ecc error addr register > > > > s/ecc/ECC/ > > s/erros/errors/ > > s/addr/address/ > > > > > + */ > > > + err_addr = brcmnand_get_uncorrecc_addr(ctrl); > > > + if (!err_addr) { > > > + err_addr = brcmnand_get_correcc_addr(ctrl); > > > + if (err_addr) > > > + ret = -EUCLEAN; > > > + } else > > > + ret = -EBADMSG; > > > > I don't like very much to see these values being used within NAND > > controller drivers but I see it's already the cause, so I guess I can s/cause/case/ > > live with that. > > > > > + } > > > + > > > return ret; > > > } > > > > > > @@ -2058,6 +2074,7 @@ static int brcmnand_read(struct mtd_info *mtd, struct nand_chip *chip, > > > u64 err_addr = 0; > > > int err; > > > bool retry = true; > > > + bool edu_read = false; > > > > > > dev_dbg(ctrl->dev, "read %llx -> %p\n", (unsigned long long)addr, buf); > > > > > > @@ -2075,6 +2092,10 @@ static int brcmnand_read(struct mtd_info *mtd, struct nand_chip *chip, > > > else > > > return -EIO; > > > } > > > + > > > + if (has_edu(ctrl)) > > > + edu_read = true; > > > > You don't need this extra value, you already have the cmd parameter > > which tells you if it is a read or a write. You might even want to > > create a if block so set dir and edu_cmd and eventually a local > > edu_read if you think it still makes sense. > > > > I needed the value since dma and edu read has multiple conditions like > oob is not included, buffer is aligned, virtual address is good. This > indicates to > the if (mtd_is_bitflip(err)) block that the error was from an edu > transaction that happened.This way all ecc error handling for dma, > edu, pio is in one place. > Also there is more controller version specific logic for read error > handling in there and this allows us to maintain the hierarchy how we > handle both correctable > and uncorrectable error. Fair enough. > > > > + > > > } else { > > > if (oob) > > > memset(oob, 0x99, mtd->oobsize); > > > @@ -2122,6 +2143,11 @@ static int brcmnand_read(struct mtd_info *mtd, struct nand_chip *chip, > > > if (mtd_is_bitflip(err)) { > > > unsigned int corrected = brcmnand_count_corrected(ctrl); > > > > > > + /* in case of edu correctable error we read again using pio */ > > > > s/edu/EDU/ ? > > s/pio/PIO/ > > > > > + if (edu_read) > > > + err = brcmnand_read_by_pio(mtd, chip, addr, trans, buf, > > > + oob, &err_addr); > > > + > > > dev_dbg(ctrl->dev, "corrected error at 0x%llx\n", > > > (unsigned long long)err_addr); > > > mtd->ecc_stats.corrected += corrected; > > > > Will fix all the other typos. > > > Thanks, > > Miquèl > > Thanks > Kamal Thanks, Miquèl