Subject: Re: [PATCH] ubifs: Don't discard nodes in recovery when ecc err detected
To: Richard Weinberger
CC: Richard Weinberger, Sascha Hauer, "zhangyi (F)", LKML
References: <1582293853-136727-1-git-send-email-chengzhihao1@huawei.com>
From: Zhihao Cheng
Message-ID: <58b11ca2-6b91-52b3-bc75-d44abb202cfb@huawei.com>
Date: Mon, 2 Mar 2020 11:58:35 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/3/2 4:46, Richard Weinberger wrote:
> Zhihao Cheng,
>
> On Fri, Feb 21, 2020 at 2:57 PM Zhihao Cheng wrote:
>> The following process will lead TNC to find no corresponding inode node
>> (Reproduce method see Link):
> Please help me to understand what exactly is going on.
>
>> 1. Garbage collection.
>>    1) move valid inode nodes from leb A to leb B
>>       (The leb number of B has been written as GC type bud node in log)
>>    2) unmap leb A, and corresponding peb is erased
>>       (GCed inode nodes exist only on leb B)
> At this point all valid nodes are written to LEB B, right?

Yes.

>
>> 2. Poweroff. A node near the end of the LEB is corrupted before power
>>    on, which is uncorrectable error of ECC.
> If writing nodes to B has finished, these pages should be stable.
> How can a power-cut affect the pages where these valid nodes sit?

What I mean is that the uncorrectable ECC error is caused by the hardware,
not by the power cut, and it can leave corrupted nodes for UBIFS to detect.
I have seen uncorrectable ECC errors on my NAND in a high-temperature,
high-humidity environment.

At present, UBIFS ignores all EBADMSG errors, so a corrupted node is only
ever treated as the result of an unfinished write. I think UBIFS should also
consider corrupted areas caused by ECC errors in ubifs_recover_leb().
no_more_nodes() skips a read-write unit, so the corrupted area may simply be
skipped.
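
To make the concern concrete, here is a toy Python model. It is not the kernel code: the function names, the sizes, and the "skip to the end of the write unit, then expect erased space" heuristic are simplifying assumptions based only on the description above. It shows that damage in the last write unit of a fully written LEB looks the same as a real unfinished write, unless the ECC status of the read is taken into account.

```python
ERASED = 0xFF  # erased NAND reads back as all-0xFF bytes


def is_empty(buf: bytes) -> bool:
    """True if every byte is erased (an empty buffer also counts as empty)."""
    return all(b == ERASED for b in buf)


def align_up(value: int, unit: int) -> int:
    """Round value up to the next multiple of unit."""
    return (value + unit - 1) // unit * unit


def looks_like_unfinished_write(leb: bytes, corrupt_offs: int, write_unit: int) -> bool:
    """Hypothetical heuristic: skip the write unit that holds the corruption
    and report whether only erased space follows it."""
    skip_to = align_up(corrupt_offs + 1, write_unit)
    return is_empty(leb[skip_to:])


def classify(leb: bytes, corrupt_offs: int, write_unit: int, ecc_error: bool) -> str:
    """Hypothetical variant that also looks at the ECC status of the read
    (standing in for an EBADMSG return) before blaming a power cut."""
    if ecc_error:
        return "ecc-damage"
    if looks_like_unfinished_write(leb, corrupt_offs, write_unit):
        return "unfinished-write"
    return "corruption"


WRITE_UNIT = 8  # assumed sizes, kept tiny for readability
LEB_SIZE = 32

# Case 1: LEB completely written, then damage appears in the last write unit.
full = bytearray(b"\x11" * LEB_SIZE)
full[26:30] = b"\x00\xde\xad\x00"
tail_damage = looks_like_unfinished_write(bytes(full), 26, WRITE_UNIT)

# Case 2: same LEB, damage in the middle with valid data behind it.
mid_damage = looks_like_unfinished_write(bytes(full), 10, WRITE_UNIT)

# Case 3: genuine power cut, second half of the LEB is still erased.
cut = bytearray(b"\x11" * 16 + b"\xff" * 16)
cut[12:16] = b"\x00\xde\xad\x00"
power_cut = looks_like_unfinished_write(bytes(cut), 12, WRITE_UNIT)

print(tail_damage, mid_damage, power_cut)  # True False True
print(classify(bytes(full), 26, WRITE_UNIT, ecc_error=True))  # ecc-damage
```

Cases 1 and 3 both return True, so the heuristic alone would discard the node in case 1 as if its write had never finished. Passing the ECC status through, as classify() does in this sketch, is one possible way to keep the two cases apart.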