2020-02-21 13:58:45

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH] ubifs: Don't discard nodes in recovery when ecc err detected

The following process will lead TNC to find no corresponding inode node
(Reproduce method see Link):

1. Garbage collection.
1) move valid inode nodes from leb A to leb B
(The leb number of B has been written as GC type bud node in log)
2) unmap leb A, and corresponding peb is erased
(GCed inode nodes exist only on leb B)
2. Poweroff. A node near the end of the LEB is corrupted before power
on, which is uncorrectable error of ECC.
3. Replay and return success.
1) replay_buds -> bud for leb B is the last GC type record in log
2) ubifs_recover_leb:
For SCANNED_A_CORRUPT_NODE error caused by node_B, UBIFS will
discard node_A ... node_C, such nodes are in the same max_io unit.

corrupt

node_A node_B ... node_C
XXXXXXXXXXXXXXXXXXXXXXXXX XXXXXX... ← Replay GC LEB
|← max_write_size →|
4. Finding a missing inode node triggers an error, as follows:
[ 2276.992557] UBIFS error (ubi0:0 pid 2509): ubifs_read_node
[ubifs]: bad node type (255 but expected 2)
[ 2276.996814] UBIFS error (ubi0:0 pid 2509): ubifs_read_node
[ubifs]: bad node at LEB 15:73728, LEB mapping status 0
[ 2276.998439] Not a node, first 24 bytes:
[ 2276.998442] 00000000: ff ff ff ff ff ff ff ff ff ff ff ff

Fix this by returning fail when scan bad data with ecc error detected in
ubifs_recover_leb(). This reduces the fault tolerance of the file system.
In the case of pad node ecc error at the end of LEB without affecting the
file system data, it also returns fail, but the probability is very low.

Signed-off-by: Zhihao Cheng <[email protected]>
Cc: <[email protected]>
Fixes: 1e51764a3c2ac05a23a22b2a95 ("UBIFS: add new flash file system")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206625
---
fs/ubifs/recovery.c | 41 +++++++++++++++++++++++++++++++++++------
1 file changed, 35 insertions(+), 6 deletions(-)

diff --git a/fs/ubifs/recovery.c b/fs/ubifs/recovery.c
index f116f7b3f9e5..e60f23ea4575 100644
--- a/fs/ubifs/recovery.c
+++ b/fs/ubifs/recovery.c
@@ -624,14 +624,28 @@ struct ubifs_scan_leb *ubifs_recover_leb(struct ubifs_info *c, int lnum,
{
int ret = 0, err, len = c->leb_size - offs, start = offs, min_io_unit;
int grouped = jhead == -1 ? 0 : c->jheads[jhead].grouped;
+ int ecc_err = 0;
struct ubifs_scan_leb *sleb;
void *buf = sbuf + offs;

dbg_rcvry("%d:%d, jhead %d, grouped %d", lnum, offs, jhead, grouped);

- sleb = ubifs_start_scan(c, lnum, offs, sbuf);
- if (IS_ERR(sleb))
- return sleb;
+ sleb = kzalloc(sizeof(struct ubifs_scan_leb), GFP_NOFS);
+ if (!sleb)
+ return ERR_PTR(-ENOMEM);
+
+ sleb->lnum = lnum;
+ INIT_LIST_HEAD(&sleb->nodes);
+ sleb->buf = sbuf;
+
+ err = ubifs_leb_read(c, lnum, sbuf + offs, offs, c->leb_size - offs, 0);
+ if (err && err != -EBADMSG) {
+ ubifs_err(c, "cannot read %d bytes from LEB %d:%d, error %d",
+ c->leb_size - offs, lnum, offs, err);
+ kfree(sleb);
+ return ERR_PTR(err);
+ } else if (err == -EBADMSG)
+ ecc_err = 1;

ubifs_assert(c, len >= 8);
while (len >= 8) {
@@ -677,10 +691,24 @@ struct ubifs_scan_leb *ubifs_recover_leb(struct ubifs_info *c, int lnum,
}

if (ret == SCANNED_GARBAGE || ret == SCANNED_A_BAD_PAD_NODE) {
- if (!is_last_write(c, buf, offs))
+ /*
+ * If the garbage area or bad pad node is caused by ecc
+ * error, skipping valid nodes in the aligned max write
+ * size unit will lead tnc to mismatch node (inode,
+ * data, etc.). But if the ecc error infects only a pad
+ * node, which doesn't corrupt data nodes on UBIFS, still
+ * failed. We choose robustness here at the cost of
+ * giving up tolerating some small probability mistakes.
+ */
+ if (!is_last_write(c, buf, offs) || ecc_err)
goto corrupted_rescan;
} else if (ret == SCANNED_A_CORRUPT_NODE) {
- if (!no_more_nodes(c, buf, len, lnum, offs))
+ /*
+ * If the corrupt node is caused by ecc error, skipping
+ * valid nodes in the aligned max write size unit will
+ * lead tnc to mismatch node (inode, data, etc.).
+ */
+ if (!no_more_nodes(c, buf, len, lnum, offs) || ecc_err)
goto corrupted_rescan;
} else if (!is_empty(buf, len)) {
if (!is_last_write(c, buf, offs)) {
@@ -782,7 +810,8 @@ struct ubifs_scan_leb *ubifs_recover_leb(struct ubifs_info *c, int lnum,
ubifs_scanned_corruption(c, lnum, offs, buf);
err = -EUCLEAN;
error:
- ubifs_err(c, "LEB %d scanning failed", lnum);
+ ubifs_err(c, "LEB %d scanning failed%s", lnum,
+ ecc_err ? ", ecc error detected!" : "");
ubifs_scan_destroy(sleb);
return ERR_PTR(err);
}
--
2.7.4


2020-03-01 20:48:04

by Richard Weinberger

[permalink] [raw]
Subject: Re: [PATCH] ubifs: Don't discard nodes in recovery when ecc err detected

Zhihao Cheng,

On Fri, Feb 21, 2020 at 2:57 PM Zhihao Cheng <[email protected]> wrote:
>
> The following process will lead TNC to find no corresponding inode node
> (Reproduce method see Link):

Please help me to understand what exactly is going on.

> 1. Garbage collection.
> 1) move valid inode nodes from leb A to leb B
> (The leb number of B has been written as GC type bud node in log)
> 2) unmap leb A, and corresponding peb is erased
> (GCed inode nodes exist only on leb B)

At this point all valid nodes are written to LEB B, right?

> 2. Poweroff. A node near the end of the LEB is corrupted before power
> on, which is uncorrectable error of ECC.

If writing nodes to B has finished, these pages should be stable.
How can a power-cut affect the pages where these valid nodes sit?

--
Thanks,
//richard

2020-03-02 03:59:06

by Zhihao Cheng

[permalink] [raw]
Subject: Re: [PATCH] ubifs: Don't discard nodes in recovery when ecc err detected

在 2020/3/2 4:46, Richard Weinberger 写道:
> Zhihao Cheng,
>
> On Fri, Feb 21, 2020 at 2:57 PM Zhihao Cheng <[email protected]> wrote:
>> The following process will lead TNC to find no corresponding inode node
>> (Reproduce method see Link):
> Please help me to understand what exactly is going on.
>
>> 1. Garbage collection.
>> 1) move valid inode nodes from leb A to leb B
>> (The leb number of B has been written as GC type bud node in log)
>> 2) unmap leb A, and corresponding peb is erased
>> (GCed inode nodes exist only on leb B)
> At this point all valid nodes are written to LEB B, right?
Yes.
>
>> 2. Poweroff. A node near the end of the LEB is corrupted before power
>> on, which is uncorrectable error of ECC.
> If writing nodes to B has finished, these pages should be stable.
> How can a power-cut affect the pages where these valid nodes sit?
I mean, the uncorrectable ECC error is caused by hardware which may lead
to corrupted nodes detected in UBIFS. I found uncorretable ECC errors on
my NAND, in the environment of high temperature and humidity.

At present, UBIFS ignores all EBADMSG errors, so the corrupted node is
only considered in being caused by unfinished writing. I think UBIFS
should consider the corrupted area caused by ECC errors in process
ubifs_recover_leb(). no_more_nodes() will skip a read-write unit. Maybe
the corrupted area is skipped.



2020-03-02 21:16:18

by Richard Weinberger

[permalink] [raw]
Subject: Re: [PATCH] ubifs: Don't discard nodes in recovery when ecc err detected

On Mon, Mar 2, 2020 at 4:58 AM Zhihao Cheng <[email protected]> wrote:
> I mean, the uncorrectable ECC error is caused by hardware which may lead
> to corrupted nodes detected in UBIFS. I found uncorretable ECC errors on
> my NAND, in the environment of high temperature and humidity.
>
> At present, UBIFS ignores all EBADMSG errors, so the corrupted node is
> only considered in being caused by unfinished writing. I think UBIFS
> should consider the corrupted area caused by ECC errors in process
> ubifs_recover_leb(). no_more_nodes() will skip a read-write unit. Maybe
> the corrupted area is skipped.

Well, if your NAND data is corrupted by your environment UBIFS cannot
do much. Sure, we can paper over some places but at the end of the day
you will always lose.

What if the UBI VID header becomes unreadable or the root node of the
index tree?

--
Thanks,
//richard

2020-03-03 06:13:35

by Zhihao Cheng

[permalink] [raw]
Subject: Re: [PATCH] ubifs: Don't discard nodes in recovery when ecc err detected

在 2020/3/3 5:14, Richard Weinberger 写道:
> On Mon, Mar 2, 2020 at 4:58 AM Zhihao Cheng <[email protected]> wrote:
>> I mean, the uncorrectable ECC error is caused by hardware which may lead
>> to corrupted nodes detected in UBIFS. I found uncorretable ECC errors on
>> my NAND, in the environment of high temperature and humidity.
>>
>> At present, UBIFS ignores all EBADMSG errors, so the corrupted node is
>> only considered in being caused by unfinished writing. I think UBIFS
>> should consider the corrupted area caused by ECC errors in process
>> ubifs_recover_leb(). no_more_nodes() will skip a read-write unit. Maybe
>> the corrupted area is skipped.
> Well, if your NAND data is corrupted by your environment UBIFS cannot
> do much. Sure, we can paper over some places but at the end of the day
> you will always lose.
>
> What if the UBI VID header becomes unreadable or the root node of the
> index tree?
>
Agree, thanks for reminding.