2023-12-28 01:38:17

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 00/17] ubifs: Add filesystem repair support

UBIFS repair provides a way to fix inconsistent UBIFS image(which is
corrupted by hardware exceptions or UBIFS realization bugs) and makes
filesystem become consistent, just like fsck tools(eg. fsck.ext4,
fsck.f2fs, fsck.fat, etc.) do.

About why do we need it, how it works, what it can fix or it can not
fix, when and how to use it, see more details in
Documentation/filesystems/ubifs/repair.rst (Patch 17).

Testing on UBIFS repair refers to
https://bugzilla.kernel.org/show_bug.cgi?id=218327

Whatever, we finally have a way to fix inconsistent UBFIS image instead
of formatting UBI when UBIFS becomes inconsistent.

Zhihao Cheng (17):
ubifs: repair: Load filesystem info from volume
ubifs: repair: Scan nodes from volume
ubifs: repair: Remove deleted nodes from valid node tree
ubifs: repair: Add valid nodes into file
ubifs: repair: Filter invalid files
ubifs: repair: Extract reachable directory entries tree
ubifs: repair: Check and correct files' information
ubifs: repair: Record used LEBs
ubifs: repair: Re-write data
ubifs: repair: Create new root dir if there are no scanned files
ubifs: repair: Build TNC
ubifs: Extract a helper function to create lpt
ubifs: repair: Build LPT
ubifs: repair: Clean up log and orphan area
ubifs: repair: Write master node
ubifs: Enable ubifs_repair in '/sys/kernel/debug/ubifs/repair_fs'
Documentation: ubifs: Add ubifs repair whitepaper

Documentation/filesystems/index.rst | 3 +-
.../authentication.rst} | 0
Documentation/filesystems/ubifs/index.rst | 11 +
.../filesystems/{ubifs.rst => ubifs/main.rst} | 0
Documentation/filesystems/ubifs/repair.rst | 235 ++
MAINTAINERS | 5 +-
fs/ubifs/Makefile | 2 +-
fs/ubifs/debug.c | 57 +-
fs/ubifs/debug.h | 2 +
fs/ubifs/journal.c | 39 +-
fs/ubifs/lpt.c | 140 +-
fs/ubifs/repair.c | 2651 +++++++++++++++++
fs/ubifs/repair.h | 176 ++
fs/ubifs/sb.c | 24 +-
fs/ubifs/super.c | 10 +-
fs/ubifs/ubifs.h | 113 +-
16 files changed, 3315 insertions(+), 153 deletions(-)
rename Documentation/filesystems/{ubifs-authentication.rst => ubifs/authentication.rst} (100%)
create mode 100644 Documentation/filesystems/ubifs/index.rst
rename Documentation/filesystems/{ubifs.rst => ubifs/main.rst} (100%)
create mode 100644 Documentation/filesystems/ubifs/repair.rst
create mode 100644 fs/ubifs/repair.c
create mode 100644 fs/ubifs/repair.h

--
2.31.1



2023-12-28 01:38:26

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 01/17] ubifs: repair: Load filesystem info from volume

This is the 1/13 step of repairing. Open UBI volume and read UBIFS super
block, which is similar to UBIFS mounting process. If read superblock
failed, repair is failure.

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/Makefile | 2 +-
fs/ubifs/repair.c | 109 ++++++++++++++++++++++++++++++++++++++++++++++
fs/ubifs/super.c | 10 ++---
fs/ubifs/ubifs.h | 5 +++
4 files changed, 120 insertions(+), 6 deletions(-)
create mode 100644 fs/ubifs/repair.c

diff --git a/fs/ubifs/Makefile b/fs/ubifs/Makefile
index 314c80b24a76..88a11f85e150 100644
--- a/fs/ubifs/Makefile
+++ b/fs/ubifs/Makefile
@@ -5,7 +5,7 @@ ubifs-y += shrinker.o journal.o file.o dir.o super.o sb.o io.o
ubifs-y += tnc.o master.o scan.o replay.o log.o commit.o gc.o orphan.o
ubifs-y += budget.o find.o tnc_commit.o compress.o lpt.o lprops.o
ubifs-y += recovery.o ioctl.o lpt_commit.o tnc_misc.o debug.o
-ubifs-y += misc.o sysfs.o
+ubifs-y += misc.o sysfs.o repair.o
ubifs-$(CONFIG_FS_ENCRYPTION) += crypto.o
ubifs-$(CONFIG_UBIFS_FS_XATTR) += xattr.o
ubifs-$(CONFIG_UBIFS_FS_AUTHENTICATION) += auth.o
diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
new file mode 100644
index 000000000000..b722187de74f
--- /dev/null
+++ b/fs/ubifs/repair.c
@@ -0,0 +1,109 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This file is part of UBIFS.
+ *
+ * Copyright (C) 2023-2024, Huawei Technologies Co, Ltd.
+ *
+ * Authors: Zhihao Cheng <[email protected]>
+ */
+
+/*
+ * This file implements ubifs repair.
+ */
+
+#include <linux/module.h>
+#include "ubifs.h"
+
+int ubifs_repair(const char *dev_name)
+{
+ int err = 0;
+ struct ubifs_info *c;
+ struct ubi_volume_desc *ubi;
+ struct super_block *fake_sb;
+
+ ubi = open_ubi(dev_name, UBI_READWRITE);
+ if (IS_ERR(ubi)) {
+ err = PTR_ERR(ubi);
+ pr_err("cannot open dev %s, err %d\n", dev_name, err);
+ return err;
+ }
+
+ fake_sb = kzalloc(sizeof(struct super_block), GFP_KERNEL);
+ if (!fake_sb) {
+ err = -ENOMEM;
+ goto close_volume;
+ }
+
+ c = alloc_ubifs_info(ubi);
+ if (!c) {
+ err = -ENOMEM;
+ goto free_sb;
+ }
+
+ c->ubi = ubi;
+ c->vfs_sb = fake_sb;
+ c->max_inode_sz = key_max_inode_size(c);
+ if (c->max_inode_sz > MAX_LFS_FILESIZE)
+ c->max_inode_sz = MAX_LFS_FILESIZE;
+
+ err = init_constants_early(c);
+ if (err)
+ goto free_ubifs;
+
+ err = ubifs_debugging_init(c);
+ if (err)
+ goto free_ubifs;
+
+ if (c->ro_media) {
+ err = -EROFS;
+ goto free_debug;
+ }
+
+ err = check_volume_empty(c);
+ if (err)
+ goto free_debug;
+
+ if (c->empty) {
+ ubifs_err(c, "empty filesystem");
+ err = -ENODEV;
+ goto free_debug;
+ }
+
+ c->sbuf = vmalloc(c->leb_size);
+ if (!c->sbuf) {
+ err = -ENOMEM;
+ goto free_debug;
+ }
+
+ /* Step 1: Load filesystem info from volume. */
+ ubifs_msg(c, "Step 1: Load filesystem");
+ err = ubifs_read_superblock(c);
+ if (err)
+ goto free_sup;
+
+ /* TODO: Support repairing authenticated image. */
+ if (le32_to_cpu(c->sup_node->flags) & UBIFS_FLG_AUTHENTICATION) {
+ ubifs_err(c, "not support authentication");
+ err = -EOPNOTSUPP;
+ goto free_sup;
+ }
+
+ err = init_constants_sb(c);
+ if (err)
+ goto free_sup;
+
+ ubifs_msg(c, "Repair success!");
+
+free_sup:
+ kfree(c->sup_node);
+ vfree(c->sbuf);
+free_debug:
+ ubifs_debugging_exit(c);
+free_ubifs:
+ kfree(c);
+free_sb:
+ kfree(fake_sb);
+close_volume:
+ ubi_close_volume(ubi);
+ return err;
+}
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index eabb0f44ea3e..ad726523d491 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -506,7 +506,7 @@ static int ubifs_sync_fs(struct super_block *sb, int wait)
* requirements. Returns zero in case of success and a negative error code in
* case of failure.
*/
-static int init_constants_early(struct ubifs_info *c)
+int init_constants_early(struct ubifs_info *c)
{
if (c->vi.corrupted) {
ubifs_warn(c, "UBI volume is corrupted - read-only mode");
@@ -670,7 +670,7 @@ static int bud_wbuf_callback(struct ubifs_info *c, int lnum, int free, int pad)
* makes sure they are all right. Returns zero in case of success and a
* negative error code in case of failure.
*/
-static int init_constants_sb(struct ubifs_info *c)
+int init_constants_sb(struct ubifs_info *c)
{
int tmp, err;
long long tmp64;
@@ -934,7 +934,7 @@ static void free_buds(struct ubifs_info *c)
* Returns zero in case of success and a negative error code in case of
* failure.
*/
-static int check_volume_empty(struct ubifs_info *c)
+int check_volume_empty(struct ubifs_info *c)
{
int lnum, err;

@@ -2086,7 +2086,7 @@ const struct super_operations ubifs_super_operations = {
* returns UBI volume description object in case of success and a negative
* error code in case of failure.
*/
-static struct ubi_volume_desc *open_ubi(const char *name, int mode)
+struct ubi_volume_desc *open_ubi(const char *name, int mode)
{
struct ubi_volume_desc *ubi;
int dev, vol;
@@ -2132,7 +2132,7 @@ static struct ubi_volume_desc *open_ubi(const char *name, int mode)
return ERR_PTR(-EINVAL);
}

-static struct ubifs_info *alloc_ubifs_info(struct ubi_volume_desc *ubi)
+struct ubifs_info *alloc_ubifs_info(struct ubi_volume_desc *ubi)
{
struct ubifs_info *c;

diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h
index 3916dc4f30ca..a7ee8010ad66 100644
--- a/fs/ubifs/ubifs.h
+++ b/fs/ubifs/ubifs.h
@@ -2072,6 +2072,11 @@ static inline int ubifs_init_security(struct inode *dentry,

/* super.c */
struct inode *ubifs_iget(struct super_block *sb, unsigned long inum);
+int init_constants_early(struct ubifs_info *c);
+int init_constants_sb(struct ubifs_info *c);
+int check_volume_empty(struct ubifs_info *c);
+struct ubi_volume_desc *open_ubi(const char *name, int mode);
+struct ubifs_info *alloc_ubifs_info(struct ubi_volume_desc *ubi);

/* recovery.c */
int ubifs_recover_master_node(struct ubifs_info *c);
--
2.31.1


2023-12-28 01:38:39

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 03/17] ubifs: repair: Remove deleted nodes from valid node tree

This is the 3/13 step of repairing. Traverse nodes from del_inos and
del_dents trees, remove inode nodes and dentry nodes with smaller sqnum
from valid trees.

This step handles deleting case, for example, file A is deleted, deleted
inode node and deleted dentry node are written, if we ignore the deleted
nodes, file A can be recovered after repairing because undeleted inode
node and undeleted dentry node can be scanned. There's an exception, if
deleted inode node and deleted dentry node are reclaimed(by gc) after
deletion, file A is recovered. UBIFS repair cannot solve it, because the
real existence information of nodes depends on TNC, but TNC should not
be depended for UBIFS repair.

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/repair.c | 120 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 120 insertions(+)

diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
index 18920e9896bc..d932c16ec893 100644
--- a/fs/ubifs/repair.c
+++ b/fs/ubifs/repair.c
@@ -930,6 +930,119 @@ static int scan_nodes(struct ubifs_info *c, struct scanned_info *si)
return err;
}

+static struct scanned_ino_node *
+lookup_valid_ino_node(struct ubifs_info *c, struct scanned_info *si,
+ struct scanned_ino_node *target)
+{
+ int cmp;
+ struct scanned_ino_node *ino_node;
+ struct rb_node *p;
+
+ p = si->valid_inos.rb_node;
+ while (p) {
+ ino_node = rb_entry(p, struct scanned_ino_node, rb);
+ cmp = keys_cmp(c, &target->key, &ino_node->key);
+ if (cmp < 0) {
+ p = p->rb_left;
+ } else if (cmp > 0) {
+ p = p->rb_right;
+ } else {
+ if (target->header.sqnum > ino_node->header.sqnum)
+ return ino_node;
+ else
+ return NULL;
+ }
+ }
+
+ return NULL;
+}
+
+static struct scanned_dent_node *
+lookup_valid_dent_node(struct ubifs_info *c, struct scanned_info *si,
+ struct scanned_dent_node *target)
+{
+ int cmp, nlen;
+ struct scanned_dent_node *dent_node;
+ struct rb_node *p;
+
+ p = si->valid_dents.rb_node;
+ while (p) {
+ dent_node = rb_entry(p, struct scanned_dent_node, rb);
+ cmp = keys_cmp(c, &target->key, &dent_node->key);
+ if (cmp < 0) {
+ p = p->rb_left;
+ } else if (cmp > 0) {
+ p = p->rb_right;
+ } else {
+ nlen = min(target->nlen, dent_node->nlen);
+ cmp = strncmp(target->name, dent_node->name, nlen) ? :
+ target->nlen - dent_node->nlen;
+ if (cmp < 0) {
+ p = p->rb_left;
+ } else if (cmp > 0) {
+ p = p->rb_right;
+ } else {
+ if (target->header.sqnum >
+ dent_node->header.sqnum)
+ return dent_node;
+ else
+ return NULL;
+ }
+ }
+ }
+
+ return NULL;
+}
+
+/**
+ * remove_del_nodes - remove deleted nodes from valid node tree.
+ * @c: UBIFS file-system description object
+ * @si: records nodes and files information during scanning
+ *
+ * This function compares sqnum between deleted node and corresponding valid
+ * node, removes valid node from tree if the sqnum of deleted node is bigger.
+ * Deleted ino/dent nodes will be removed from @si->del_inos/@si->del_dents
+ * after this function finished.
+ */
+static void remove_del_nodes(struct ubifs_info *c, struct scanned_info *si)
+{
+ struct scanned_ino_node *del_ino_node, *valid_ino_node;
+ struct scanned_dent_node *del_dent_node, *valid_dent_node;
+ struct rb_node *this;
+
+ this = rb_first(&si->del_inos);
+ while (this) {
+ cond_resched();
+ del_ino_node = rb_entry(this, struct scanned_ino_node, rb);
+ this = rb_next(this);
+
+ valid_ino_node = lookup_valid_ino_node(c, si, del_ino_node);
+ if (valid_ino_node) {
+ rb_erase(&valid_ino_node->rb, &si->valid_inos);
+ kfree(valid_ino_node);
+ }
+
+ rb_erase(&del_ino_node->rb, &si->del_inos);
+ kfree(del_ino_node);
+ }
+
+ this = rb_first(&si->del_dents);
+ while (this) {
+ cond_resched();
+ del_dent_node = rb_entry(this, struct scanned_dent_node, rb);
+ this = rb_next(this);
+
+ valid_dent_node = lookup_valid_dent_node(c, si, del_dent_node);
+ if (valid_dent_node) {
+ rb_erase(&valid_dent_node->rb, &si->valid_dents);
+ kfree(valid_dent_node);
+ }
+
+ rb_erase(&del_dent_node->rb, &si->del_dents);
+ kfree(del_dent_node);
+ }
+}
+
static int do_repair(struct ubifs_info *c)
{
int err = 0;
@@ -940,7 +1053,14 @@ static int do_repair(struct ubifs_info *c)
/* Step 2: Scan valid/deleted nodes from volume. */
ubifs_msg(c, "Step 2: Scan nodes");
err = scan_nodes(c, &si);
+ if (err)
+ goto out;
+
+ /* Step 3: Remove deleted nodes from valid node tree. */
+ ubifs_msg(c, "Step 3: Remove deleted nodes");
+ remove_del_nodes(c, &si);

+out:
destroy_scanned_info(c, &si);
return err;
}
--
2.31.1


2023-12-28 01:38:46

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 05/17] ubifs: repair: Filter invalid files

This is the 5/13 step of repairing. Filter out invalid files and drop
them, for example:
1. Nonconsistent file type between inode node and dentry nodes
2. file has no dentry nodes(excepts '/')
3. Encrypted file has no xattr information
4. Non regular file has data nodes
5. Directory/xattr file has more than one dentries
6. Xattr file has no host inode
...

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/journal.c | 39 ++------
fs/ubifs/repair.c | 222 +++++++++++++++++++++++++++++++++++++++++++++
fs/ubifs/ubifs.h | 26 ++++++
3 files changed, 254 insertions(+), 33 deletions(-)

diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index f0a5538c84b0..e1d22d13bfa7 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -409,33 +409,6 @@ static void finish_reservation(struct ubifs_info *c)
up_read(&c->commit_sem);
}

-/**
- * get_dent_type - translate VFS inode mode to UBIFS directory entry type.
- * @mode: inode mode
- */
-static int get_dent_type(int mode)
-{
- switch (mode & S_IFMT) {
- case S_IFREG:
- return UBIFS_ITYPE_REG;
- case S_IFDIR:
- return UBIFS_ITYPE_DIR;
- case S_IFLNK:
- return UBIFS_ITYPE_LNK;
- case S_IFBLK:
- return UBIFS_ITYPE_BLK;
- case S_IFCHR:
- return UBIFS_ITYPE_CHR;
- case S_IFIFO:
- return UBIFS_ITYPE_FIFO;
- case S_IFSOCK:
- return UBIFS_ITYPE_SOCK;
- default:
- BUG();
- }
- return 0;
-}
-
/**
* pack_inode - pack an inode node.
* @c: UBIFS file-system description object
@@ -599,7 +572,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir,

key_write(c, &dent_key, dent->key);
dent->inum = deletion ? 0 : cpu_to_le64(inode->i_ino);
- dent->type = get_dent_type(inode->i_mode);
+ dent->type = ubifs_get_dent_type(inode->i_mode);
dent->nlen = cpu_to_le16(fname_len(nm));
memcpy(dent->name, fname_name(nm), fname_len(nm));
dent->name[fname_len(nm)] = '\0';
@@ -1095,7 +1068,7 @@ int ubifs_jnl_xrename(struct ubifs_info *c, const struct inode *fst_dir,
dent1->ch.node_type = UBIFS_DENT_NODE;
dent_key_init_flash(c, &dent1->key, snd_dir->i_ino, snd_nm);
dent1->inum = cpu_to_le64(fst_inode->i_ino);
- dent1->type = get_dent_type(fst_inode->i_mode);
+ dent1->type = ubifs_get_dent_type(fst_inode->i_mode);
dent1->nlen = cpu_to_le16(fname_len(snd_nm));
memcpy(dent1->name, fname_name(snd_nm), fname_len(snd_nm));
dent1->name[fname_len(snd_nm)] = '\0';
@@ -1111,7 +1084,7 @@ int ubifs_jnl_xrename(struct ubifs_info *c, const struct inode *fst_dir,
dent2->ch.node_type = UBIFS_DENT_NODE;
dent_key_init_flash(c, &dent2->key, fst_dir->i_ino, fst_nm);
dent2->inum = cpu_to_le64(snd_inode->i_ino);
- dent2->type = get_dent_type(snd_inode->i_mode);
+ dent2->type = ubifs_get_dent_type(snd_inode->i_mode);
dent2->nlen = cpu_to_le16(fname_len(fst_nm));
memcpy(dent2->name, fname_name(fst_nm), fname_len(fst_nm));
dent2->name[fname_len(fst_nm)] = '\0';
@@ -1286,7 +1259,7 @@ int ubifs_jnl_rename(struct ubifs_info *c, const struct inode *old_dir,
dent->ch.node_type = UBIFS_DENT_NODE;
dent_key_init_flash(c, &dent->key, new_dir->i_ino, new_nm);
dent->inum = cpu_to_le64(old_inode->i_ino);
- dent->type = get_dent_type(old_inode->i_mode);
+ dent->type = ubifs_get_dent_type(old_inode->i_mode);
dent->nlen = cpu_to_le16(fname_len(new_nm));
memcpy(dent->name, fname_name(new_nm), fname_len(new_nm));
dent->name[fname_len(new_nm)] = '\0';
@@ -1303,7 +1276,7 @@ int ubifs_jnl_rename(struct ubifs_info *c, const struct inode *old_dir,

if (whiteout) {
dent2->inum = cpu_to_le64(whiteout->i_ino);
- dent2->type = get_dent_type(whiteout->i_mode);
+ dent2->type = ubifs_get_dent_type(whiteout->i_mode);
} else {
/* Make deletion dent */
dent2->inum = 0;
@@ -1756,7 +1729,7 @@ int ubifs_jnl_delete_xattr(struct ubifs_info *c, const struct inode *host,
xent_key_init(c, &xent_key, host->i_ino, nm);
key_write(c, &xent_key, xent->key);
xent->inum = 0;
- xent->type = get_dent_type(inode->i_mode);
+ xent->type = ubifs_get_dent_type(inode->i_mode);
xent->nlen = cpu_to_le16(fname_len(nm));
memcpy(xent->name, fname_name(nm), fname_len(nm));
xent->name[fname_len(nm)] = '\0';
diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
index 7a1732ef903f..5875268135ff 100644
--- a/fs/ubifs/repair.c
+++ b/fs/ubifs/repair.c
@@ -1099,6 +1099,222 @@ static int add_valid_nodes_into_file(struct ubifs_info *c,
return 0;
}

+/**
+ * lookup_file - lookup file according to inode number.
+ * @c: UBIFS file-system description object
+ * @inum: inode number
+ *
+ * This function lookups target file from @c->repair->scanned_files
+ * according to @inum.
+ */
+static struct scanned_file *lookup_file(struct ubifs_info *c, ino_t inum)
+{
+ struct scanned_file *file;
+ struct rb_node *p;
+
+ p = c->repair->scanned_files.rb_node;
+ while (p) {
+ file = rb_entry(p, struct scanned_file, rb);
+
+ if (inum < file->inum)
+ p = p->rb_left;
+ else if (inum > file->inum)
+ p = p->rb_right;
+ else
+ return file;
+ }
+
+ return NULL;
+}
+
+/**
+ * file_is_valid - check whether the file is valid.
+ * @c: UBIFS file-system description object
+ * @file: file object
+ *
+ * This function checks whether given @file is valid, following checks will
+ * be performed:
+ * 1. The file type comes from inode and dentries should be consistent,
+ * inconsistent dentries will be deleted.
+ * 2. All files must have at least one dentries, except '/', '/' doesn't
+ * have dentries. Non '/' file is invalid if it doesn't have dentries.
+ * 3. Directory type or xattr type files only have one dentry. Superfluous
+ * dentries with lower sequence number will be deleted.
+ * 4. Non-regular file doesn't have data nodes. Data nodes are deleted for
+ * non-regular file.
+ * 5. Xattr files should have host inode, otherwise they are invalid.
+ * 6. Encrypted files should have corresponding xattrs, otherwise they are
+ * invalid.
+ *
+ * Returns %true is @file is valid, otherwise %false is returned.
+ */
+static bool file_is_valid(struct ubifs_info *c, struct scanned_file *file)
+{
+ int type;
+ struct rb_node *node;
+ struct scanned_dent_node *dent_node;
+ struct scanned_data_node *data_node;
+ LIST_HEAD(drop_list);
+
+ if (!file->ino.header.exist)
+ return false;
+
+ type = ubifs_get_dent_type(file->ino.mode);
+
+ /* Drop dentry nodes with inconsistent type. */
+ for (node = rb_first(&file->dent_nodes); node; node = rb_next(node)) {
+ int is_xattr = 0;
+
+ cond_resched();
+ dent_node = rb_entry(node, struct scanned_dent_node, rb);
+
+ if (key_type(c, &dent_node->key) == UBIFS_XENT_KEY)
+ is_xattr = 1;
+ if (is_xattr != file->ino.is_xattr || type != dent_node->type)
+ list_add(&dent_node->list, &drop_list);
+ }
+
+ while (!list_empty(&drop_list)) {
+ cond_resched();
+ dent_node = list_entry(drop_list.next, struct scanned_dent_node,
+ list);
+
+ list_del(&dent_node->list);
+ rb_erase(&dent_node->rb, &file->dent_nodes);
+ kfree(dent_node);
+ }
+
+ if (type != UBIFS_ITYPE_DIR && !file->ino.is_xattr)
+ goto skip_dir_and_xattr;
+
+ /*
+ * Make sure that directory/xattr type files only have one dentry.
+ * This work should be done in step 4, but file type could be unknown
+ * for lacking inode information at that time, so do it here.
+ */
+ node = rb_first(&file->dent_nodes);
+ while (node) {
+ cond_resched();
+ dent_node = rb_entry(node, struct scanned_dent_node, rb);
+ node = rb_next(node);
+ if (!node)
+ break;
+
+ rb_erase(&dent_node->rb, &file->dent_nodes);
+ kfree(dent_node);
+ }
+
+skip_dir_and_xattr:
+ if (type == UBIFS_ITYPE_REG && !file->ino.is_xattr)
+ goto skip_non_reg;
+
+ /*
+ * Make sure that non regular type files not have data/trun nodes.
+ * This work should be done in step 4, but file type could be unknown
+ * for lacking inode information at that time, so do it here.
+ */
+ file->trun.header.exist = 0;
+ node = rb_first(&file->data_nodes);
+ while (node) {
+ cond_resched();
+ data_node = rb_entry(node, struct scanned_data_node, rb);
+ node = rb_next(node);
+
+ rb_erase(&data_node->rb, &file->data_nodes);
+ kfree(data_node);
+ }
+
+skip_non_reg:
+ if (file->ino.is_xattr) {
+ struct scanned_file *parent_file;
+
+ node = rb_first(&file->dent_nodes);
+ if (!node)
+ /* The xattr dentry is not found. */
+ return false;
+
+ dent_node = rb_entry(node, struct scanned_dent_node, rb);
+ parent_file = lookup_file(c, key_inum(c, &dent_node->key));
+ if (!parent_file)
+ /* Host inode is not found. */
+ return false;
+
+ if (parent_file->ino.is_encrypted) {
+ int nlen = min(dent_node->nlen,
+ strlen(UBIFS_XATTR_NAME_ENCRYPTION_CONTEXT));
+
+ if (!strncmp(dent_node->name,
+ UBIFS_XATTR_NAME_ENCRYPTION_CONTEXT, nlen))
+ parent_file->has_encrypted_info = true;
+ }
+ } else {
+ /*
+ * Since xattr files are checked in first round, so all
+ * non-xattr files's @has_encrypted_info fields have been
+ * initialized.
+ */
+ if (file->ino.is_encrypted && !file->has_encrypted_info)
+ return false;
+ }
+
+ if (!rb_first(&file->dent_nodes)) {
+ /* Non-root files must have dentries. */
+ if (file->inum != UBIFS_ROOT_INO)
+ return false;
+ } else if (file->inum == UBIFS_ROOT_INO) {
+ /* '/' has no dentries. */
+ return false;
+ }
+
+ return true;
+}
+
+/**
+ * filter_invalid_files - filter out invalid files.
+ * @c: UBIFS file-system description object
+ *
+ * This function filters out invalid files(eg. inconsistent types between
+ * inode and dentry node, or missing inode/dentry node, or encrypted inode
+ * has no encryption related xattrs, etc.).
+ */
+static void filter_invalid_files(struct ubifs_info *c)
+{
+ struct rb_node *node;
+ struct scanned_file *file;
+ LIST_HEAD(invalid_list);
+
+ /* Round 1: Iterate xattr files. */
+ for (node = rb_first(&c->repair->scanned_files); node;
+ node = rb_next(node)) {
+ cond_resched();
+ file = rb_entry(node, struct scanned_file, rb);
+
+ if (file->ino.is_xattr && !file_is_valid(c, file))
+ list_add(&file->list, &invalid_list);
+ }
+
+ /* Round 2: Iterate non-xattr files. */
+ for (node = rb_first(&c->repair->scanned_files); node;
+ node = rb_next(node)) {
+ cond_resched();
+ file = rb_entry(node, struct scanned_file, rb);
+
+ if (!file->ino.is_xattr && !file_is_valid(c, file))
+ list_add(&file->list, &invalid_list);
+ }
+
+ /* Remove invalid files. */
+ while (!list_empty(&invalid_list)) {
+ cond_resched();
+ file = list_entry(invalid_list.next, struct scanned_file, list);
+
+ list_del(&file->list);
+ destroy_file_content(file);
+ rb_erase(&file->rb, &c->repair->scanned_files);
+ kfree(file);
+ }
+}
+
static int do_repair(struct ubifs_info *c)
{
int err = 0;
@@ -1119,6 +1335,12 @@ static int do_repair(struct ubifs_info *c)
/* Step 4: Add valid nodes into file. */
ubifs_msg(c, "Step 4: Add valid nodes into file");
err = add_valid_nodes_into_file(c, &si);
+ if (err)
+ goto out;
+
+ /* Step 5: Drop invalid files. */
+ ubifs_msg(c, "Step 5: Filter invalid files");
+ filter_invalid_files(c);

out:
destroy_scanned_info(c, &si);
diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h
index 014f5ea26b17..449ab220db30 100644
--- a/fs/ubifs/ubifs.h
+++ b/fs/ubifs/ubifs.h
@@ -1829,6 +1829,32 @@ int ubifs_jnl_delete_xattr(struct ubifs_info *c, const struct inode *host,
const struct inode *inode, const struct fscrypt_name *nm);
int ubifs_jnl_change_xattr(struct ubifs_info *c, const struct inode *inode1,
const struct inode *inode2);
+/**
+ * ubifs_get_dent_type - translate VFS inode mode to UBIFS directory entry type.
+ * @mode: inode mode
+ */
+static inline int ubifs_get_dent_type(int mode)
+{
+ switch (mode & S_IFMT) {
+ case S_IFREG:
+ return UBIFS_ITYPE_REG;
+ case S_IFDIR:
+ return UBIFS_ITYPE_DIR;
+ case S_IFLNK:
+ return UBIFS_ITYPE_LNK;
+ case S_IFBLK:
+ return UBIFS_ITYPE_BLK;
+ case S_IFCHR:
+ return UBIFS_ITYPE_CHR;
+ case S_IFIFO:
+ return UBIFS_ITYPE_FIFO;
+ case S_IFSOCK:
+ return UBIFS_ITYPE_SOCK;
+ default:
+ BUG();
+ }
+ return 0;
+}

/* budget.c */
int ubifs_budget_space(struct ubifs_info *c, struct ubifs_budget_req *req);
--
2.31.1


2023-12-28 01:39:00

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 06/17] ubifs: repair: Extract reachable directory entries tree

This is the 6/13 step of repairing. Extract reachable directory entries
tree. Make sure that all files can be searched from '/', unreachable
file is deleted. So, all files can be accessible in userspace after
reparing.

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/repair.c | 143 ++++++++++++++++++++++++++++++++++++++++++++++
fs/ubifs/repair.h | 4 ++
2 files changed, 147 insertions(+)

diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
index 5875268135ff..c9435c9aa148 100644
--- a/fs/ubifs/repair.c
+++ b/fs/ubifs/repair.c
@@ -628,6 +628,7 @@ static int update_file(struct ubifs_info *c, struct scanned_file *file,
{
struct scanned_dent_node *dent = (struct scanned_dent_node *)sn;

+ dent->file = file;
err = insert_file_dentry(file, dent);
break;
}
@@ -1315,6 +1316,144 @@ static void filter_invalid_files(struct ubifs_info *c)
}
}

+static bool dentry_is_reachable(struct ubifs_info *c,
+ struct scanned_dent_node *dent_node,
+ struct list_head *path_list)
+{
+ struct scanned_file *parent_file = NULL;
+ struct scanned_dent_node *dn, *parent_dent;
+ struct rb_node *p;
+
+ /* Dentry has already been checked as reachable. */
+ if (dent_node->can_be_found)
+ return true;
+
+ /* Check whether the path is cyclical. */
+ list_for_each_entry(dn, path_list, list) {
+ if (dn == dent_node)
+ return false;
+ }
+ dent_node->can_be_found = true;
+ list_add(&dent_node->list, path_list);
+
+ parent_file = lookup_file(c, key_inum(c, &dent_node->key));
+ /* Parent dentry is not found, unreachable. */
+ if (!parent_file)
+ return false;
+
+ /* Parent dentry is '/', reachable. */
+ if (parent_file->inum == UBIFS_ROOT_INO)
+ return true;
+
+ /*
+ * There are two situations here:
+ * 1. @file is non-xattr type. Since directory type file has only
+ * one dentry, pick first dentry from @parent_file is okay.
+ * 2. @file is xattr type. Since non-xattr files are checked in
+ * first round, so any directory entries from @parent_file must
+ * be reachable.
+ */
+ p = rb_first(&parent_file->dent_nodes);
+ if (!p)
+ return false;
+ parent_dent = rb_entry(p, struct scanned_dent_node, rb);
+
+ return dentry_is_reachable(c, parent_dent, path_list);
+}
+
+/**
+ * file_is_reachable - whether the file can be found from '/'.
+ * @c: UBIFS file-system description object
+ * @file: file object
+ *
+ * This function iterates all directory entries in given @file and checks
+ * whether each dentry is reachable. All unreachable directory entries will
+ * be removed.
+ */
+static bool file_is_reachable(struct ubifs_info *c, struct scanned_file *file)
+{
+ struct rb_node *node;
+ struct scanned_dent_node *dent_node;
+
+ if (file->inum == UBIFS_ROOT_INO)
+ return true;
+
+retry:
+ for (node = rb_first(&file->dent_nodes); node; node = rb_next(node)) {
+ LIST_HEAD(path_list);
+
+ cond_resched();
+ dent_node = rb_entry(node, struct scanned_dent_node, rb);
+
+ if (dentry_is_reachable(c, dent_node, &path_list))
+ continue;
+
+ while (!list_empty(&path_list)) {
+ cond_resched();
+ dent_node = list_entry(path_list.next,
+ struct scanned_dent_node, list);
+
+ list_del(&dent_node->list);
+ rb_erase(&dent_node->rb, &dent_node->file->dent_nodes);
+ kfree(dent_node);
+ }
+
+ /* Since dentry node is removed from rb-tree, rescan rb-tree. */
+ goto retry;
+ }
+
+ if (!rb_first(&file->dent_nodes))
+ return false;
+
+ return true;
+}
+
+/**
+ * check_dentry_tree - extract reachable directory entries.
+ * @c: UBIFS file-system description object
+ *
+ * This function iterates all directory entries and remove those
+ * unreachable ones. 'Unreachable' means that a directory entry can
+ * not be found from '/'.
+ */
+static void extract_dentry_tree(struct ubifs_info *c)
+{
+ struct rb_node *node;
+ struct scanned_file *file;
+ LIST_HEAD(unreachable);
+
+ /* Round 1: Iterate non-xattr files. */
+ for (node = rb_first(&c->repair->scanned_files); node;
+ node = rb_next(node)) {
+ cond_resched();
+ file = rb_entry(node, struct scanned_file, rb);
+
+ if (!file->ino.is_xattr && !file_is_reachable(c, file))
+ list_add(&file->list, &unreachable);
+ }
+
+ /* Round 2: Iterate xattr files. */
+ for (node = rb_first(&c->repair->scanned_files); node;
+ node = rb_next(node)) {
+ cond_resched();
+ file = rb_entry(node, struct scanned_file, rb);
+
+ if (file->ino.is_xattr && !file_is_reachable(c, file))
+ list_add(&file->list, &unreachable);
+ }
+
+ /* Remove unreachable files. */
+ while (!list_empty(&unreachable)) {
+ cond_resched();
+ file = list_entry(unreachable.next, struct scanned_file, list);
+
+ list_del(&file->list);
+ destroy_file_content(file);
+ rb_erase(&file->rb, &c->repair->scanned_files);
+ kfree(file);
+ }
+}
+
static int do_repair(struct ubifs_info *c)
{
int err = 0;
@@ -1342,6 +1481,10 @@ static int do_repair(struct ubifs_info *c)
ubifs_msg(c, "Step 5: Filter invalid files");
filter_invalid_files(c);

+ /* Step 6: Extract reachable directory entries. */
+ ubifs_msg(c, "Step 5: Extract reachable files");
+ extract_dentry_tree(c);
+
out:
destroy_scanned_info(c, &si);
return err;
diff --git a/fs/ubifs/repair.h b/fs/ubifs/repair.h
index 242bff2833bd..05bfe9a2189e 100644
--- a/fs/ubifs/repair.h
+++ b/fs/ubifs/repair.h
@@ -14,6 +14,8 @@
#ifndef __UBIFS_REPAIR_H__
#define __UBIFS_REPAIR_H__

+struct scanned_file;
+
/**
* scanned_node - common header node.
* @exist: whether the node is found by scanning
@@ -67,6 +69,7 @@ struct scanned_ino_node {
* @nlen: name length
* @name: dentry name
* @inum: target inode number
+ * @file: corresponding file
* @rb: link in the trees of:
* 1) valid dentry nodes or deleted dentry node
* 2) all scanned dentry nodes from same file
@@ -80,6 +83,7 @@ struct scanned_dent_node {
unsigned int nlen;
char name[UBIFS_MAX_NLEN];
ino_t inum;
+ struct scanned_file *file;
struct rb_node rb;
struct list_head list;
};
--
2.31.1


2023-12-28 01:39:40

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 09/17] ubifs: repair: Re-write data

This is the 9/13 step of repairing. Re-write data. Read data from LEB
and write back data, make sure that all LEB is ended with empty
data(0xFF). It will prevent failed gc scanning in next mounting.

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/repair.c | 117 ++++++++++++++++++++++++++++++++++++----------
fs/ubifs/repair.h | 11 +++++
2 files changed, 104 insertions(+), 24 deletions(-)

diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
index 7d97de6219fa..59b79481974a 100644
--- a/fs/ubifs/repair.c
+++ b/fs/ubifs/repair.c
@@ -32,6 +32,8 @@ struct scanned_info {

static int init_repair_info(struct ubifs_info *c)
{
+ int err;
+
c->repair = kzalloc(sizeof(struct ubifs_repair_info), GFP_KERNEL);
if (!c->repair)
return -ENOMEM;
@@ -39,15 +41,28 @@ static int init_repair_info(struct ubifs_info *c)
c->repair->scanned_files = RB_ROOT;
c->repair->used_lebs = bitmap_zalloc(c->main_lebs, GFP_KERNEL);
if (!c->repair->used_lebs) {
- kfree(c->repair);
- return -ENOMEM;
+ err = -ENOMEM;
+ goto free_repair;
+ }
+ c->repair->lpts = kzalloc(sizeof(struct lprops) * c->main_lebs,
+ GFP_KERNEL);
+ if (!c->repair->lpts) {
+ err = -ENOMEM;
+ goto free_used_lebs;
}

return 0;
+
+free_used_lebs:
+ bitmap_free(c->repair->used_lebs);
+free_repair:
+ kfree(c->repair);
+ return err;
}

static void destroy_repair_info(struct ubifs_info *c)
{
+ kfree(c->repair->lpts);
bitmap_free(c->repair->used_lebs);
kfree(c->repair);
}
@@ -1026,9 +1041,13 @@ static void remove_del_nodes(struct ubifs_info *c, struct scanned_info *si)

valid_ino_node = lookup_valid_ino_node(c, si, del_ino_node);
if (valid_ino_node) {
- int lnum = del_ino_node->header.lnum;
+ int lnum = del_ino_node->header.lnum - c->main_first;
+ int pos = del_ino_node->header.offs +
+ ALIGN(del_ino_node->header.len, 8);

- set_bit(lnum - c->main_first, c->repair->used_lebs);
+ set_bit(lnum, c->repair->used_lebs);
+ c->repair->lpts[lnum].end =
+ max(c->repair->lpts[lnum].end, pos);
rb_erase(&valid_ino_node->rb, &si->valid_inos);
kfree(valid_ino_node);
}
@@ -1045,9 +1064,13 @@ static void remove_del_nodes(struct ubifs_info *c, struct scanned_info *si)

valid_dent_node = lookup_valid_dent_node(c, si, del_dent_node);
if (valid_dent_node) {
- int lnum = del_dent_node->header.lnum;
+ int lnum = del_dent_node->header.lnum - c->main_first;
+ int pos = del_dent_node->header.offs +
+ ALIGN(del_dent_node->header.len, 8);

- set_bit(lnum - c->main_first, c->repair->used_lebs);
+ set_bit(lnum, c->repair->used_lebs);
+ c->repair->lpts[lnum].end =
+ max(c->repair->lpts[lnum].end, pos);
rb_erase(&valid_dent_node->rb, &si->valid_dents);
kfree(valid_dent_node);
}
@@ -1746,21 +1769,37 @@ static char *get_file_name(struct ubifs_info *c, struct scanned_file *file)
return name;
}

+static void parse_node_location(struct ubifs_info *c, struct scanned_node *sn)
+{
+ int lnum, pos;
+
+ lnum = sn->lnum - c->main_first;
+ pos = sn->offs + ALIGN(sn->len, 8);
+
+ set_bit(lnum, c->repair->used_lebs);
+ c->repair->lpts[lnum].end = max(c->repair->lpts[lnum].end, pos);
+}
+
/**
- * record_used_lebs - record used LEBs.
+ * traverse_files_and_nodes - traverse all nodes from valid files.
* @c: UBIFS file-system description object
*
- * This function records all used LEBs which may hold useful nodes, then left
- * unused LEBs could be taken for storing new index tree.
+ * This function traverses all nodes from valid files and does following
+ * things:
+ * 1. Record all used LEBs which may hold useful nodes, then left unused
+ * LEBs could be taken for storing new index tree.
+ * 2. Re-write data to prevent failed gc scanning in the subsequent mounting
+ * process caused by corrupted data.
*/
-static void record_used_lebs(struct ubifs_info *c)
+static int traverse_files_and_nodes(struct ubifs_info *c)
{
- int lnum;
+ int i, err = 0;
struct rb_node *node, *n;
struct scanned_file *file;
struct scanned_dent_node *dent_node;
struct scanned_data_node *data_node;

+ ubifs_msg(c, "Step 8: Record used LEBs");
for (node = rb_first(&c->repair->scanned_files); node;
node = rb_next(node)) {
cond_resched();
@@ -1772,30 +1811,58 @@ static void record_used_lebs(struct ubifs_info *c)
ubifs_get_type_name(ubifs_get_dent_type(file->ino.mode)),
c->vi.ubi_num, c->vi.vol_id);

- lnum = file->ino.header.lnum;
- set_bit(lnum - c->main_first, c->repair->used_lebs);
+ parse_node_location(c, &file->ino.header);

- if (file->trun.header.exist) {
- lnum = file->trun.header.lnum;
- set_bit(lnum - c->main_first, c->repair->used_lebs);
- }
+ if (file->trun.header.exist)
+ parse_node_location(c, &file->trun.header);

for (n = rb_first(&file->data_nodes); n; n = rb_next(n)) {
cond_resched();
data_node = rb_entry(n, struct scanned_data_node, rb);

- lnum = data_node->header.lnum;
- set_bit(lnum - c->main_first, c->repair->used_lebs);
+ parse_node_location(c, &data_node->header);
}

for (n = rb_first(&file->dent_nodes); n; n = rb_next(n)) {
cond_resched();
dent_node = rb_entry(n, struct scanned_dent_node, rb);

- lnum = dent_node->header.lnum;
- set_bit(lnum - c->main_first, c->repair->used_lebs);
+ parse_node_location(c, &dent_node->header);
}
}
+
+ /* Re-write data. */
+ ubifs_msg(c, "Step 9: Re-write data");
+ for (i = 0; i < c->main_lebs; ++i) {
+ int lnum, len, end;
+
+ if (fatal_signal_pending(current))
+ return -EINTR;
+ cond_resched();
+
+ if (!test_bit(i, c->repair->used_lebs))
+ continue;
+
+ lnum = i + c->main_first;
+ dbg_repair("re-write LEB %d, in ubi%d_%d",
+ lnum, c->vi.ubi_num, c->vi.vol_id);
+
+ end = c->repair->lpts[i].end;
+ len = ALIGN(end, c->min_io_size);
+
+ err = ubifs_leb_read(c, lnum, c->sbuf, 0, len, 0);
+ if (err && err != -EBADMSG)
+ return err;
+
+ if (len > end)
+ ubifs_pad(c, c->sbuf + end, len - end);
+
+ err = ubifs_leb_change(c, lnum, c->sbuf, len);
+ if (err)
+ return err;
+ }
+
+ return err;
}

static int do_repair(struct ubifs_info *c)
@@ -1835,9 +1902,11 @@ static int do_repair(struct ubifs_info *c)
if (err)
goto out;

- /* Step 8: Record used LEBs. */
- ubifs_msg(c, "Step 8: Record used LEBs");
- record_used_lebs(c);
+ /*
+ * Step 8: Record used LEBs.
+ * Step 9: Re-write data to clean corrupted data.
+ */
+ err = traverse_files_and_nodes(c);

out:
destroy_scanned_info(c, &si);
diff --git a/fs/ubifs/repair.h b/fs/ubifs/repair.h
index fecf437ff0f7..2ab885fefee0 100644
--- a/fs/ubifs/repair.h
+++ b/fs/ubifs/repair.h
@@ -151,13 +151,24 @@ struct scanned_file {
struct rb_root data_nodes;
};

+
+/**
+ * lprops - logical eraseblock properties.
+ * @end: the end postition of LEB calculated by the last node
+ */
+struct lprops {
+ int end;
+};
+
/**
* ubifs_repair_info - per-FS repairing information.
* @usen_lebs: a bitmap used for recording used lebs
+ * @lpts: lprops table
* @scanned_files: tree of all scanned files
*/
struct ubifs_repair_info {
unsigned long *used_lebs;
+ struct lprops *lpts;
struct rb_root scanned_files;
};

--
2.31.1


2023-12-28 01:39:52

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 07/17] ubifs: repair: Check and correct files' information

This is the 7/13 step of repairing. Correct the file information.
Traverse all files and calculate information (nlink, size, xattr_cnt,
etc.) for each file just like check_leaf() does, correct inode node
based on calculated information.

Now, all files are consistent, and UBIFS will pass chk_fs after mounting.

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/debug.c | 24 +----
fs/ubifs/repair.c | 260 ++++++++++++++++++++++++++++++++++++++++++++++
fs/ubifs/ubifs.h | 26 +++++
3 files changed, 287 insertions(+), 23 deletions(-)

diff --git a/fs/ubifs/debug.c b/fs/ubifs/debug.c
index d013c5b3f1ed..1fe180c22b96 100644
--- a/fs/ubifs/debug.c
+++ b/fs/ubifs/debug.c
@@ -65,28 +65,6 @@ static const char *get_key_type(int type)
}
}

-static const char *get_dent_type(int type)
-{
- switch (type) {
- case UBIFS_ITYPE_REG:
- return "file";
- case UBIFS_ITYPE_DIR:
- return "dir";
- case UBIFS_ITYPE_LNK:
- return "symlink";
- case UBIFS_ITYPE_BLK:
- return "blkdev";
- case UBIFS_ITYPE_CHR:
- return "char dev";
- case UBIFS_ITYPE_FIFO:
- return "fifo";
- case UBIFS_ITYPE_SOCK:
- return "socket";
- default:
- return "unknown/invalid type";
- }
-}
-
const char *dbg_snprintf_key(const struct ubifs_info *c,
const union ubifs_key *key, char *buffer, int len)
{
@@ -279,7 +257,7 @@ void ubifs_dump_inode(struct ubifs_info *c, const struct inode *inode)

pr_err("\t%d: inode %llu, type %s, len %d\n",
count++, (unsigned long long) le64_to_cpu(dent->inum),
- get_dent_type(dent->type),
+ ubifs_get_type_name(dent->type),
le16_to_cpu(dent->nlen));

fname_name(&nm) = dent->name;
diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
index c9435c9aa148..cb70a14e2a54 100644
--- a/fs/ubifs/repair.c
+++ b/fs/ubifs/repair.c
@@ -12,6 +12,7 @@
*/

#include <linux/module.h>
+#include <linux/crc32.h>
#include "ubifs.h"
#include "repair.h"

@@ -1454,6 +1455,261 @@ static void extract_dentry_tree(struct ubifs_info *c)
}
}

+/**
+ * calculate_file_info - calculate the information of file
+ * @c: UBIFS file-system description object
+ * @file: file object
+ *
+ * This function calculates file information according to dentry nodes,
+ * data nodes and truncation node. The calculated informaion will be used
+ * to correct inode node.
+ */
+static void calculate_file_info(struct ubifs_info *c, struct scanned_file *file)
+{
+ int nlink = 0;
+ bool corrupted_truncation = false;
+ unsigned long long trun_size, ino_sqnum, new_size = 0, trun_sqnum = 0;
+ struct rb_node *node;
+ struct scanned_file *parent_file;
+ struct scanned_dent_node *dent_node;
+ struct scanned_data_node *data_node;
+ LIST_HEAD(drop_list);
+
+ if (file->inum == UBIFS_ROOT_INO) {
+ file->calc_nlink += 2;
+ file->calc_size += UBIFS_INO_NODE_SZ;
+ return;
+ }
+
+ if ((file->ino.mode & S_IFMT) == S_IFDIR) {
+ file->calc_nlink += 2;
+ file->calc_size += UBIFS_INO_NODE_SZ;
+
+ dent_node = rb_entry(rb_first(&file->dent_nodes),
+ struct scanned_dent_node, rb);
+ parent_file = lookup_file(c, key_inum(c, &dent_node->key));
+ ubifs_assert(c, parent_file);
+ parent_file->calc_nlink += 1;
+ parent_file->calc_size += CALC_DENT_SIZE(dent_node->nlen);
+ return;
+ }
+
+ if (file->ino.is_xattr) {
+ file->calc_nlink = 1;
+ file->calc_size = file->ino.size;
+
+ dent_node = rb_entry(rb_first(&file->dent_nodes),
+ struct scanned_dent_node, rb);
+ parent_file = lookup_file(c, key_inum(c, &dent_node->key));
+ ubifs_assert(c, parent_file);
+ parent_file->calc_xcnt += 1;
+ parent_file->calc_xsz += CALC_DENT_SIZE(dent_node->nlen);
+ parent_file->calc_xsz += CALC_XATTR_BYTES(file->ino.size);
+ parent_file->calc_xnms += dent_node->nlen;
+ return;
+ }
+
+ for (node = rb_first(&file->dent_nodes); node; node = rb_next(node)) {
+ nlink++;
+
+ cond_resched();
+ dent_node = rb_entry(node, struct scanned_dent_node, rb);
+
+ parent_file = lookup_file(c, key_inum(c, &dent_node->key));
+ ubifs_assert(c, parent_file);
+ parent_file->calc_size += CALC_DENT_SIZE(dent_node->nlen);
+ }
+ file->calc_nlink = nlink;
+
+ if ((file->ino.mode & S_IFMT) != S_IFREG) {
+ /* No need to verify i_size for symlink/sock/block/char/fifo. */
+ file->calc_size = file->ino.size;
+ return;
+ }
+
+ /*
+ * Process i_size and data content, following situations should
+ * be considered:
+ * 1. Sequential writing or overwriting, i_size should be
+ * max(i_size, data node size), pick larger sqnum one from
+ * data nodes with same block index.
+ * 2. Mixed truncation and writing, i_size depends on the latest
+ * truncation node or inode node or last data node, pick data
+ * nodes which are not truncated.
+ * 3. Setting bigger i_size attr, pick inode size or biggest
+ * i_size calculated by data nodes.
+ */
+ if (file->trun.header.exist) {
+ trun_size = file->trun.new_size;
+ trun_sqnum = file->trun.header.sqnum;
+ }
+ ino_sqnum = file->ino.header.sqnum;
+ for (node = rb_first(&file->data_nodes); node; node = rb_next(node)) {
+ unsigned long long d_sz, d_sqnum;
+ unsigned int block_no;
+
+ cond_resched();
+ data_node = rb_entry(node, struct scanned_data_node, rb);
+
+ d_sqnum = data_node->header.sqnum;
+ block_no = key_block(c, &data_node->key);
+ d_sz = data_node->size + block_no * UBIFS_BLOCK_SIZE;
+ if ((trun_sqnum > d_sqnum && trun_size < d_sz) ||
+ (ino_sqnum > d_sqnum && file->ino.size < d_sz)) {
+ /*
+ * The truncated data nodes are not gced after
+ * truncating, just remove them.
+ */
+ list_move(&data_node->list, &drop_list);
+ } else if (ino_sqnum < d_sqnum) {
+ /*
+ * In general, i_size in inode is newest, unless
+ * data node is fell on disk and inode is not fell
+ * on disk when writing appendantly.
+ */
+ new_size = max(new_size, d_sz);
+ }
+ }
+ /*
+ * Truncation node is written successful, but inode node is not. It
+ * won't happen because inode node is written before truncation node
+ * according to ubifs_jnl_truncate(), unless only inode is corrupted.
+ * In this case, data nodes could have been removed in history mounting
+ * recovery, so i_size needs to be updated.
+ */
+ if (trun_sqnum > ino_sqnum && trun_size < file->ino.size) {
+ if (trun_size < new_size) {
+ corrupted_truncation = true;
+ /*
+ * Appendant writing after truncation and newest inode
+ * is not fell on disk.
+ */
+ goto update_isize;
+ }
+
+ /*
+ * Overwriting happens after truncation and newest inode is
+ * not fell on disk.
+ */
+ file->calc_size = trun_size;
+ goto drop_data;
+ }
+update_isize:
+ /*
+ * The file cannot use 'new_size' directly when the file may have ever
+ * been set i_size. For example:
+ * 1. echo 123 > file # i_size = 4
+ * 2. truncate -s 100 file # i_size = 100
+ * After scanning, new_size is 4. Apperantly the size of 'file' should
+ * be 100. So, the calculated new_size according to data nodes should
+ * only be used for extending i_size, like ubifs_recover_size() does.
+ */
+ if (new_size > file->ino.size || corrupted_truncation)
+ file->calc_size = new_size;
+ else
+ file->calc_size = file->ino.size;
+
+drop_data:
+ while (!list_empty(&drop_list)) {
+ cond_resched();
+ data_node = list_entry(drop_list.next, struct scanned_data_node,
+ list);
+
+ list_del(&data_node->list);
+ rb_erase(&data_node->rb, &file->data_nodes);
+ kfree(data_node);
+ }
+}
+
+/**
+ * correct_file_info - correct the information of file
+ * @c: UBIFS file-system description object
+ * @file: file object
+ *
+ * This function corrects file information according to calculated fields,
+ * eg. 'calc_nlink', 'calc_xcnt', 'calc_xsz', 'calc_xnms' and 'calc_size'.
+ * Corrected inode node will be re-written.
+ */
+static int correct_file_info(struct ubifs_info *c, struct scanned_file *file)
+{
+ uint32_t crc;
+ int err, lnum, len;
+ struct ubifs_ino_node *ino;
+
+ if (file->calc_nlink == file->ino.nlink &&
+ file->calc_xcnt == file->ino.xcnt &&
+ file->calc_xsz == file->ino.xsz &&
+ file->calc_xnms == file->ino.xnms &&
+ file->calc_size == file->ino.size)
+ return 0;
+
+ lnum = file->ino.header.lnum;
+ dbg_repair("correct file(inum:%lu type:%s), nlink %u->%u, xattr cnt %u->%u, xattr size %u->%u, xattr names %u->%u, size %llu->%llu, at %d:%d, in ubi%d_%d",
+ file->inum, file->ino.is_xattr ? "xattr" :
+ ubifs_get_type_name(ubifs_get_dent_type(file->ino.mode)),
+ file->ino.nlink, file->calc_nlink,
+ file->ino.xcnt, file->calc_xcnt,
+ file->ino.xsz, file->calc_xsz,
+ file->ino.xnms, file->calc_xnms,
+ file->ino.size, file->calc_size,
+ lnum, file->ino.header.offs,
+ c->vi.ubi_num, c->vi.vol_id);
+
+ err = ubifs_leb_read(c, lnum, c->sbuf, 0, c->leb_size, 0);
+ if (err && err != -EBADMSG)
+ return err;
+
+ ino = c->sbuf + file->ino.header.offs;
+ ino->nlink = cpu_to_le32(file->calc_nlink);
+ ino->xattr_cnt = cpu_to_le32(file->calc_xcnt);
+ ino->xattr_size = cpu_to_le32(file->calc_xsz);
+ ino->xattr_names = cpu_to_le32(file->calc_xnms);
+ ino->size = cpu_to_le64(file->calc_size);
+ len = le32_to_cpu(ino->ch.len);
+ crc = crc32(UBIFS_CRC32_INIT, (void *)ino + 8, len - 8);
+ ino->ch.crc = cpu_to_le32(crc);
+
+ /* Atomically write the fixed LEB back again */
+ return ubifs_leb_change(c, lnum, c->sbuf, c->leb_size);
+}
+
+/**
+ * check_and_correct_files - check and correct information of files.
+ * @c: UBIFS file-system description object
+ *
+ * This function does similar things with dbg_check_filesystem(), besides,
+ * it also corrects file information if the calculated information is not
+ * consistent with information from flash.
+ */
+static int check_and_correct_files(struct ubifs_info *c)
+{
+ int err;
+ struct rb_node *node;
+ struct scanned_file *file;
+
+ for (node = rb_first(&c->repair->scanned_files); node;
+ node = rb_next(node)) {
+ cond_resched();
+ file = rb_entry(node, struct scanned_file, rb);
+
+ calculate_file_info(c, file);
+ }
+
+ for (node = rb_first(&c->repair->scanned_files); node;
+ node = rb_next(node)) {
+ if (fatal_signal_pending(current))
+ return -EINTR;
+ cond_resched();
+ file = rb_entry(node, struct scanned_file, rb);
+
+ err = correct_file_info(c, file);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
static int do_repair(struct ubifs_info *c)
{
int err = 0;
@@ -1485,6 +1741,10 @@ static int do_repair(struct ubifs_info *c)
ubifs_msg(c, "Step 5: Extract reachable files");
extract_dentry_tree(c);

+ /* Step 7: Check & correct files' information. */
+ ubifs_msg(c, "Step 7: Check & correct file information");
+ err = check_and_correct_files(c);
+
out:
destroy_scanned_info(c, &si);
return err;
diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h
index 449ab220db30..726161f46f3c 100644
--- a/fs/ubifs/ubifs.h
+++ b/fs/ubifs/ubifs.h
@@ -1856,6 +1856,32 @@ static inline int ubifs_get_dent_type(int mode)
return 0;
}

+/**
+ * ubifs_get_type_name - get type name according to file type in dentry node.
+ * @type: file type in dentry node
+ */
+static inline const char *ubifs_get_type_name(int type)
+{
+ switch (type) {
+ case UBIFS_ITYPE_REG:
+ return "file";
+ case UBIFS_ITYPE_DIR:
+ return "dir";
+ case UBIFS_ITYPE_LNK:
+ return "symlink";
+ case UBIFS_ITYPE_BLK:
+ return "blkdev";
+ case UBIFS_ITYPE_CHR:
+ return "char dev";
+ case UBIFS_ITYPE_FIFO:
+ return "fifo";
+ case UBIFS_ITYPE_SOCK:
+ return "socket";
+ default:
+ return "unknown/invalid type";
+ }
+}
+
/* budget.c */
int ubifs_budget_space(struct ubifs_info *c, struct ubifs_budget_req *req);
void ubifs_release_budget(struct ubifs_info *c, struct ubifs_budget_req *req);
--
2.31.1


2023-12-28 01:39:55

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 10/17] ubifs: repair: Create new root dir if there are no scanned files

This is a preparation for building TNC, there must at least one file
in filesystem, if not, just create new root dir.

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/repair.c | 187 +++++++++++++++++++++++++++++++++++++++++++++-
fs/ubifs/repair.h | 6 ++
fs/ubifs/sb.c | 24 +-----
fs/ubifs/ubifs.h | 28 +++++++
4 files changed, 220 insertions(+), 25 deletions(-)

diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
index 59b79481974a..c43150fff19d 100644
--- a/fs/ubifs/repair.c
+++ b/fs/ubifs/repair.c
@@ -50,9 +50,17 @@ static int init_repair_info(struct ubifs_info *c)
err = -ENOMEM;
goto free_used_lebs;
}
+ c->repair->write_buf = vmalloc(c->leb_size);
+ if (!c->repair->write_buf) {
+ err = -ENOMEM;
+ goto free_lpts;
+ }
+ c->repair->head_lnum = -1;

return 0;

+free_lpts:
+ kfree(c->repair->lpts);
free_used_lebs:
bitmap_free(c->repair->used_lebs);
free_repair:
@@ -62,6 +70,7 @@ static int init_repair_info(struct ubifs_info *c)

static void destroy_repair_info(struct ubifs_info *c)
{
+ vfree(c->repair->write_buf);
kfree(c->repair->lpts);
bitmap_free(c->repair->used_lebs);
kfree(c->repair);
@@ -1745,6 +1754,171 @@ static int check_and_correct_files(struct ubifs_info *c)
return 0;
}

+/**
+ * get_free_leb - get a free LEB according to @c->repair->used_lebs.
+ * @c: UBIFS file-system description object
+ *
+ * This function tries to find a free LEB, %0 is returned if found, otherwise
+ * %ENOSPC is returned.
+ */
+static int get_free_leb(struct ubifs_info *c)
+{
+ int lnum, err;
+
+ lnum = find_next_zero_bit(c->repair->used_lebs, c->main_lebs, 0);
+ if (lnum >= c->main_lebs) {
+ ubifs_err(c, "No space left.");
+ return -ENOSPC;
+ }
+ set_bit(lnum, c->repair->used_lebs);
+ lnum += c->main_first;
+
+ err = ubifs_leb_unmap(c, lnum);
+ if (err)
+ return err;
+
+ c->repair->head_lnum = lnum;
+ c->repair->head_offs = 0;
+
+ return 0;
+}
+
+/**
+ * flush_write_buf - flush write buffer.
+ * @c: UBIFS file-system description object
+ *
+ * This function flush write buffer to LEB @c->repair->head_lnum, then set
+ * @c->repair->head_lnum to '-1'.
+ */
+static int flush_write_buf(struct ubifs_info *c)
+{
+ int len, pad, err;
+
+ if (!c->repair->head_offs)
+ return 0;
+
+ len = ALIGN(c->repair->head_offs, c->min_io_size);
+ pad = len - c->repair->head_offs;
+ if (pad)
+ ubifs_pad(c, c->repair->write_buf + c->repair->head_offs, pad);
+
+ err = ubifs_leb_write(c, c->repair->head_lnum, c->repair->write_buf, 0,
+ len);
+ if (err)
+ return err;
+
+ c->repair->head_lnum = -1;
+
+ return 0;
+}
+
+/**
+ * reserve_space - reserve enough space to write data.
+ * @c: UBIFS file-system description object
+ * @len: the length of written data
+ * @lnum: the write LEB number is returned here
+ * @offs: the write pos in LEB is returned here
+ *
+ * This function finds target position <@lnum, @offs> to write data with
+ * length of @len.
+ */
+static int reserve_space(struct ubifs_info *c, int len, int *lnum, int *offs)
+{
+ int err;
+
+ if (c->repair->head_lnum == -1) {
+get_new:
+ err = get_free_leb(c);
+ if (err)
+ return err;
+ }
+
+ if (len > c->leb_size - c->repair->head_offs) {
+ err = flush_write_buf(c);
+ if (err)
+ return err;
+
+ goto get_new;
+ }
+
+ *lnum = c->repair->head_lnum;
+ *offs = c->repair->head_offs;
+ c->repair->head_offs += ALIGN(len, 8);
+
+ return 0;
+}
+
+static void copy_node_data(struct ubifs_info *c, void *node, int offs, int len)
+{
+ memcpy(c->repair->write_buf + offs, node, len);
+ memset(c->repair->write_buf + offs + len, 0xff, ALIGN(len, 8) - len);
+}
+
+/**
+ * create_root - create root dir.
+ * @c: UBIFS file-system description object
+ *
+ * This function creates root dir.
+ */
+static int create_root(struct ubifs_info *c)
+{
+ int err, lnum, offs;
+ struct ubifs_ino_node *ino;
+ struct scanned_file *file;
+
+ ino = kzalloc(ALIGN(UBIFS_INO_NODE_SZ, c->min_io_size), GFP_KERNEL);
+ if (!ino)
+ return -ENOMEM;
+
+ c->max_sqnum = 0;
+ ubifs_init_root_ino(c, ino);
+ err = ubifs_prepare_node_hmac(c, ino, UBIFS_INO_NODE_SZ, -1, 1);
+ if (err)
+ goto out;
+
+ err = reserve_space(c, UBIFS_INO_NODE_SZ, &lnum, &offs);
+ if (err)
+ goto out;
+
+ copy_node_data(c, ino, offs, UBIFS_INO_NODE_SZ);
+
+ err = flush_write_buf(c);
+ if (err)
+ goto out;
+
+ file = kzalloc(sizeof(struct scanned_file), GFP_KERNEL);
+ if (!file) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ file->inum = UBIFS_ROOT_INO;
+ file->dent_nodes = RB_ROOT;
+ file->data_nodes = RB_ROOT;
+ INIT_LIST_HEAD(&file->list);
+
+ file->ino.header.exist = true;
+ file->ino.header.lnum = lnum;
+ file->ino.header.offs = offs;
+ file->ino.header.len = UBIFS_INO_NODE_SZ;
+ file->ino.header.sqnum = le64_to_cpu(ino->ch.sqnum);
+ ino_key_init(c, &file->ino.key, UBIFS_ROOT_INO);
+ file->ino.is_xattr = le32_to_cpu(ino->flags) & UBIFS_XATTR_FL;
+ file->ino.mode = le32_to_cpu(ino->mode);
+ file->calc_nlink = file->ino.nlink = le32_to_cpu(ino->nlink);
+ file->calc_xcnt = file->ino.xcnt = le32_to_cpu(ino->xattr_cnt);
+ file->calc_xsz = file->ino.xsz = le32_to_cpu(ino->xattr_size);
+ file->calc_xnms = file->ino.xnms = le32_to_cpu(ino->xattr_names);
+ file->calc_size = file->ino.size = le64_to_cpu(ino->size);
+
+ rb_link_node(&file->rb, NULL, &c->repair->scanned_files.rb_node);
+ rb_insert_color(&file->rb, &c->repair->scanned_files);
+
+out:
+ kfree(ino);
+ return err;
+}
+
static char *get_file_name(struct ubifs_info *c, struct scanned_file *file)
{
static char name[UBIFS_MAX_NLEN + 1];
@@ -1786,9 +1960,10 @@ static void parse_node_location(struct ubifs_info *c, struct scanned_node *sn)
*
* This function traverses all nodes from valid files and does following
* things:
- * 1. Record all used LEBs which may hold useful nodes, then left unused
+ * 1. If there are no scanned files, create default empty filesystem.
+ * 2. Record all used LEBs which may hold useful nodes, then left unused
* LEBs could be taken for storing new index tree.
- * 2. Re-write data to prevent failed gc scanning in the subsequent mounting
+ * 3. Re-write data to prevent failed gc scanning in the subsequent mounting
* process caused by corrupted data.
*/
static int traverse_files_and_nodes(struct ubifs_info *c)
@@ -1799,6 +1974,14 @@ static int traverse_files_and_nodes(struct ubifs_info *c)
struct scanned_dent_node *dent_node;
struct scanned_data_node *data_node;

+ if (rb_first(&c->repair->scanned_files) == NULL) {
+ /* No scanned files. Create root dir. */
+ ubifs_msg(c, "No scanned files, create empty filesystem");
+ err = create_root(c);
+ if (err)
+ return err;
+ }
+
ubifs_msg(c, "Step 8: Record used LEBs");
for (node = rb_first(&c->repair->scanned_files); node;
node = rb_next(node)) {
diff --git a/fs/ubifs/repair.h b/fs/ubifs/repair.h
index 2ab885fefee0..e3b879b3f999 100644
--- a/fs/ubifs/repair.h
+++ b/fs/ubifs/repair.h
@@ -165,11 +165,17 @@ struct lprops {
* @usen_lebs: a bitmap used for recording used lebs
* @lpts: lprops table
* @scanned_files: tree of all scanned files
+ * @write_buf: write buffer for LEB @head_lnum
+ * @head_lnum: current writing LEB number
+ * @head_offs: current writing position in LEB @head_lnum
*/
struct ubifs_repair_info {
unsigned long *used_lebs;
struct lprops *lpts;
struct rb_root scanned_files;
+ void *write_buf;
+ int head_lnum;
+ int head_offs;
};

#endif /* !__UBIFS_REPAIR_H__ */
diff --git a/fs/ubifs/sb.c b/fs/ubifs/sb.c
index e7693b94e5b5..d46798fd548b 100644
--- a/fs/ubifs/sb.c
+++ b/fs/ubifs/sb.c
@@ -86,8 +86,6 @@ static int create_default_filesystem(struct ubifs_info *c)
int min_leb_cnt = UBIFS_MIN_LEB_CNT;
int idx_node_size;
long long tmp64, main_bytes;
- __le64 tmp_le64;
- struct timespec64 ts;
u8 hash[UBIFS_HASH_ARR_SZ];
u8 hash_lpt[UBIFS_HASH_ARR_SZ];

@@ -287,27 +285,7 @@ static int create_default_filesystem(struct ubifs_info *c)
dbg_gen("default root indexing node created LEB %d:0",
main_first + DEFAULT_IDX_LEB);

- /* Create default root inode */
-
- ino_key_init_flash(c, &ino->key, UBIFS_ROOT_INO);
- ino->ch.node_type = UBIFS_INO_NODE;
- ino->creat_sqnum = cpu_to_le64(++c->max_sqnum);
- ino->nlink = cpu_to_le32(2);
-
- ktime_get_coarse_real_ts64(&ts);
- tmp_le64 = cpu_to_le64(ts.tv_sec);
- ino->atime_sec = tmp_le64;
- ino->ctime_sec = tmp_le64;
- ino->mtime_sec = tmp_le64;
- ino->atime_nsec = 0;
- ino->ctime_nsec = 0;
- ino->mtime_nsec = 0;
- ino->mode = cpu_to_le32(S_IFDIR | S_IRUGO | S_IWUSR | S_IXUGO);
- ino->size = cpu_to_le64(UBIFS_INO_NODE_SZ);
-
- /* Set compression enabled by default */
- ino->flags = cpu_to_le32(UBIFS_COMPR_FL);
-
+ ubifs_init_root_ino(c, ino);
dbg_gen("root inode created at LEB %d:0",
main_first + DEFAULT_DATA_LEB);

diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h
index 726161f46f3c..7d66440623cb 100644
--- a/fs/ubifs/ubifs.h
+++ b/fs/ubifs/ubifs.h
@@ -2207,6 +2207,34 @@ int ubifs_decrypt(const struct inode *inode, struct ubifs_data_node *dn,

extern const struct fscrypt_operations ubifs_crypt_operations;

+static inline void ubifs_init_root_ino(struct ubifs_info *c,
+ struct ubifs_ino_node *ino)
+{
+ __le64 tmp_le64;
+ struct timespec64 ts;
+
+ /* Create default root inode */
+
+ ino_key_init_flash(c, &ino->key, UBIFS_ROOT_INO);
+ ino->ch.node_type = UBIFS_INO_NODE;
+ ino->creat_sqnum = cpu_to_le64(++c->max_sqnum);
+ ino->nlink = cpu_to_le32(2);
+
+ ktime_get_coarse_real_ts64(&ts);
+ tmp_le64 = cpu_to_le64(ts.tv_sec);
+ ino->atime_sec = tmp_le64;
+ ino->ctime_sec = tmp_le64;
+ ino->mtime_sec = tmp_le64;
+ ino->atime_nsec = 0;
+ ino->ctime_nsec = 0;
+ ino->mtime_nsec = 0;
+ ino->mode = cpu_to_le32(S_IFDIR | S_IRUGO | S_IWUSR | S_IXUGO);
+ ino->size = cpu_to_le64(UBIFS_INO_NODE_SZ);
+
+ /* Set compression enabled by default */
+ ino->flags = cpu_to_le32(UBIFS_COMPR_FL);
+}
+
/* Normal UBIFS messages */
__printf(2, 3)
void ubifs_msg(const struct ubifs_info *c, const char *fmt, ...);
--
2.31.1


2023-12-28 01:39:59

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 08/17] ubifs: repair: Record used LEBs

This is the 8/13 step of repairing. Record used LEBs which may hold useful
nodes, then left unused LEBs could be taken for storing new index tree.
Notice, LEB that contains effective nodes on deleted trees in step 2 is
regarded as used.

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/repair.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++
fs/ubifs/repair.h | 2 +
2 files changed, 96 insertions(+)

diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
index cb70a14e2a54..7d97de6219fa 100644
--- a/fs/ubifs/repair.c
+++ b/fs/ubifs/repair.c
@@ -37,12 +37,18 @@ static int init_repair_info(struct ubifs_info *c)
return -ENOMEM;

c->repair->scanned_files = RB_ROOT;
+ c->repair->used_lebs = bitmap_zalloc(c->main_lebs, GFP_KERNEL);
+ if (!c->repair->used_lebs) {
+ kfree(c->repair);
+ return -ENOMEM;
+ }

return 0;
}

static void destroy_repair_info(struct ubifs_info *c)
{
+ bitmap_free(c->repair->used_lebs);
kfree(c->repair);
}

@@ -1020,6 +1026,9 @@ static void remove_del_nodes(struct ubifs_info *c, struct scanned_info *si)

valid_ino_node = lookup_valid_ino_node(c, si, del_ino_node);
if (valid_ino_node) {
+ int lnum = del_ino_node->header.lnum;
+
+ set_bit(lnum - c->main_first, c->repair->used_lebs);
rb_erase(&valid_ino_node->rb, &si->valid_inos);
kfree(valid_ino_node);
}
@@ -1036,6 +1045,9 @@ static void remove_del_nodes(struct ubifs_info *c, struct scanned_info *si)

valid_dent_node = lookup_valid_dent_node(c, si, del_dent_node);
if (valid_dent_node) {
+ int lnum = del_dent_node->header.lnum;
+
+ set_bit(lnum - c->main_first, c->repair->used_lebs);
rb_erase(&valid_dent_node->rb, &si->valid_dents);
kfree(valid_dent_node);
}
@@ -1710,6 +1722,82 @@ static int check_and_correct_files(struct ubifs_info *c)
return 0;
}

+static char *get_file_name(struct ubifs_info *c, struct scanned_file *file)
+{
+ static char name[UBIFS_MAX_NLEN + 1];
+ struct rb_node *node;
+ struct scanned_dent_node *dent_node;
+
+ node = rb_first(&file->dent_nodes);
+ if (!node) {
+ ubifs_assert(c, file->inum == UBIFS_ROOT_INO);
+ return "/";
+ }
+
+ if (c->encrypted && !file->ino.is_xattr)
+ /* Encrypted file name. */
+ return "<encrypted>";
+
+ /* Get name from any one dentry. */
+ dent_node = rb_entry(node, struct scanned_dent_node, rb);
+ memcpy(name, dent_node->name, dent_node->nlen);
+ /* @dent->name could be non '\0' terminated. */
+ name[dent_node->nlen] = '\0';
+ return name;
+}
+
+/**
+ * record_used_lebs - record used LEBs.
+ * @c: UBIFS file-system description object
+ *
+ * This function records all used LEBs which may hold useful nodes, then left
+ * unused LEBs could be taken for storing new index tree.
+ */
+static void record_used_lebs(struct ubifs_info *c)
+{
+ int lnum;
+ struct rb_node *node, *n;
+ struct scanned_file *file;
+ struct scanned_dent_node *dent_node;
+ struct scanned_data_node *data_node;
+
+ for (node = rb_first(&c->repair->scanned_files); node;
+ node = rb_next(node)) {
+ cond_resched();
+ file = rb_entry(node, struct scanned_file, rb);
+
+ dbg_repair("recovered file(inum:%lu name:%s type:%s), in ubi%d_%d",
+ file->inum, get_file_name(c, file),
+ file->ino.is_xattr ? "xattr" :
+ ubifs_get_type_name(ubifs_get_dent_type(file->ino.mode)),
+ c->vi.ubi_num, c->vi.vol_id);
+
+ lnum = file->ino.header.lnum;
+ set_bit(lnum - c->main_first, c->repair->used_lebs);
+
+ if (file->trun.header.exist) {
+ lnum = file->trun.header.lnum;
+ set_bit(lnum - c->main_first, c->repair->used_lebs);
+ }
+
+ for (n = rb_first(&file->data_nodes); n; n = rb_next(n)) {
+ cond_resched();
+ data_node = rb_entry(n, struct scanned_data_node, rb);
+
+ lnum = data_node->header.lnum;
+ set_bit(lnum - c->main_first, c->repair->used_lebs);
+ }
+
+ for (n = rb_first(&file->dent_nodes); n; n = rb_next(n)) {
+ cond_resched();
+ dent_node = rb_entry(n, struct scanned_dent_node, rb);
+
+ lnum = dent_node->header.lnum;
+ set_bit(lnum - c->main_first, c->repair->used_lebs);
+ }
+ }
+}
+
static int do_repair(struct ubifs_info *c)
{
int err = 0;
@@ -1744,6 +1832,12 @@ static int do_repair(struct ubifs_info *c)
/* Step 7: Check & correct files' information. */
ubifs_msg(c, "Step 7: Check & correct file information");
err = check_and_correct_files(c);
+ if (err)
+ goto out;
+
+ /* Step 8: Record used LEBs. */
+ ubifs_msg(c, "Step 8: Record used LEBs");
+ record_used_lebs(c);

out:
destroy_scanned_info(c, &si);
diff --git a/fs/ubifs/repair.h b/fs/ubifs/repair.h
index 05bfe9a2189e..fecf437ff0f7 100644
--- a/fs/ubifs/repair.h
+++ b/fs/ubifs/repair.h
@@ -153,9 +153,11 @@ struct scanned_file {

/**
* ubifs_repair_info - per-FS repairing information.
+ * @usen_lebs: a bitmap used for recording used lebs
* @scanned_files: tree of all scanned files
*/
struct ubifs_repair_info {
+ unsigned long *used_lebs;
struct rb_root scanned_files;
};

--
2.31.1


2023-12-28 01:40:22

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 02/17] ubifs: repair: Scan nodes from volume

This is the 2/13 step of repairing. Collect files, valid inode nodes,
deleted inode nodes, valid dentry nodes and deleted dentry nodes in
kinds of trees by scanning nodes from flash. Corrupted nodes(eg.
incorrect crc, bad inode size, bad dentry name length, etc.) are
dropped during scanning.

The final recovered file(xattr is treated as a file) is organized as:
(rbtree, inum indexed)
/ \
file1 file2
/ \
file3 file4

file {
inode node // each file has 1 inode node
dentry (sub rb_tree, sqnum indexed) // '/' has no dentries,
// otherwise at least 1
// dentry is required.
trun node // the newest one truncation node
data (sub rb_tree, block number indexed) // Each file may have 0
// many data nodes
}

Each file from file rb_tree is constructed by scanning nodes(eg. inode,
dentry, etc.) from UBI volume. In this step, trun node and data nodes
are put into certain file, inode/dentry nodes are put into four trees:
valid_inos(nlink != 0), del_inos(nlink is 0), valid_dents(inum != 0),
del_dents(inum is 0). Step 3 will process above four trees to deal
deletion situations.

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/debug.h | 2 +
fs/ubifs/repair.c | 941 ++++++++++++++++++++++++++++++++++++++++++++++
fs/ubifs/repair.h | 158 ++++++++
fs/ubifs/ubifs.h | 10 +-
4 files changed, 1109 insertions(+), 2 deletions(-)
create mode 100644 fs/ubifs/repair.h

diff --git a/fs/ubifs/debug.h b/fs/ubifs/debug.h
index ed966108da80..6d35c165b8f5 100644
--- a/fs/ubifs/debug.h
+++ b/fs/ubifs/debug.h
@@ -198,6 +198,8 @@ void ubifs_assert_failed(struct ubifs_info *c, const char *expr,
#define dbg_scan(fmt, ...) ubifs_dbg_msg("scan", fmt, ##__VA_ARGS__)
/* Additional recovery messages */
#define dbg_rcvry(fmt, ...) ubifs_dbg_msg("rcvry", fmt, ##__VA_ARGS__)
+/* Additional repairing messages */
+#define dbg_repair(fmt, ...) ubifs_dbg_msg("repair", fmt, ##__VA_ARGS__)

extern struct ubifs_global_debug_info ubifs_dbg;

diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
index b722187de74f..18920e9896bc 100644
--- a/fs/ubifs/repair.c
+++ b/fs/ubifs/repair.c
@@ -13,6 +13,937 @@

#include <linux/module.h>
#include "ubifs.h"
+#include "repair.h"
+
+/**
+ * scanned_info - nodes and files information from scanning.
+ * @valid_inos: the tree of scanned inode nodes with 'nlink > 0'
+ * @del_inos: the tree of scanned inode nodes with 'nlink = 0'
+ * @valid_dents: the tree of scanned dentry nodes with 'inum > 0'
+ * @del_dents: the tree of scanned dentry nodes with 'inum = 0'
+ */
+struct scanned_info {
+ struct rb_root valid_inos;
+ struct rb_root del_inos;
+ struct rb_root valid_dents;
+ struct rb_root del_dents;
+};
+
+static int init_repair_info(struct ubifs_info *c)
+{
+ c->repair = kzalloc(sizeof(struct ubifs_repair_info), GFP_KERNEL);
+ if (!c->repair)
+ return -ENOMEM;
+
+ c->repair->scanned_files = RB_ROOT;
+
+ return 0;
+}
+
+static void destroy_repair_info(struct ubifs_info *c)
+{
+ kfree(c->repair);
+}
+
+static void parse_node_header(struct ubifs_info *c, int lnum,
+ struct ubifs_scan_node *snod,
+ struct scanned_node *header)
+{
+ header->exist = true;
+ header->lnum = lnum;
+ header->offs = snod->offs;
+ header->len = snod->len;
+ header->sqnum = snod->sqnum;
+}
+
+/**
+ * insert_or_update_ino_node - insert or update inode node.
+ * @c: UBIFS file-system description object
+ * @new_ino: new inode node
+ * @tree: a tree to record valid/deleted inode node info
+ *
+ * This function inserts @new_ino into the @tree, or updates inode node
+ * if it already exists in the tree. Returns zero in case of success, a
+ * negative error code in case of failure.
+ */
+static int insert_or_update_ino_node(struct ubifs_info *c,
+ struct scanned_ino_node *new_ino,
+ struct rb_root *tree)
+{
+ int cmp;
+ struct scanned_ino_node *ino_node, *old_ino_node = NULL;
+ struct rb_node **p, *parent = NULL;
+
+ p = &tree->rb_node;
+ while (*p) {
+ parent = *p;
+ ino_node = rb_entry(parent, struct scanned_ino_node, rb);
+ cmp = keys_cmp(c, &new_ino->key, &ino_node->key);
+ if (cmp < 0) {
+ p = &(*p)->rb_left;
+ } else if (cmp > 0) {
+ p = &(*p)->rb_right;
+ } else {
+ old_ino_node = ino_node;
+ break;
+ }
+ }
+ if (old_ino_node) {
+ if (old_ino_node->header.sqnum < new_ino->header.sqnum) {
+ size_t len = offsetof(struct scanned_ino_node, rb);
+
+ memcpy(old_ino_node, new_ino, len);
+ }
+ return 0;
+ }
+
+ ino_node = kmalloc(sizeof(struct scanned_ino_node), GFP_KERNEL);
+ if (!ino_node)
+ return -ENOMEM;
+
+ *ino_node = *new_ino;
+ rb_link_node(&ino_node->rb, parent, p);
+ rb_insert_color(&ino_node->rb, tree);
+
+ return 0;
+}
+
+static inline bool inode_can_be_encrypted(struct ubifs_info *c,
+ struct scanned_ino_node *ino_node)
+{
+ if (!c->encrypted)
+ return false;
+
+ if (ino_node->is_xattr)
+ return false;
+
+ /* Only regular files, directories, and symlinks can be encrypted. */
+ if (S_ISREG(ino_node->mode) || S_ISDIR(ino_node->mode) ||
+ S_ISLNK(ino_node->mode))
+ return true;
+
+ return false;
+}
+
+/**
+ * parse_and_record_ino_node - parse inode node and record it in tree.
+ * @c: UBIFS file-system description object
+ * @lnum: logical eraseblock number
+ * @snod: scanned node
+ * @si: records nodes and files information during scanning
+ *
+ * This function parses, checks and records raw inode information. Returns
+ * zero in case of success, a negative error code in case of failure.
+ */
+static int parse_and_record_ino_node(struct ubifs_info *c, int lnum,
+ struct ubifs_scan_node *snod,
+ struct scanned_info *si)
+{
+ int data_len;
+ unsigned int flags;
+ struct ubifs_ino_node *ino = snod->node;
+ struct scanned_ino_node ino_node;
+ ino_t inum = key_inum(c, &snod->key);
+
+ if (!inum || inum > INUM_WATERMARK) {
+ dbg_repair("bad inode node(bad inum %lu) at %d:%d, in ubi%d_%d",
+ inum, lnum, snod->offs, c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+
+ if (snod->type != key_type(c, &snod->key)) {
+ dbg_repair("bad inode node %lu(inconsistent type %d vs key_type %d) at %d:%d, in ubi%d_%d",
+ inum, snod->type, key_type(c, &snod->key), lnum,
+ snod->offs, c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+
+ key_copy(c, &snod->key, &ino_node.key);
+ flags = le32_to_cpu(ino->flags);
+ data_len = le32_to_cpu(ino->data_len);
+ ino_node.is_xattr = !!(flags & UBIFS_XATTR_FL) ? 1 : 0;
+ ino_node.is_encrypted = !!(flags & UBIFS_CRYPT_FL) ? 1 : 0;
+ ino_node.mode = le32_to_cpu(ino->mode);
+ ino_node.nlink = le32_to_cpu(ino->nlink);
+ ino_node.xcnt = le32_to_cpu(ino->xattr_cnt);
+ ino_node.xsz = le32_to_cpu(ino->xattr_size);
+ ino_node.xnms = le32_to_cpu(ino->xattr_names);
+ ino_node.size = le64_to_cpu(ino->size);
+
+ if (ino_node.size > c->max_inode_sz) {
+ dbg_repair("bad inode node %lu(size %llu is too large) at %d:%d, in ubi%d_%d",
+ inum, ino_node.size, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+
+ if (le16_to_cpu(ino->compr_type) >= UBIFS_COMPR_TYPES_CNT) {
+ dbg_repair("bad inode node %lu(unknown compression type %d) at %d:%d, in ubi%d_%d",
+ inum, le16_to_cpu(ino->compr_type), lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+
+ if (ino_node.xnms + ino_node.xcnt > XATTR_LIST_MAX) {
+ dbg_repair("bad inode node %lu(too big xnames %u xcount %u) at %d:%d, in ubi%d_%d",
+ inum, ino_node.xnms, ino_node.xcnt, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+
+ if (data_len < 0 || data_len > UBIFS_MAX_INO_DATA) {
+ dbg_repair("bad inode node %lu(invalid data len %d) at %d:%d, in ubi%d_%d",
+ inum, data_len, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+
+ if (ino_node.is_xattr && !S_ISREG(ino_node.mode)) {
+ dbg_repair("bad inode node %lu(bad type %u for xattr) at %d:%d, in ubi%d_%d",
+ inum, ino_node.mode & S_IFMT, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+
+ switch (ino_node.mode & S_IFMT) {
+ case S_IFREG:
+ if (!ino_node.is_xattr && data_len != 0) {
+ dbg_repair("bad inode node %lu(bad data len %d for reg file) at %d:%d, in ubi%d_%d",
+ inum, data_len, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+ break;
+ case S_IFDIR:
+ if (data_len != 0) {
+ dbg_repair("bad inode node %lu(bad data len %d for dir file) at %d:%d, in ubi%d_%d",
+ inum, data_len, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+ break;
+ case S_IFLNK:
+ if (data_len == 0) {
+ /*
+ * For encryption enabled or selinux enabled situation,
+ * uninitialized inode with xattrs could be written
+ * before ubifs_jnl_update(). If the dent node is
+ * written successfully but the initialized inode is
+ * not written, ubifs_iget() will get bad symlink inode
+ * with 'ui->data_len = 0'. Similar phenomenon can also
+ * occur for block/char dev creation.
+ * Just drop the inode node when above class of
+ * exceptions are found.
+ */
+ dbg_repair("bad symlink inode node %lu(bad data len %d) at %d:%d, in ubi%d_%d",
+ inum, data_len, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+ break;
+ case S_IFBLK:
+ fallthrough;
+ case S_IFCHR:
+ {
+ union ubifs_dev_desc *dev = (union ubifs_dev_desc *)ino->data;
+ int sz_new = sizeof(dev->new), sz_huge = sizeof(dev->huge);
+
+ if (data_len != sz_new && data_len != sz_huge) {
+ dbg_repair("bad inode node %lu(bad data len %d for char/block file, expect %d or %d) at %d:%d, in ubi%d_%d",
+ inum, data_len, sz_new, sz_huge, lnum,
+ snod->offs, c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+ break;
+ }
+ case S_IFSOCK:
+ fallthrough;
+ case S_IFIFO:
+ if (data_len != 0) {
+ dbg_repair("bad inode node %lu(bad data len %d for fifo/sock file) at %d:%d, in ubi%d_%d",
+ inum, data_len, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+ break;
+ default:
+ /* invalid file type. */
+ dbg_repair("bad inode node %lu(unknown type %u) at %d:%d, in ubi%d_%d",
+ inum, ino_node.mode & S_IFMT, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+
+ if (ino_node.is_encrypted && !inode_can_be_encrypted(c, &ino_node)) {
+ dbg_repair("bad inode node %lu(encrypted but cannot be encrypted, type %u, is_xattr %d, fs_encrypted %d) at %d:%d, in ubi%d_%d",
+ inum, ino_node.mode & S_IFMT, ino_node.is_xattr,
+ c->encrypted, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+
+ parse_node_header(c, lnum, snod, &ino_node.header);
+
+ if (ino_node.nlink)
+ return insert_or_update_ino_node(c, &ino_node, &si->valid_inos);
+ else
+ return insert_or_update_ino_node(c, &ino_node, &si->del_inos);
+
+out:
+ return 0;
+}
+
+/**
+ * insert_or_update_dent_node - insert or update dentry node.
+ * @c: UBIFS file-system description object
+ * @new_dent: new dentry node
+ * @tree: a tree to record valid/deleted dentry node info
+ *
+ * This function inserts @new_dent into the @tree, or updates dent node
+ * if it already exists in the tree. Returns zero in case of success, a
+ * negative error code in case of failure.
+ */
+static int insert_or_update_dent_node(struct ubifs_info *c,
+ struct scanned_dent_node *new_dent,
+ struct rb_root *tree)
+{
+ int cmp, nlen;
+ struct scanned_dent_node *dent_node, *old_dent_node = NULL;
+ struct rb_node **p, *parent = NULL;
+
+ p = &tree->rb_node;
+ while (*p) {
+ parent = *p;
+ dent_node = rb_entry(parent, struct scanned_dent_node, rb);
+ cmp = keys_cmp(c, &new_dent->key, &dent_node->key);
+ if (cmp < 0) {
+ p = &(*p)->rb_left;
+ } else if (cmp > 0) {
+ p = &(*p)->rb_right;
+ } else {
+ nlen = min(new_dent->nlen, dent_node->nlen);
+ cmp = strncmp(new_dent->name, dent_node->name, nlen) ? :
+ new_dent->nlen - dent_node->nlen;
+ if (cmp < 0) {
+ p = &(*p)->rb_left;
+ } else if (cmp > 0) {
+ p = &(*p)->rb_right;
+ } else {
+ old_dent_node = dent_node;
+ break;
+ }
+ }
+ }
+ if (old_dent_node) {
+ if (old_dent_node->header.sqnum < new_dent->header.sqnum) {
+ size_t len = offsetof(struct scanned_dent_node, rb);
+
+ memcpy(old_dent_node, new_dent, len);
+ }
+ return 0;
+ }
+
+ dent_node = kmalloc(sizeof(struct scanned_dent_node), GFP_KERNEL);
+ if (!dent_node)
+ return -ENOMEM;
+
+ *dent_node = *new_dent;
+ rb_link_node(&dent_node->rb, parent, p);
+ rb_insert_color(&dent_node->rb, tree);
+
+ return 0;
+}
+
+/**
+ * parse_and_record_dent_node - parse dentry node and record it in tree.
+ * @c: UBIFS file-system description object
+ * @lnum: logical eraseblock number
+ * @snod: scanned node
+ * @si: records nodes and files information during scanning
+ *
+ * This function parses, checks and records raw dentry/(xattr entry)
+ * information. Returns zero in case of success, a negative error code
+ * in case of failure.
+ */
+static int parse_and_record_dent_node(struct ubifs_info *c, int lnum,
+ struct ubifs_scan_node *snod,
+ struct scanned_info *si)
+{
+ struct scanned_dent_node dent_node;
+ struct ubifs_dent_node *dent = snod->node;
+ unsigned int nlen = le32_to_cpu(dent->nlen);
+ int key_type = key_type_flash(c, dent->key);
+ ino_t inum;
+
+ inum = le64_to_cpu(dent->inum);
+
+ if (snod->len != nlen + UBIFS_DENT_NODE_SZ + 1 ||
+ dent->type >= UBIFS_ITYPES_CNT ||
+ nlen > UBIFS_MAX_NLEN || dent->name[nlen] != 0 ||
+ (key_type == UBIFS_XENT_KEY && strnlen(dent->name, nlen) != nlen) ||
+ inum > INUM_WATERMARK || key_type != snod->type) {
+ dbg_repair("bad %s node at %d:%d, in ubi%d_%d",
+ snod->type == UBIFS_XENT_NODE ? "xattr entry" : "directory entry",
+ lnum, snod->offs, c->vi.ubi_num, c->vi.vol_id);
+ goto out;
+ }
+
+ key_copy(c, &snod->key, &dent_node.key);
+ dent_node.can_be_found = false;
+ dent_node.type = dent->type;
+ dent_node.nlen = nlen;
+ memcpy(dent_node.name, dent->name, nlen);
+ dent_node.inum = inum;
+
+ parse_node_header(c, lnum, snod, &dent_node.header);
+
+ if (inum)
+ return insert_or_update_dent_node(c, &dent_node,
+ &si->valid_dents);
+ else
+ return insert_or_update_dent_node(c, &dent_node,
+ &si->del_dents);
+
+out:
+ return 0;
+}
+
+/**
+ * parse_data_node - parse data node.
+ * @c: UBIFS file-system description object
+ * @lnum: logical eraseblock number
+ * @snod: scanned node
+ * @data_node: the data node to store parsed information
+ * @inum: parsed inode number
+ *
+ * This function parses and checks raw data node information. Returns zero
+ * in case of success, 1% if the node is invalid.
+ */
+static int parse_data_node(struct ubifs_info *c, int lnum,
+ struct ubifs_scan_node *snod,
+ struct scanned_data_node *data_node,
+ ino_t *inum)
+{
+ struct ubifs_data_node *dn = snod->node;
+
+ if (snod->type != key_type(c, &snod->key)) {
+ dbg_repair("bad data node(inconsistent type %d vs key_type %d) at %d:%d, in ubi%d_%d",
+ snod->type, key_type(c, &snod->key), lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ return 1;
+ }
+
+ *inum = key_inum(c, &snod->key);
+ if (!*inum || *inum > INUM_WATERMARK) {
+ dbg_repair("bad data node(bad inum %lu) at %d:%d, in ubi%d_%d",
+ *inum, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ return 1;
+ }
+
+ key_copy(c, &snod->key, &data_node->key);
+ data_node->size = le32_to_cpu(dn->size);
+
+ if (!data_node->size || data_node->size > UBIFS_BLOCK_SIZE) {
+ dbg_repair("bad data node(invalid size %u) at %d:%d, in ubi%d_%d",
+ data_node->size, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ return 1;
+ }
+
+ if (le16_to_cpu(dn->compr_type) >= UBIFS_COMPR_TYPES_CNT) {
+ dbg_repair("bad data node(invalid compression type %d) at %d:%d, in ubi%d_%d",
+ le16_to_cpu(dn->compr_type), lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ return 1;
+ }
+
+ parse_node_header(c, lnum, snod, &data_node->header);
+
+ return 0;
+}
+
+/**
+ * parse_trun_node - parse truncation node.
+ * @c: UBIFS file-system description object
+ * @lnum: logical eraseblock number
+ * @snod: scanned node
+ * @trun_info: the truncation node to store parsed information
+ * @inum: parsed inode number
+ *
+ * This function parses and checks raw truncation node information. Returns
+ * zero in case of success, 1% if the node is invalid.
+ */
+static int parse_trun_node(struct ubifs_info *c, int lnum,
+ struct ubifs_scan_node *snod,
+ struct scanned_trun_node *trun_info,
+ ino_t *inum)
+{
+ struct ubifs_trun_node *trun = snod->node;
+ loff_t old_size = le64_to_cpu(trun->old_size);
+ loff_t new_size = le64_to_cpu(trun->new_size);
+
+ *inum = le32_to_cpu(trun->inum);
+ if (!*inum || *inum > INUM_WATERMARK) {
+ dbg_repair("bad truncation node(bad inum %lu) at %d:%d, in ubi%d_%d",
+ *inum, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ return 1;
+ }
+
+ trun_info->new_size = new_size;
+
+ if (old_size < 0 || old_size > c->max_inode_sz ||
+ new_size < 0 || new_size > c->max_inode_sz ||
+ old_size <= new_size) {
+ dbg_repair("bad truncation node at %d:%d, in ubi%d_%d",
+ lnum, snod->offs, c->vi.ubi_num, c->vi.vol_id);
+ return 1;
+ }
+
+ trun_key_init(c, &snod->key, *inum);
+ parse_node_header(c, lnum, snod, &trun_info->header);
+
+ return 0;
+}
+
+/**
+ * insert_file_dentry - insert dentry according to scanned dent node.
+ * @file: file object
+ * @n_dent: scanned dent node
+ *
+ * Insert file dentry information. Returns zero in case of success, a
+ * negative error code in case of failure.
+ */
+static int insert_file_dentry(struct scanned_file *file,
+ struct scanned_dent_node *n_dent)
+{
+ struct scanned_dent_node *dent;
+ struct rb_node **p, *parent = NULL;
+
+ p = &file->dent_nodes.rb_node;
+ while (*p) {
+ parent = *p;
+ dent = rb_entry(parent, struct scanned_dent_node, rb);
+ if (n_dent->header.sqnum < dent->header.sqnum)
+ p = &(*p)->rb_left;
+ else
+ p = &(*p)->rb_right;
+ }
+
+ dent = kmalloc(sizeof(struct scanned_dent_node), GFP_KERNEL);
+ if (!dent)
+ return -ENOMEM;
+
+ *dent = *n_dent;
+ rb_link_node(&dent->rb, parent, p);
+ rb_insert_color(&dent->rb, &file->dent_nodes);
+
+ return 0;
+}
+
+/**
+ * update_file_data - insert/update data according to scanned data node.
+ * @c: UBIFS file-system description object
+ * @file: file object
+ * @n_dn: scanned data node
+ *
+ * Insert or update file data information. Returns zero in case of success,
+ * a negative error code in case of failure.
+ */
+static int update_file_data(struct ubifs_info *c, struct scanned_file *file,
+ struct scanned_data_node *n_dn)
+{
+ int cmp;
+ struct scanned_data_node *dn, *o_dn = NULL;
+ struct rb_node **p, *parent = NULL;
+
+ p = &file->data_nodes.rb_node;
+ while (*p) {
+ parent = *p;
+ dn = rb_entry(parent, struct scanned_data_node, rb);
+ cmp = keys_cmp(c, &n_dn->key, &dn->key);
+ if (cmp < 0) {
+ p = &(*p)->rb_left;
+ } else if (cmp > 0) {
+ p = &(*p)->rb_right;
+ } else {
+ o_dn = dn;
+ break;
+ }
+ }
+
+ if (o_dn) {
+ /* found data node with same block no. */
+ if (o_dn->header.sqnum < n_dn->header.sqnum) {
+ o_dn->header = n_dn->header;
+ o_dn->size = n_dn->size;
+ }
+
+ return 0;
+ }
+
+ dn = kmalloc(sizeof(struct scanned_data_node), GFP_KERNEL);
+ if (!dn)
+ return -ENOMEM;
+
+ *dn = *n_dn;
+ INIT_LIST_HEAD(&dn->list);
+ rb_link_node(&dn->rb, parent, p);
+ rb_insert_color(&dn->rb, &file->data_nodes);
+
+ return 0;
+}
+
+/**
+ * update_file - update file information.
+ * @c: UBIFS file-system description object
+ * @file: file object
+ * @sn: scanned node
+ * @key_type: type of @sn
+ *
+ * Update inode/dent/truncation/data node information of @file. Returns
+ * zero in case of success, a negative error code in case of failure.
+ */
+static int update_file(struct ubifs_info *c, struct scanned_file *file,
+ struct scanned_node *sn, int key_type)
+{
+ int err = 0;
+
+ switch (key_type) {
+ case UBIFS_INO_KEY:
+ {
+ struct scanned_ino_node *o_ino, *n_ino;
+
+ o_ino = &file->ino;
+ n_ino = (struct scanned_ino_node *)sn;
+ if (o_ino->header.exist && o_ino->header.sqnum > sn->sqnum)
+ goto out;
+
+ *o_ino = *n_ino;
+ break;
+ }
+ case UBIFS_DENT_KEY:
+ case UBIFS_XENT_KEY:
+ {
+ struct scanned_dent_node *dent = (struct scanned_dent_node *)sn;
+
+ err = insert_file_dentry(file, dent);
+ break;
+ }
+ case UBIFS_DATA_KEY:
+ {
+ struct scanned_data_node *dn = (struct scanned_data_node *)sn;
+
+ err = update_file_data(c, file, dn);
+ break;
+ }
+ case UBIFS_TRUN_KEY:
+ {
+ struct scanned_trun_node *o_trun, *n_trun;
+
+ o_trun = &file->trun;
+ n_trun = (struct scanned_trun_node *)sn;
+ if (o_trun->header.exist && o_trun->header.sqnum > sn->sqnum)
+ goto out;
+
+ *o_trun = *n_trun;
+ break;
+ }
+ default:
+ err = -EINVAL;
+ ubifs_err(c, "unknown key type %d", key_type);
+ }
+
+out:
+ return err;
+}
+
+/**
+ * insert_or_update_file - insert or update file according to scanned node.
+ * @c: UBIFS file-system description object
+ * @sn: scanned node
+ * @key_type: type of @sn
+ * @inum: inode number
+ *
+ * According to @sn, this function inserts file into the tree, or updates
+ * file information if it already exists in the tree. Returns zero in case
+ * of success, a negative error code in case of failure.
+ */
+static int insert_or_update_file(struct ubifs_info *c, struct scanned_node *sn,
+ int key_type, ino_t inum)
+{
+ int err;
+ struct scanned_file *file, *old_file = NULL;
+ struct rb_node **p, *parent = NULL;
+
+ p = &c->repair->scanned_files.rb_node;
+ while (*p) {
+ parent = *p;
+ file = rb_entry(parent, struct scanned_file, rb);
+ if (inum < file->inum) {
+ p = &(*p)->rb_left;
+ } else if (inum > file->inum) {
+ p = &(*p)->rb_right;
+ } else {
+ old_file = file;
+ break;
+ }
+ }
+ if (old_file)
+ return update_file(c, old_file, sn, key_type);
+
+ file = kzalloc(sizeof(struct scanned_file), GFP_KERNEL);
+ if (!file)
+ return -ENOMEM;
+
+ file->inum = inum;
+ file->dent_nodes = RB_ROOT;
+ file->data_nodes = RB_ROOT;
+ INIT_LIST_HEAD(&file->list);
+ err = update_file(c, file, sn, key_type);
+ if (err) {
+ kfree(file);
+ return err;
+ }
+ rb_link_node(&file->rb, parent, p);
+ rb_insert_color(&file->rb, &c->repair->scanned_files);
+
+ return 0;
+}
+
+/**
+ * process_scanned_node - process scanned node.
+ * @c: UBIFS file-system description object
+ * @lnum: logical eraseblock number
+ * @snod: scanned node
+ * @si: records nodes and files information during scanning
+ *
+ * This function parses, checks and records scanned node information.
+ * Returns zero in case of success, 1% if the scanned LEB doesn't hold file
+ * data and should be ignored(eg. index LEB), a negative error code in case
+ * of failure.
+ */
+static int process_scanned_node(struct ubifs_info *c, int lnum,
+ struct ubifs_scan_node *snod,
+ struct scanned_info *si)
+{
+ ino_t inum;
+ struct scanned_node *sn;
+ struct scanned_data_node data_node;
+ struct scanned_trun_node trun_node;
+
+ switch (snod->type) {
+ case UBIFS_INO_NODE:
+ {
+ return parse_and_record_ino_node(c, lnum, snod, si);
+ }
+ case UBIFS_DENT_NODE:
+ case UBIFS_XENT_NODE:
+ {
+ return parse_and_record_dent_node(c, lnum, snod, si);
+ }
+ case UBIFS_DATA_NODE:
+ {
+ if (parse_data_node(c, lnum, snod, &data_node, &inum))
+ return 0;
+ sn = (struct scanned_node *)&data_node;
+ break;
+ }
+ case UBIFS_TRUN_NODE:
+ {
+ if (parse_trun_node(c, lnum, snod, &trun_node, &inum))
+ return 0;
+ sn = (struct scanned_node *)&trun_node;
+ break;
+ }
+ default:
+ dbg_repair("skip node type %d, at %d:%d, in ubi%d_%d",
+ snod->type, lnum, snod->offs,
+ c->vi.ubi_num, c->vi.vol_id);
+ return 1;
+ }
+
+ return insert_or_update_file(c, sn, key_type(c, &snod->key), inum);
+}
+
+/**
+ * destroy_file_content - destroy scanned data/dentry nodes in give file.
+ * @file: file object
+ *
+ * Destroy all data/dentry nodes attached to @file.
+ */
+static void destroy_file_content(struct scanned_file *file)
+{
+ struct scanned_data_node *dn;
+ struct scanned_dent_node *dent;
+ struct rb_node *this;
+
+ this = rb_first(&file->data_nodes);
+ while (this) {
+ cond_resched();
+ dn = rb_entry(this, struct scanned_data_node, rb);
+ this = rb_next(this);
+
+ rb_erase(&dn->rb, &file->data_nodes);
+ kfree(dn);
+ }
+
+ this = rb_first(&file->dent_nodes);
+ while (this) {
+ cond_resched();
+ dent = rb_entry(this, struct scanned_dent_node, rb);
+ this = rb_next(this);
+
+ rb_erase(&dent->rb, &file->dent_nodes);
+ kfree(dent);
+ }
+}
+
+/**
+ * destroy_scanned_info - destroy scanned nodes.
+ * @c: UBIFS file-system description object
+ * @si: records nodes and files information during scanning
+ *
+ * Destroy scanned files and all data/dentry nodes attached to file, destroy
+ * valid/deleted inode/dentry info.
+ */
+static void destroy_scanned_info(struct ubifs_info *c, struct scanned_info *si)
+{
+ struct scanned_file *file;
+ struct scanned_ino_node *ino_node;
+ struct scanned_dent_node *dent_node;
+ struct rb_node *this;
+
+ this = rb_first(&c->repair->scanned_files);
+ while (this) {
+ cond_resched();
+ file = rb_entry(this, struct scanned_file, rb);
+ this = rb_next(this);
+
+ destroy_file_content(file);
+
+ rb_erase(&file->rb, &c->repair->scanned_files);
+ kfree(file);
+ }
+
+ this = rb_first(&si->valid_inos);
+ while (this) {
+ cond_resched();
+ ino_node = rb_entry(this, struct scanned_ino_node, rb);
+ this = rb_next(this);
+
+ rb_erase(&ino_node->rb, &si->valid_inos);
+ kfree(ino_node);
+ }
+
+ this = rb_first(&si->del_inos);
+ while (this) {
+ cond_resched();
+ ino_node = rb_entry(this, struct scanned_ino_node, rb);
+ this = rb_next(this);
+
+ rb_erase(&ino_node->rb, &si->del_inos);
+ kfree(ino_node);
+ }
+
+ this = rb_first(&si->valid_dents);
+ while (this) {
+ cond_resched();
+ dent_node = rb_entry(this, struct scanned_dent_node, rb);
+ this = rb_next(this);
+
+ rb_erase(&dent_node->rb, &si->valid_dents);
+ kfree(dent_node);
+ }
+
+ this = rb_first(&si->del_dents);
+ while (this) {
+ cond_resched();
+ dent_node = rb_entry(this, struct scanned_dent_node, rb);
+ this = rb_next(this);
+
+ rb_erase(&dent_node->rb, &si->del_dents);
+ kfree(dent_node);
+ }
+}
+
+/**
+ * scan_nodes - scan node information from flash.
+ * @c: UBIFS file-system description object
+ * @si: records nodes and files information during scanning
+ *
+ * This function scans nodes from flash, all ino/dent nodes are split
+ * into valid tree and deleted tree, all trun/data nodes are collected
+ * into file, the file is inserted into @c->repair->scanned_files.
+ */
+static int scan_nodes(struct ubifs_info *c, struct scanned_info *si)
+{
+ int lnum, err = 0;
+ struct ubifs_scan_leb *sleb;
+ struct ubifs_scan_node *snod;
+
+ for (lnum = c->main_first; lnum < c->leb_cnt; ++lnum) {
+ if (fatal_signal_pending(current))
+ return -EINTR;
+ cond_resched();
+
+ dbg_repair("scan nodes at LEB %d, in ubi%d_%d",
+ lnum, c->vi.ubi_num, c->vi.vol_id);
+
+ sleb = ubifs_scan(c, lnum, 0, c->sbuf, 1);
+ if (IS_ERR(sleb)) {
+ if (PTR_ERR(sleb) != -EUCLEAN)
+ return PTR_ERR(sleb);
+
+ sleb = ubifs_recover_leb(c, lnum, 0, c->sbuf, -1);
+ if (IS_ERR(sleb)) {
+ if (PTR_ERR(sleb) != -EUCLEAN)
+ return PTR_ERR(sleb);
+
+ /* This LEB holds corrupted data, abandon it. */
+ continue;
+ }
+ }
+
+ list_for_each_entry(snod, &sleb->nodes, list) {
+ if (snod->sqnum > c->max_sqnum)
+ c->max_sqnum = snod->sqnum;
+
+ cond_resched();
+ err = process_scanned_node(c, lnum, snod, si);
+ if (err < 0) {
+ ubifs_err(c, "process node failed at LEB %d, err %d",
+ lnum, err);
+ ubifs_scan_destroy(sleb);
+ goto out;
+ } else if (err == 1) {
+ err = 0;
+ break;
+ }
+ }
+
+ ubifs_scan_destroy(sleb);
+ }
+
+out:
+ return err;
+}
+
+static int do_repair(struct ubifs_info *c)
+{
+ int err = 0;
+ struct scanned_info si;
+
+ si.valid_inos = si.del_inos = si.valid_dents = si.del_dents = RB_ROOT;
+
+ /* Step 2: Scan valid/deleted nodes from volume. */
+ ubifs_msg(c, "Step 2: Scan nodes");
+ err = scan_nodes(c, &si);
+
+ destroy_scanned_info(c, &si);
+ return err;
+}

int ubifs_repair(const char *dev_name)
{
@@ -92,8 +1023,18 @@ int ubifs_repair(const char *dev_name)
if (err)
goto free_sup;

+ err = init_repair_info(c);
+ if (err)
+ goto free_sup;
+
+ err = do_repair(c);
+ if (err)
+ goto destroy_repair;
+
ubifs_msg(c, "Repair success!");

+destroy_repair:
+ destroy_repair_info(c);
free_sup:
kfree(c->sup_node);
vfree(c->sbuf);
diff --git a/fs/ubifs/repair.h b/fs/ubifs/repair.h
new file mode 100644
index 000000000000..242bff2833bd
--- /dev/null
+++ b/fs/ubifs/repair.h
@@ -0,0 +1,158 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * This file is part of UBIFS.
+ *
+ * Copyright (C) 2023-2024, Huawei Technologies Co, Ltd.
+ *
+ * Authors: Zhihao Cheng <[email protected]>
+ */
+
+/*
+ * This file implements ubifs repair.
+ */
+
+#ifndef __UBIFS_REPAIR_H__
+#define __UBIFS_REPAIR_H__
+
+/**
+ * scanned_node - common header node.
+ * @exist: whether the node is found by scanning
+ * @lnum: LEB number of the scanned node
+ * @offs: scanned node's offset within @lnum
+ * @len: length of scanned node
+ * @sqnum: sequence number
+ */
+struct scanned_node {
+ bool exist;
+ int lnum;
+ int offs;
+ int len;
+ unsigned long long sqnum;
+};
+
+/**
+ * scanned_ino_node - scanned inode node.
+ * @header: common header of scanned node
+ * @key: the key of inode node
+ * @is_xattr: %1 for xattr inode, otherwise %0
+ * @is_encrypted: %1 for encrypted inode, otherwise %0
+ * @mode: file mode
+ * @nlink: number of hard links
+ * @xcnt: count of extended attributes this inode has
+ * @xsz: summarized size of all extended attributes in bytes
+ * @xnms: sum of lengths of all extended attribute names
+ * @size: inode size in bytes
+ * @rb: link in the tree of valid inode nodes or deleted inode nodes
+ */
+struct scanned_ino_node {
+ struct scanned_node header;
+ union ubifs_key key;
+ unsigned int is_xattr:1;
+ unsigned int is_encrypted:1;
+ unsigned int mode;
+ unsigned int nlink;
+ unsigned int xcnt;
+ unsigned int xsz;
+ unsigned int xnms;
+ unsigned long long size;
+ struct rb_node rb;
+};
+
+/**
+ * scanned_dent_node - scanned dentry node.
+ * @header: common header of scanned node
+ * @key: the key of dentry node
+ * @can_be_found: whether this dentry can be found from '/'
+ * @type: file type, reg/dir/symlink/block/char/fifo/sock
+ * @nlen: name length
+ * @name: dentry name
+ * @inum: target inode number
+ * @rb: link in the trees of:
+ * 1) valid dentry nodes or deleted dentry node
+ * 2) all scanned dentry nodes from same file
+ * @list: link in the list dentries for looking up/deleting
+ */
+struct scanned_dent_node {
+ struct scanned_node header;
+ union ubifs_key key;
+ bool can_be_found;
+ unsigned int type;
+ unsigned int nlen;
+ char name[UBIFS_MAX_NLEN];
+ ino_t inum;
+ struct rb_node rb;
+ struct list_head list;
+};
+
+/**
+ * scanned_trun_node - scanned truncation node.
+ * @header: common header of scanned node
+ * @new_size: size after truncation
+ */
+struct scanned_trun_node {
+ struct scanned_node header;
+ unsigned long long new_size;
+};
+
+/**
+ * scanned_data_node - scanned data node.
+ * @header: common header of scanned node
+ * @key: the key of data node
+ * @size: uncompressed data size in bytes
+ * @rb: link in the tree of all scanned data nodes from same file
+ * @list: link in the list for deleting
+ */
+struct scanned_data_node {
+ struct scanned_node header;
+ union ubifs_key key;
+ unsigned int size;
+ struct rb_node rb;
+ struct list_head list;
+};
+
+/**
+ * scanned_file - file info scanned from UBIFS volume.
+ *
+ * @calc_nlink: calculated count of directory entries refer this inode
+ * @calc_xcnt: calculated count of extended attributes
+ * @calc_xsz: calculated summary size of all extended attributes
+ * @calc_xnms: calculated sum of lengths of all extended attribute names
+ * @calc_size: calculated file size
+ * @has_encrypted_info: whether the file has encryption related xattrs
+ *
+ * @inum: inode number
+ * @ino: inode node
+ * @trun: truncation node
+ *
+ * @rb: link in the tree of all scanned files
+ * @list: link in the list files for kinds of processing
+ * @dent_nodes: tree of all scanned dentry nodes
+ * @data_nodes: tree of all scanned data nodes
+ */
+struct scanned_file {
+ unsigned int calc_nlink;
+ unsigned int calc_xcnt;
+ unsigned int calc_xsz;
+ unsigned int calc_xnms;
+ unsigned long long calc_size;
+ bool has_encrypted_info;
+
+ ino_t inum;
+ struct scanned_ino_node ino;
+ struct scanned_trun_node trun;
+
+ struct rb_node rb;
+ struct list_head list;
+ struct rb_root dent_nodes;
+ struct rb_root data_nodes;
+};
+
+/**
+ * ubifs_repair_info - per-FS repairing information.
+ * @scanned_files: tree of all scanned files
+ */
+struct ubifs_repair_info {
+ struct rb_root scanned_files;
+};
+
+#endif /* !__UBIFS_REPAIR_H__ */
diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h
index a7ee8010ad66..014f5ea26b17 100644
--- a/fs/ubifs/ubifs.h
+++ b/fs/ubifs/ubifs.h
@@ -1013,6 +1013,8 @@ struct ubifs_stats_info {

struct ubifs_debug_info;

+struct ubifs_repair_info;
+
/**
* struct ubifs_info - UBIFS file-system description data structure
* (per-superblock).
@@ -1053,6 +1055,9 @@ struct ubifs_debug_info;
* @cs_lock: commit state lock
* @cmt_wq: wait queue to sleep on if the log is full and a commit is running
*
+ * @kobj: kobject for /sys/fs/ubifs/
+ * @kobj_unregister: completion to unregister sysfs kobject
+ *
* @big_lpt: flag that LPT is too big to write whole during commit
* @space_fixup: flag indicating that free space in LEBs needs to be cleaned up
* @double_hash: flag indicating that we can do lookups by hash
@@ -1274,8 +1279,7 @@ struct ubifs_debug_info;
* @dbg: debugging-related information
* @stats: statistics exported over sysfs
*
- * @kobj: kobject for /sys/fs/ubifs/
- * @kobj_unregister: completion to unregister sysfs kobject
+ * @repair: repairing-related information
*/
struct ubifs_info {
struct super_block *vfs_sb;
@@ -1522,6 +1526,8 @@ struct ubifs_info {

struct ubifs_debug_info *dbg;
struct ubifs_stats_info *stats;
+
+ struct ubifs_repair_info *repair;
};

extern struct list_head ubifs_infos;
--
2.31.1


2023-12-28 01:41:03

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 11/17] ubifs: repair: Build TNC

This is the 10/13 step of repairing. Construct TNC according to scanned
files, and write TNC on flash, just like mkfs does.
Because this step, it can effectively solve many failed mounting
problems caused by bad TNC(eg. bad node pointed by TNC, bad key order
in znode, etc.).

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/repair.c | 291 +++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 272 insertions(+), 19 deletions(-)

diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
index c43150fff19d..724f58510599 100644
--- a/fs/ubifs/repair.c
+++ b/fs/ubifs/repair.c
@@ -13,6 +13,7 @@

#include <linux/module.h>
#include <linux/crc32.h>
+#include <linux/list_sort.h>
#include "ubifs.h"
#include "repair.h"

@@ -30,6 +31,29 @@ struct scanned_info {
struct rb_root del_dents;
};

+/**
+ * struct idx_entry - index entry.
+ * @list: link in the list index entries for building index tree
+ * @key: key
+ * @name: directory entry name used for sorting colliding keys by name
+ * @lnum: LEB number
+ * @offs: offset
+ * @len: length
+ *
+ * The index is recorded as a linked list which is sorted and used to create
+ * the bottom level of the on-flash index tree. The remaining levels of the
+ * index tree are each built from the level below.
+ */
+struct idx_entry {
+ struct list_head list;
+ union ubifs_key key;
+ char *name;
+ int name_len;
+ int lnum;
+ int offs;
+ int len;
+};
+
static int init_repair_info(struct ubifs_info *c)
{
int err;
@@ -324,6 +348,17 @@ static int parse_and_record_ino_node(struct ubifs_info *c, int lnum,
return 0;
}

+static int namecmp(const char *a, int la, const char *b, int lb)
+{
+ int cmp, len = min(la, lb);
+
+ cmp = memcmp(a, b, len);
+ if (cmp)
+ return cmp;
+
+ return la - lb;
+}
+
/**
* insert_or_update_dent_node - insert or update dentry node.
* @c: UBIFS file-system description object
@@ -338,7 +373,7 @@ static int insert_or_update_dent_node(struct ubifs_info *c,
struct scanned_dent_node *new_dent,
struct rb_root *tree)
{
- int cmp, nlen;
+ int cmp;
struct scanned_dent_node *dent_node, *old_dent_node = NULL;
struct rb_node **p, *parent = NULL;

@@ -352,9 +387,8 @@ static int insert_or_update_dent_node(struct ubifs_info *c,
} else if (cmp > 0) {
p = &(*p)->rb_right;
} else {
- nlen = min(new_dent->nlen, dent_node->nlen);
- cmp = strncmp(new_dent->name, dent_node->name, nlen) ? :
- new_dent->nlen - dent_node->nlen;
+ cmp = namecmp(new_dent->name, new_dent->nlen,
+ dent_node->name, dent_node->nlen);
if (cmp < 0) {
p = &(*p)->rb_left;
} else if (cmp > 0) {
@@ -993,7 +1027,7 @@ static struct scanned_dent_node *
lookup_valid_dent_node(struct ubifs_info *c, struct scanned_info *si,
struct scanned_dent_node *target)
{
- int cmp, nlen;
+ int cmp;
struct scanned_dent_node *dent_node;
struct rb_node *p;

@@ -1006,9 +1040,8 @@ lookup_valid_dent_node(struct ubifs_info *c, struct scanned_info *si,
} else if (cmp > 0) {
p = p->rb_right;
} else {
- nlen = min(target->nlen, dent_node->nlen);
- cmp = strncmp(target->name, dent_node->name, nlen) ? :
- target->nlen - dent_node->nlen;
+ cmp = namecmp(target->name, target->nlen,
+ dent_node->name, dent_node->nlen);
if (cmp < 0) {
p = p->rb_left;
} else if (cmp > 0) {
@@ -1754,6 +1787,27 @@ static int check_and_correct_files(struct ubifs_info *c)
return 0;
}

+static int cmp_idx(void *priv, const struct list_head *a,
+ const struct list_head *b)
+{
+ int cmp;
+ struct ubifs_info *c = priv;
+ struct idx_entry *ia, *ib;
+
+ cond_resched();
+ if (a == b)
+ return 0;
+
+ ia = list_entry(a, struct idx_entry, list);
+ ib = list_entry(b, struct idx_entry, list);
+
+ cmp = keys_cmp(c, &ia->key, &ib->key);
+ if (cmp)
+ return cmp;
+
+ return namecmp(ia->name, ia->name_len, ib->name, ib->name_len);
+}
+
/**
* get_free_leb - get a free LEB according to @c->repair->used_lebs.
* @c: UBIFS file-system description object
@@ -1943,15 +1997,181 @@ static char *get_file_name(struct ubifs_info *c, struct scanned_file *file)
return name;
}

-static void parse_node_location(struct ubifs_info *c, struct scanned_node *sn)
+static int parse_node_info(struct ubifs_info *c, struct scanned_node *sn,
+ union ubifs_key *key, char *name, int name_len,
+ struct list_head *idx_list, int *idx_cnt)
{
int lnum, pos;
+ struct idx_entry *e;

lnum = sn->lnum - c->main_first;
pos = sn->offs + ALIGN(sn->len, 8);

set_bit(lnum, c->repair->used_lebs);
c->repair->lpts[lnum].end = max(c->repair->lpts[lnum].end, pos);
+
+ if (idx_cnt == NULL)
+ /* Skip truncation node. */
+ return 0;
+
+ e = kmalloc(sizeof(struct idx_entry), GFP_KERNEL);
+ if (!e)
+ return -ENOMEM;
+
+ key_copy(c, key, &e->key);
+ e->name = name;
+ e->name_len = name_len;
+ e->lnum = sn->lnum;
+ e->offs = sn->offs;
+ e->len = sn->len;
+ list_add_tail(&e->list, idx_list);
+ *idx_cnt = *idx_cnt + 1;
+
+ return 0;
+}
+
+static int add_idx_node(struct ubifs_info *c, struct ubifs_idx_node *idx,
+ union ubifs_key *key, int child_cnt,
+ struct idx_entry *e)
+{
+ int err, lnum, offs, len;
+
+ len = ubifs_idx_node_sz(c, child_cnt);
+ ubifs_prepare_node(c, idx, len, 0);
+
+ err = reserve_space(c, len, &lnum, &offs);
+ if (err)
+ return err;
+
+ copy_node_data(c, idx, offs, len);
+
+ c->calc_idx_sz += ALIGN(len, 8);
+
+ /* The last index node written will be the root */
+ c->zroot.lnum = lnum;
+ c->zroot.offs = offs;
+ c->zroot.len = len;
+
+ key_copy(c, key, &e->key);
+ e->lnum = lnum;
+ e->offs = offs;
+ e->len = len;
+
+ return err;
+}
+
+/**
+ * build_tnc - construct TNC and write it into flash.
+ * @c: UBIFS file-system description object
+ * @lower_idxs: leaf entries of TNC
+ * @lower_cnt: the count of @lower_idxs
+ *
+ * This function builds TNC according to @lower_idxs and writes all index
+ * nodes into flash.
+ */
+static int build_tnc(struct ubifs_info *c, struct list_head *lower_idxs,
+ int lower_cnt)
+{
+ int i, j, err, upper_cnt, child_cnt, idx_sz, level = 0;
+ struct idx_entry *pe, *tmp_e, *e = NULL;
+ struct ubifs_idx_node *idx;
+ struct ubifs_branch *br;
+ union ubifs_key key;
+ LIST_HEAD(upper_idxs);
+
+ idx_sz = ubifs_idx_node_sz(c, c->fanout);
+ idx = kmalloc(idx_sz, GFP_KERNEL);
+ if (!idx)
+ return -ENOMEM;
+
+ list_sort(c, lower_idxs, cmp_idx);
+
+ ubifs_assert(c, lower_cnt != 0);
+
+ do {
+ upper_cnt = lower_cnt / c->fanout;
+ if (lower_cnt % c->fanout)
+ upper_cnt += 1;
+ e = list_first_entry(lower_idxs, struct idx_entry, list);
+
+ for (i = 0; i < upper_cnt; i++) {
+ if (fatal_signal_pending(current)) {
+ err = -EINTR;
+ goto out;
+ }
+ cond_resched();
+
+ if (i == upper_cnt - 1) {
+ child_cnt = lower_cnt % c->fanout;
+ if (child_cnt == 0)
+ child_cnt = c->fanout;
+ } else
+ child_cnt = c->fanout;
+
+ key_copy(c, &e->key, &key);
+ memset(idx, 0, idx_sz);
+ idx->ch.node_type = UBIFS_IDX_NODE;
+ idx->child_cnt = cpu_to_le16(child_cnt);
+ idx->level = cpu_to_le16(level);
+ for (j = 0; j < child_cnt; j++) {
+ ubifs_assert(c,
+ !list_entry_is_head(e, lower_idxs, list));
+ br = ubifs_idx_branch(c, idx, j);
+ key_write_idx(c, &e->key, &br->key);
+ br->lnum = cpu_to_le32(e->lnum);
+ br->offs = cpu_to_le32(e->offs);
+ br->len = cpu_to_le32(e->len);
+ e = list_next_entry(e, list);
+ }
+
+ pe = kmalloc(sizeof(struct idx_entry), GFP_KERNEL);
+ if (!pe) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ err = add_idx_node(c, idx, &key, child_cnt, pe);
+ if (err) {
+ kfree(pe);
+ goto out;
+ }
+
+ list_add_tail(&pe->list, &upper_idxs);
+ }
+
+ level++;
+ list_for_each_entry_safe(e, tmp_e, lower_idxs, list) {
+ cond_resched();
+
+ list_del(&e->list);
+ kfree(e);
+ }
+ list_splice_init(&upper_idxs, lower_idxs);
+ lower_cnt = upper_cnt;
+ } while (lower_cnt > 1);
+
+ /* Set the index head */
+ c->ihead_lnum = c->repair->head_lnum;
+ c->ihead_offs = ALIGN(c->repair->head_offs, c->min_io_size);
+
+ /* Flush the last index LEB */
+ err = flush_write_buf(c);
+
+out:
+ list_for_each_entry_safe(e, tmp_e, lower_idxs, list) {
+ cond_resched();
+
+ list_del(&e->list);
+ kfree(e);
+ }
+ list_for_each_entry_safe(e, tmp_e, &upper_idxs, list) {
+ cond_resched();
+
+ list_del(&e->list);
+ kfree(e);
+ }
+ kfree(idx);
+ return err;
}

/**
@@ -1965,14 +2185,17 @@ static void parse_node_location(struct ubifs_info *c, struct scanned_node *sn)
* LEBs could be taken for storing new index tree.
* 3. Re-write data to prevent failed gc scanning in the subsequent mounting
* process caused by corrupted data.
+ * 4. Build TNC.
*/
static int traverse_files_and_nodes(struct ubifs_info *c)
{
- int i, err = 0;
+ int i, err = 0, idx_cnt = 0;
struct rb_node *node, *n;
struct scanned_file *file;
struct scanned_dent_node *dent_node;
struct scanned_data_node *data_node;
+ struct idx_entry *ie, *tmp_ie;
+ LIST_HEAD(idx_list);

if (rb_first(&c->repair->scanned_files) == NULL) {
/* No scanned files. Create root dir. */
@@ -1994,23 +2217,39 @@ static int traverse_files_and_nodes(struct ubifs_info *c)
ubifs_get_type_name(ubifs_get_dent_type(file->ino.mode)),
c->vi.ubi_num, c->vi.vol_id);

- parse_node_location(c, &file->ino.header);
+ err = parse_node_info(c, &file->ino.header, &file->ino.key,
+ NULL, 0, &idx_list, &idx_cnt);
+ if (err)
+ goto out_idx_list;

- if (file->trun.header.exist)
- parse_node_location(c, &file->trun.header);
+ if (file->trun.header.exist) {
+ err = parse_node_info(c, &file->trun.header, NULL, NULL,
+ 0, &idx_list, NULL);
+ if (err)
+ goto out_idx_list;
+ }

for (n = rb_first(&file->data_nodes); n; n = rb_next(n)) {
cond_resched();
data_node = rb_entry(n, struct scanned_data_node, rb);

- parse_node_location(c, &data_node->header);
+ err = parse_node_info(c, &data_node->header,
+ &data_node->key, NULL, 0,
+ &idx_list, &idx_cnt);
+ if (err)
+ goto out_idx_list;
}

for (n = rb_first(&file->dent_nodes); n; n = rb_next(n)) {
cond_resched();
dent_node = rb_entry(n, struct scanned_dent_node, rb);

- parse_node_location(c, &dent_node->header);
+ err = parse_node_info(c, &dent_node->header,
+ &dent_node->key, dent_node->name,
+ dent_node->nlen,
+ &idx_list, &idx_cnt);
+ if (err)
+ goto out_idx_list;
}
}

@@ -2019,8 +2258,10 @@ static int traverse_files_and_nodes(struct ubifs_info *c)
for (i = 0; i < c->main_lebs; ++i) {
int lnum, len, end;

- if (fatal_signal_pending(current))
- return -EINTR;
+ if (fatal_signal_pending(current)) {
+ err = -EINTR;
+ goto out_idx_list;
+ }
cond_resched();

if (!test_bit(i, c->repair->used_lebs))
@@ -2035,16 +2276,27 @@ static int traverse_files_and_nodes(struct ubifs_info *c)

err = ubifs_leb_read(c, lnum, c->sbuf, 0, len, 0);
if (err && err != -EBADMSG)
- return err;
+ goto out_idx_list;

if (len > end)
ubifs_pad(c, c->sbuf + end, len - end);

err = ubifs_leb_change(c, lnum, c->sbuf, len);
if (err)
- return err;
+ goto out_idx_list;
}

+ /* Build TNC. */
+ ubifs_msg(c, "Step 10: Build TNC");
+ err = build_tnc(c, &idx_list, idx_cnt);
+
+out_idx_list:
+ list_for_each_entry_safe(ie, tmp_ie, &idx_list, list) {
+ cond_resched();
+
+ list_del(&ie->list);
+ kfree(ie);
+ }
return err;
}

@@ -2088,6 +2340,7 @@ static int do_repair(struct ubifs_info *c)
/*
* Step 8: Record used LEBs.
* Step 9: Re-write data to clean corrupted data.
+ * Step 10: Build TNC.
*/
err = traverse_files_and_nodes(c);

--
2.31.1


2023-12-28 01:41:07

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 04/17] ubifs: repair: Add valid nodes into file

This is the 4/13 step of repairing. Generate file according to left valid
inode nodes and dentry nodes. Based on results from step 3, it is easy to
understand:

Step 3 has done:
valid_inos - del_inos = left_inos
valid_dents - del_dents = left_dents
Step 4 should do:
Traverse left_inos and left_dents, insert inode/dentry nodes into
corresponding file.

Now, all files are generated by scanning, the next thing to do is
dropping invalid files(eg. nonconsistent file type between inode node and
dentry nodes, file has no dentry nodes(excepts '/'), encrypted file has
no xattr information, etc.).

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/repair.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 60 insertions(+)

diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
index d932c16ec893..7a1732ef903f 100644
--- a/fs/ubifs/repair.c
+++ b/fs/ubifs/repair.c
@@ -1043,6 +1043,62 @@ static void remove_del_nodes(struct ubifs_info *c, struct scanned_info *si)
}
}

+/**
+ * add_valid_nodes_into_file - add valid nodes into file.
+ * @c: UBIFS file-system description object
+ * @si: records nodes and files information during scanning
+ *
+ * This function adds valid nodes into corresponding file, all valid ino/dent
+ * nodes will be removed from @si->valid_inos/@si->valid_dents if the function
+ * is executed successfully.
+ */
+static int add_valid_nodes_into_file(struct ubifs_info *c,
+ struct scanned_info *si)
+{
+ int err, type;
+ ino_t inum;
+ struct scanned_node *sn;
+ struct scanned_ino_node *ino_node;
+ struct scanned_dent_node *dent_node;
+ struct rb_node *this;
+
+ this = rb_first(&si->valid_inos);
+ while (this) {
+ cond_resched();
+ ino_node = rb_entry(this, struct scanned_ino_node, rb);
+ this = rb_next(this);
+
+ sn = (struct scanned_node *)ino_node;
+ type = key_type(c, &ino_node->key);
+ inum = key_inum(c, &ino_node->key);
+ err = insert_or_update_file(c, sn, type, inum);
+ if (err)
+ return err;
+
+ rb_erase(&ino_node->rb, &si->valid_inos);
+ kfree(ino_node);
+ }
+
+ this = rb_first(&si->valid_dents);
+ while (this) {
+ cond_resched();
+ dent_node = rb_entry(this, struct scanned_dent_node, rb);
+ this = rb_next(this);
+
+ sn = (struct scanned_node *)dent_node;
+ inum = dent_node->inum;
+ type = key_type(c, &dent_node->key);
+ err = insert_or_update_file(c, sn, type, inum);
+ if (err)
+ return err;
+
+ rb_erase(&dent_node->rb, &si->valid_dents);
+ kfree(dent_node);
+ }
+
+ return 0;
+}
+
static int do_repair(struct ubifs_info *c)
{
int err = 0;
@@ -1060,6 +1116,10 @@ static int do_repair(struct ubifs_info *c)
ubifs_msg(c, "Step 3: Remove deleted nodes");
remove_del_nodes(c, &si);

+ /* Step 4: Add valid nodes into file. */
+ ubifs_msg(c, "Step 4: Add valid nodes into file");
+ err = add_valid_nodes_into_file(c, &si);
+
out:
destroy_scanned_info(c, &si);
return err;
--
2.31.1


2023-12-28 01:41:17

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 12/17] ubifs: Extract a helper function to create lpt

Abstract a helper function to create lpt, this is a preparation to
add lpt creation support in ubifs_repair.

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/lpt.c | 140 ++++++++++++++++++++++++++--------------------
fs/ubifs/repair.h | 11 +---
fs/ubifs/ubifs.h | 18 ++++++
3 files changed, 98 insertions(+), 71 deletions(-)

diff --git a/fs/ubifs/lpt.c b/fs/ubifs/lpt.c
index 778a22bf9a92..fa50ad5106d4 100644
--- a/fs/ubifs/lpt.c
+++ b/fs/ubifs/lpt.c
@@ -586,20 +586,21 @@ static int calc_pnode_num_from_parent(const struct ubifs_info *c,
}

/**
- * ubifs_create_dflt_lpt - create default LPT.
+ * ubifs_create_lpt - create lpt acccording to lprops array.
* @c: UBIFS file-system description object
- * @main_lebs: number of main area LEBs is passed and returned here
- * @lpt_first: LEB number of first LPT LEB
- * @lpt_lebs: number of LEBs for LPT is passed and returned here
- * @big_lpt: use big LPT model is passed and returned here
+ * @lps: lprops array to record logical eraseblock properties
+ * @lp_cnt: the length of @lps
* @hash: hash of the LPT is returned here
*
- * This function returns %0 on success and a negative error code on failure.
+ * This function creates lpt, the pnode will be initialized based on
+ * corresponding elements in @lps. If there are no corresponding lprops
+ * (eg. @lp_cnt is smaller than @c->main_lebs), the LEB property is set
+ * as free state.
*/
-int ubifs_create_dflt_lpt(struct ubifs_info *c, int *main_lebs, int lpt_first,
- int *lpt_lebs, int *big_lpt, u8 *hash)
+int ubifs_create_lpt(struct ubifs_info *c, struct lprops *lps, int lp_cnt,
+ u8 *hash)
{
- int lnum, err = 0, node_sz, iopos, i, j, cnt, len, alen, row;
+ int lnum, err = 0, i, j, cnt, len, alen, row;
int blnum, boffs, bsz, bcnt;
struct ubifs_pnode *pnode = NULL;
struct ubifs_nnode *nnode = NULL;
@@ -608,18 +609,6 @@ int ubifs_create_dflt_lpt(struct ubifs_info *c, int *main_lebs, int lpt_first,
int *lsave = NULL;
struct shash_desc *desc;

- err = calc_dflt_lpt_geom(c, main_lebs, big_lpt);
- if (err)
- return err;
- *lpt_lebs = c->lpt_lebs;
-
- /* Needed by 'ubifs_pack_nnode()' and 'set_ltab()' */
- c->lpt_first = lpt_first;
- /* Needed by 'set_ltab()' */
- c->lpt_last = lpt_first + c->lpt_lebs - 1;
- /* Needed by 'ubifs_pack_lsave()' */
- c->main_first = c->leb_cnt - *main_lebs;
-
desc = ubifs_hash_get_desc(c);
if (IS_ERR(desc))
return PTR_ERR(desc);
@@ -646,47 +635,12 @@ int ubifs_create_dflt_lpt(struct ubifs_info *c, int *main_lebs, int lpt_first,
ltab[i].cmt = 0;
}

- lnum = lpt_first;
+ lnum = c->lpt_first;
p = buf;
+ len = 0;
/* Number of leaf nodes (pnodes) */
cnt = c->pnode_cnt;

- /*
- * The first pnode contains the LEB properties for the LEBs that contain
- * the root inode node and the root index node of the index tree.
- */
- node_sz = ALIGN(ubifs_idx_node_sz(c, 1), 8);
- iopos = ALIGN(node_sz, c->min_io_size);
- pnode->lprops[0].free = c->leb_size - iopos;
- pnode->lprops[0].dirty = iopos - node_sz;
- pnode->lprops[0].flags = LPROPS_INDEX;
-
- node_sz = UBIFS_INO_NODE_SZ;
- iopos = ALIGN(node_sz, c->min_io_size);
- pnode->lprops[1].free = c->leb_size - iopos;
- pnode->lprops[1].dirty = iopos - node_sz;
-
- for (i = 2; i < UBIFS_LPT_FANOUT; i++)
- pnode->lprops[i].free = c->leb_size;
-
- /* Add first pnode */
- ubifs_pack_pnode(c, p, pnode);
- err = ubifs_shash_update(c, desc, p, c->pnode_sz);
- if (err)
- goto out;
-
- p += c->pnode_sz;
- len = c->pnode_sz;
- pnode->num += 1;
-
- /* Reset pnode values for remaining pnodes */
- pnode->lprops[0].free = c->leb_size;
- pnode->lprops[0].dirty = 0;
- pnode->lprops[0].flags = 0;
-
- pnode->lprops[1].free = c->leb_size;
- pnode->lprops[1].dirty = 0;
-
/*
* To calculate the internal node branches, we keep information about
* the level below.
@@ -696,8 +650,8 @@ int ubifs_create_dflt_lpt(struct ubifs_info *c, int *main_lebs, int lpt_first,
bcnt = cnt; /* Number of nodes in level below */
bsz = c->pnode_sz; /* Size of nodes in level below */

- /* Add all remaining pnodes */
- for (i = 1; i < cnt; i++) {
+ /* Add all pnodes */
+ for (i = 0; i < cnt; i++) {
if (len + c->pnode_sz > c->leb_size) {
alen = ALIGN(len, c->min_io_size);
set_ltab(c, lnum, c->leb_size - alen, alen - len);
@@ -708,6 +662,20 @@ int ubifs_create_dflt_lpt(struct ubifs_info *c, int *main_lebs, int lpt_first,
p = buf;
len = 0;
}
+ /* Fill in the pnode */
+ for (j = 0; j < UBIFS_LPT_FANOUT; j++) {
+ int k = (i << UBIFS_LPT_FANOUT_SHIFT) + j;
+
+ if (k < lp_cnt) {
+ pnode->lprops[j].free = lps[k].free;
+ pnode->lprops[j].dirty = lps[k].dirty;
+ pnode->lprops[j].flags = lps[k].flags;
+ } else {
+ pnode->lprops[j].free = c->leb_size;
+ pnode->lprops[j].dirty = 0;
+ pnode->lprops[j].flags = 0;
+ }
+ }
ubifs_pack_pnode(c, p, pnode);
err = ubifs_shash_update(c, desc, p, c->pnode_sz);
if (err)
@@ -777,7 +745,7 @@ int ubifs_create_dflt_lpt(struct ubifs_info *c, int *main_lebs, int lpt_first,
row -= 1;
}

- if (*big_lpt) {
+ if (c->big_lpt) {
/* Need to add LPT's save table */
if (len + c->lsave_sz > c->leb_size) {
alen = ALIGN(len, c->min_io_size);
@@ -793,7 +761,7 @@ int ubifs_create_dflt_lpt(struct ubifs_info *c, int *main_lebs, int lpt_first,
c->lsave_lnum = lnum;
c->lsave_offs = len;

- for (i = 0; i < c->lsave_cnt && i < *main_lebs; i++)
+ for (i = 0; i < c->lsave_cnt && i < c->main_lebs; i++)
lsave[i] = c->main_first + i;
for (; i < c->lsave_cnt; i++)
lsave[i] = c->main_first;
@@ -868,6 +836,54 @@ int ubifs_create_dflt_lpt(struct ubifs_info *c, int *main_lebs, int lpt_first,
return err;
}

+/**
+ * ubifs_create_dflt_lpt - create default LPT.
+ * @c: UBIFS file-system description object
+ * @main_lebs: number of main area LEBs is passed and returned here
+ * @lpt_first: LEB number of first LPT LEB
+ * @lpt_lebs: number of LEBs for LPT is passed and returned here
+ * @big_lpt: use big LPT model is passed and returned here
+ * @hash: hash of the LPT is returned here
+ *
+ * This function returns %0 on success and a negative error code on failure.
+ */
+int ubifs_create_dflt_lpt(struct ubifs_info *c, int *main_lebs, int lpt_first,
+ int *lpt_lebs, int *big_lpt, u8 *hash)
+{
+ int node_sz, iopos, err = 0;
+ struct lprops lps[2];
+
+ err = calc_dflt_lpt_geom(c, main_lebs, big_lpt);
+ if (err)
+ return err;
+ *lpt_lebs = c->lpt_lebs;
+
+ /* Needed by 'ubifs_pack_nnode()' and 'set_ltab()' */
+ c->lpt_first = lpt_first;
+ /* Needed by 'set_ltab()' */
+ c->lpt_last = lpt_first + c->lpt_lebs - 1;
+ /* Needed by 'ubifs_pack_lsave()' */
+ c->main_first = c->leb_cnt - *main_lebs;
+
+ /*
+ * The first pnode contains the LEB properties for the LEBs that contain
+ * the root inode node and the root index node of the index tree.
+ */
+ node_sz = ALIGN(ubifs_idx_node_sz(c, 1), 8);
+ iopos = ALIGN(node_sz, c->min_io_size);
+ lps[0].free = c->leb_size - iopos;
+ lps[0].dirty = iopos - node_sz;
+ lps[0].flags = LPROPS_INDEX;
+
+ node_sz = UBIFS_INO_NODE_SZ;
+ iopos = ALIGN(node_sz, c->min_io_size);
+ lps[1].free = c->leb_size - iopos;
+ lps[1].dirty = iopos - node_sz;
+ lps[1].flags = 0;
+
+ return ubifs_create_lpt(c, lps, 2, hash);
+}
+
/**
* update_cats - add LEB properties of a pnode to LEB category lists and heaps.
* @c: UBIFS file-system description object
diff --git a/fs/ubifs/repair.h b/fs/ubifs/repair.h
index e3b879b3f999..f8d18f07e324 100644
--- a/fs/ubifs/repair.h
+++ b/fs/ubifs/repair.h
@@ -151,15 +151,6 @@ struct scanned_file {
struct rb_root data_nodes;
};

-
-/**
- * lprops - logical eraseblock properties.
- * @end: the end postition of LEB calculated by the last node
- */
-struct lprops {
- int end;
-};
-
/**
* ubifs_repair_info - per-FS repairing information.
* @usen_lebs: a bitmap used for recording used lebs
@@ -168,6 +159,7 @@ struct lprops {
* @write_buf: write buffer for LEB @head_lnum
* @head_lnum: current writing LEB number
* @head_offs: current writing position in LEB @head_lnum
+ * @need_update_lpt: whether to update lpt while writing index nodes
*/
struct ubifs_repair_info {
unsigned long *used_lebs;
@@ -176,6 +168,7 @@ struct ubifs_repair_info {
void *write_buf;
int head_lnum;
int head_offs;
+ bool need_update_lpt;
};

#endif /* !__UBIFS_REPAIR_H__ */
diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h
index 7d66440623cb..5fe28980d151 100644
--- a/fs/ubifs/ubifs.h
+++ b/fs/ubifs/ubifs.h
@@ -495,6 +495,22 @@ struct ubifs_lprops {
};
};

+/**
+ * struct lprops - logical eraseblock properties.
+ * @free: amount of free space in bytes
+ * @dirty: amount of dirty space in bytes
+ * @used: amount of used space in bytes
+ * @flags: LEB properties flags
+ * @end: the end postition of LEB calculated by the last node
+ */
+struct lprops {
+ int free;
+ int dirty;
+ int used;
+ int flags;
+ int end;
+};
+
/**
* struct ubifs_lpt_lprops - LPT logical eraseblock properties.
* @free: amount of free space in bytes
@@ -2020,6 +2036,8 @@ int ubifs_clear_orphans(struct ubifs_info *c);

/* lpt.c */
int ubifs_calc_lpt_geom(struct ubifs_info *c);
+int ubifs_create_lpt(struct ubifs_info *c, struct lprops *lps, int lp_cnt,
+ u8 *hash);
int ubifs_create_dflt_lpt(struct ubifs_info *c, int *main_lebs, int lpt_first,
int *lpt_lebs, int *big_lpt, u8 *hash);
int ubifs_lpt_init(struct ubifs_info *c, int rd, int wr);
--
2.31.1


2023-12-28 01:41:30

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 13/17] ubifs: repair: Build LPT

This is the 11/13 step of repairing. All LEBs' properties can be
calculated in previous steps according to all nodes' position, then
construct LPT just like mkfs does, and write TNC on flash.

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/repair.c | 113 ++++++++++++++++++++++++++++++++++++++--------
1 file changed, 93 insertions(+), 20 deletions(-)

diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
index 724f58510599..8d40cff26f7a 100644
--- a/fs/ubifs/repair.c
+++ b/fs/ubifs/repair.c
@@ -1059,6 +1059,21 @@ lookup_valid_dent_node(struct ubifs_info *c, struct scanned_info *si,
return NULL;
}

+static void update_lpt(struct ubifs_info *c, struct scanned_node *sn,
+ bool deleted)
+{
+ int lnum = sn->lnum - c->main_first;
+ int pos = sn->offs + ALIGN(sn->len, 8);
+
+ set_bit(lnum, c->repair->used_lebs);
+ c->repair->lpts[lnum].end = max(c->repair->lpts[lnum].end, pos);
+
+ if (deleted)
+ return;
+
+ c->repair->lpts[lnum].used += ALIGN(sn->len, 8);
+}
+
/**
* remove_del_nodes - remove deleted nodes from valid node tree.
* @c: UBIFS file-system description object
@@ -1083,13 +1098,7 @@ static void remove_del_nodes(struct ubifs_info *c, struct scanned_info *si)

valid_ino_node = lookup_valid_ino_node(c, si, del_ino_node);
if (valid_ino_node) {
- int lnum = del_ino_node->header.lnum - c->main_first;
- int pos = del_ino_node->header.offs +
- ALIGN(del_ino_node->header.len, 8);
-
- set_bit(lnum, c->repair->used_lebs);
- c->repair->lpts[lnum].end =
- max(c->repair->lpts[lnum].end, pos);
+ update_lpt(c, &del_ino_node->header, true);
rb_erase(&valid_ino_node->rb, &si->valid_inos);
kfree(valid_ino_node);
}
@@ -1106,13 +1115,7 @@ static void remove_del_nodes(struct ubifs_info *c, struct scanned_info *si)

valid_dent_node = lookup_valid_dent_node(c, si, del_dent_node);
if (valid_dent_node) {
- int lnum = del_dent_node->header.lnum - c->main_first;
- int pos = del_dent_node->header.offs +
- ALIGN(del_dent_node->header.len, 8);
-
- set_bit(lnum, c->repair->used_lebs);
- c->repair->lpts[lnum].end =
- max(c->repair->lpts[lnum].end, pos);
+ update_lpt(c, &del_dent_node->header, true);
rb_erase(&valid_dent_node->rb, &si->valid_dents);
kfree(valid_dent_node);
}
@@ -1861,6 +1864,14 @@ static int flush_write_buf(struct ubifs_info *c)
if (err)
return err;

+ if (c->repair->need_update_lpt) {
+ int lnum = c->repair->head_lnum - c->main_first;
+
+ c->repair->lpts[lnum].free = c->leb_size - len;
+ c->repair->lpts[lnum].dirty = pad;
+ c->repair->lpts[lnum].flags = LPROPS_INDEX;
+ }
+
c->repair->head_lnum = -1;

return 0;
@@ -2001,14 +2012,9 @@ static int parse_node_info(struct ubifs_info *c, struct scanned_node *sn,
union ubifs_key *key, char *name, int name_len,
struct list_head *idx_list, int *idx_cnt)
{
- int lnum, pos;
struct idx_entry *e;

- lnum = sn->lnum - c->main_first;
- pos = sn->offs + ALIGN(sn->len, 8);
-
- set_bit(lnum, c->repair->used_lebs);
- c->repair->lpts[lnum].end = max(c->repair->lpts[lnum].end, pos);
+ update_lpt(c, sn, idx_cnt == NULL);

if (idx_cnt == NULL)
/* Skip truncation node. */
@@ -2085,6 +2091,7 @@ static int build_tnc(struct ubifs_info *c, struct list_head *lower_idxs,
return -ENOMEM;

list_sort(c, lower_idxs, cmp_idx);
+ c->repair->need_update_lpt = true;

ubifs_assert(c, lower_cnt != 0);

@@ -2156,6 +2163,7 @@ static int build_tnc(struct ubifs_info *c, struct list_head *lower_idxs,

/* Flush the last index LEB */
err = flush_write_buf(c);
+ c->repair->need_update_lpt = false;

out:
list_for_each_entry_safe(e, tmp_e, lower_idxs, list) {
@@ -2300,6 +2308,65 @@ static int traverse_files_and_nodes(struct ubifs_info *c)
return err;
}

+/**
+ * build_lpt - construct LPT and write it into flash.
+ * @c: UBIFS file-system description object
+ *
+ * This function builds LPT according to @c->repair->lpts and writes LPT
+ * into flash.
+ */
+int build_lpt(struct ubifs_info *c)
+{
+ int i, len, free, dirty, err;
+ u8 hash_lpt[UBIFS_HASH_ARR_SZ];
+
+ /* Set gc lnum. */
+ err = get_free_leb(c);
+ if (err)
+ return err;
+ c->gc_lnum = c->repair->head_lnum;
+ /* GC LEB is empty. */
+ c->lst.empty_lebs = 1;
+
+ /* Update LPT. */
+ for (i = 0; i < c->main_lebs; i++) {
+ cond_resched();
+
+ if (!test_bit(i, c->repair->used_lebs)) {
+ free = c->leb_size;
+ dirty = 0;
+ c->lst.empty_lebs++;
+ } else if (c->repair->lpts[i].flags & LPROPS_INDEX) {
+ free = c->repair->lpts[i].free;
+ dirty = c->repair->lpts[i].dirty;
+ c->lst.idx_lebs += 1;
+ } else {
+ len = ALIGN(c->repair->lpts[i].end, c->min_io_size);
+ free = c->leb_size - len;
+ dirty = len - c->repair->lpts[i].used;
+ }
+
+ c->repair->lpts[i].free = free;
+ c->repair->lpts[i].dirty = dirty;
+ c->lst.total_free += free;
+ c->lst.total_dirty += dirty;
+
+ if (!(c->repair->lpts[i].flags & LPROPS_INDEX)) {
+ int spc;
+
+ spc = free + dirty;
+ if (spc < c->dead_wm)
+ c->lst.total_dead += spc;
+ else
+ c->lst.total_dark += ubifs_calc_dark(c, spc);
+ c->lst.total_used += c->leb_size - spc;
+ }
+ }
+
+ /* Write LPT. */
+ return ubifs_create_lpt(c, c->repair->lpts, c->main_lebs, hash_lpt);
+}
+
static int do_repair(struct ubifs_info *c)
{
int err = 0;
@@ -2343,6 +2410,12 @@ static int do_repair(struct ubifs_info *c)
* Step 10: Build TNC.
*/
err = traverse_files_and_nodes(c);
+ if (err)
+ goto out;
+
+ /* Step 11. Build LPT. */
+ ubifs_msg(c, "Step 11: Build LPT");
+ err = build_lpt(c);

out:
destroy_scanned_info(c, &si);
--
2.31.1


2023-12-28 01:41:43

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 14/17] ubifs: repair: Clean up log and orphan area

This is the 12/13 step of repairing. Clean up log and orphan area, all
nodes have been recovered, these two areas should be cleared, otherwise
old content in journal/orphan could be replayed in next mounting.

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/repair.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 47 insertions(+)

diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
index 8d40cff26f7a..42124cda5d7d 100644
--- a/fs/ubifs/repair.c
+++ b/fs/ubifs/repair.c
@@ -2367,6 +2367,43 @@ int build_lpt(struct ubifs_info *c)
return ubifs_create_lpt(c, c->repair->lpts, c->main_lebs, hash_lpt);
}

+/**
+ * clean_log - clean up log area.
+ * @c: UBIFS file-system description object
+ *
+ * This function cleans up log area, since there is no need to do recovery
+ * in next mounting.
+ */
+static int clean_log(struct ubifs_info *c)
+{
+ int lnum, err;
+ struct ubifs_cs_node *cs;
+
+ for (lnum = UBIFS_LOG_LNUM; lnum <= c->log_last; lnum++) {
+ if (fatal_signal_pending(current))
+ return -EINTR;
+ cond_resched();
+
+ err = ubifs_leb_unmap(c, lnum);
+ if (err)
+ return err;
+ }
+
+ cs = kzalloc(ALIGN(UBIFS_CS_NODE_SZ, c->min_io_size), GFP_KERNEL);
+ if (!cs)
+ return -ENOMEM;
+
+ cs->ch.node_type = UBIFS_CS_NODE;
+ cs->cmt_no = cpu_to_le64(0);
+
+ err = ubifs_write_node(c, cs, UBIFS_CS_NODE_SZ, UBIFS_LOG_LNUM, 0);
+ kfree(cs);
+ if (err)
+ return err;
+
+ return 0;
+}
+
static int do_repair(struct ubifs_info *c)
{
int err = 0;
@@ -2416,6 +2453,16 @@ static int do_repair(struct ubifs_info *c)
/* Step 11. Build LPT. */
ubifs_msg(c, "Step 11: Build LPT");
err = build_lpt(c);
+ if (err)
+ goto out;
+
+ /* Step 12. Clean up log & orphan. */
+ ubifs_msg(c, "Step 12: Clean up log & orphan");
+ err = clean_log(c);
+ if (err)
+ goto out;
+
+ err = ubifs_clear_orphans(c);

out:
destroy_scanned_info(c, &si);
--
2.31.1


2023-12-28 01:41:57

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 15/17] ubifs: repair: Write master node

This is the 13/13 step of repairing. Since all meta areas are ready,
master node can be updated. After this step, a consistent UBIFS image
can be mounted, and it should pass all tests from chk_fs, chk_general,
chk_index, chk_lprops and chk_orphans.

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/repair.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 77 insertions(+)

diff --git a/fs/ubifs/repair.c b/fs/ubifs/repair.c
index 42124cda5d7d..f2600bbb1431 100644
--- a/fs/ubifs/repair.c
+++ b/fs/ubifs/repair.c
@@ -2225,6 +2225,8 @@ static int traverse_files_and_nodes(struct ubifs_info *c)
ubifs_get_type_name(ubifs_get_dent_type(file->ino.mode)),
c->vi.ubi_num, c->vi.vol_id);

+ c->highest_inum = max(c->highest_inum, file->inum);
+
err = parse_node_info(c, &file->ino.header, &file->ino.key,
NULL, 0, &idx_list, &idx_cnt);
if (err)
@@ -2404,6 +2406,75 @@ static int clean_log(struct ubifs_info *c)
return 0;
}

+/**
+ * write_master - write master nodes.
+ * @c: UBIFS file-system description object
+ *
+ * This function updates meta information into master node and writes master
+ * node into master area.
+ */
+static int write_master(struct ubifs_info *c)
+{
+ int err, lnum;
+ struct ubifs_mst_node *mst;
+
+ mst = kzalloc(c->mst_node_alsz, GFP_KERNEL);
+ if (!mst)
+ return -ENOMEM;
+
+ mst->ch.node_type = UBIFS_MST_NODE;
+ mst->log_lnum = cpu_to_le32(UBIFS_LOG_LNUM);
+ mst->highest_inum = cpu_to_le64(c->highest_inum);
+ mst->cmt_no = 0;
+ mst->root_lnum = cpu_to_le32(c->zroot.lnum);
+ mst->root_offs = cpu_to_le32(c->zroot.offs);
+ mst->root_len = cpu_to_le32(c->zroot.len);
+ mst->gc_lnum = cpu_to_le32(c->gc_lnum);
+ mst->ihead_lnum = cpu_to_le32(c->ihead_lnum);
+ mst->ihead_offs = cpu_to_le32(c->ihead_offs);
+ mst->index_size = cpu_to_le64(c->calc_idx_sz);
+ mst->lpt_lnum = cpu_to_le32(c->lpt_lnum);
+ mst->lpt_offs = cpu_to_le32(c->lpt_offs);
+ mst->nhead_lnum = cpu_to_le32(c->nhead_lnum);
+ mst->nhead_offs = cpu_to_le32(c->nhead_offs);
+ mst->ltab_lnum = cpu_to_le32(c->ltab_lnum);
+ mst->ltab_offs = cpu_to_le32(c->ltab_offs);
+ mst->lsave_lnum = cpu_to_le32(c->lsave_lnum);
+ mst->lsave_offs = cpu_to_le32(c->lsave_offs);
+ mst->lscan_lnum = cpu_to_le32(c->main_first);
+ mst->empty_lebs = cpu_to_le32(c->lst.empty_lebs);
+ mst->idx_lebs = cpu_to_le32(c->lst.idx_lebs);
+ mst->leb_cnt = cpu_to_le32(c->leb_cnt);
+ mst->total_free = cpu_to_le64(c->lst.total_free);
+ mst->total_dirty = cpu_to_le64(c->lst.total_dirty);
+ mst->total_used = cpu_to_le64(c->lst.total_used);
+ mst->total_dead = cpu_to_le64(c->lst.total_dead);
+ mst->total_dark = cpu_to_le64(c->lst.total_dark);
+ mst->flags |= cpu_to_le32(UBIFS_MST_NO_ORPHS);
+
+ lnum = UBIFS_MST_LNUM;
+ err = ubifs_leb_unmap(c, lnum);
+ if (err)
+ goto out;
+ err = ubifs_write_node_hmac(c, mst, UBIFS_MST_NODE_SZ, lnum, 0,
+ offsetof(struct ubifs_mst_node, hmac));
+ if (err)
+ goto out;
+ lnum++;
+ err = ubifs_leb_unmap(c, lnum);
+ if (err)
+ goto out;
+ err = ubifs_write_node_hmac(c, mst, UBIFS_MST_NODE_SZ, lnum, 0,
+ offsetof(struct ubifs_mst_node, hmac));
+ if (err)
+ goto out;
+
+out:
+ kfree(mst);
+
+ return err;
+}
+
static int do_repair(struct ubifs_info *c)
{
int err = 0;
@@ -2463,6 +2534,12 @@ static int do_repair(struct ubifs_info *c)
goto out;

err = ubifs_clear_orphans(c);
+ if (err)
+ goto out;
+
+ /* Step 13. Write master node. */
+ ubifs_msg(c, "Step 13: Write master");
+ err = write_master(c);

out:
destroy_scanned_info(c, &si);
--
2.31.1


2023-12-28 01:42:10

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 16/17] ubifs: Enable ubifs_repair in '/sys/kernel/debug/ubifs/repair_fs'

Add new interface '/sys/kernel/debug/ubifs/repair_fs' to enable
ubifs repair.

Invoke UBIFS repair by:
echo UBIFS_DEV > /sys/kernel/debug/ubifs/repair_fs, UBIFS_DEV could be:
1. ubiX_Y: X means UBI device number and Y means UBI volume number.
For example: echo "ubi0_0" > /sys/kernel/debug/ubifs/repair_fs
2. /dev/ubiX_Y: X means UBI device number and Y means UBI volume number.
For example: echo "/dev/ubi0_0" > /sys/kernel/debug/ubifs/repair_fs
3. ubiX:NAME: X means UBI device number and NAME means UBI volume name.
For example: echo "ubi0:userdata" > /sys/kernel/debug/ubifs/repair_fs

Signed-off-by: Zhihao Cheng <[email protected]>
---
fs/ubifs/debug.c | 33 +++++++++++++++++++++++++++++++++
fs/ubifs/repair.h | 2 ++
2 files changed, 35 insertions(+)

diff --git a/fs/ubifs/debug.c b/fs/ubifs/debug.c
index 1fe180c22b96..e8d6e948c32c 100644
--- a/fs/ubifs/debug.c
+++ b/fs/ubifs/debug.c
@@ -22,6 +22,7 @@
#include <linux/random.h>
#include <linux/ctype.h>
#include "ubifs.h"
+#include "repair.h"

static DEFINE_SPINLOCK(dbg_lock);

@@ -2868,6 +2869,7 @@ static struct dentry *dfs_chk_orph;
static struct dentry *dfs_chk_lprops;
static struct dentry *dfs_chk_fs;
static struct dentry *dfs_tst_rcvry;
+static struct dentry *dfs_repair_fs;

static ssize_t dfs_global_file_read(struct file *file, char __user *u,
size_t count, loff_t *ppos)
@@ -2899,6 +2901,33 @@ static ssize_t dfs_global_file_write(struct file *file, const char __user *u,
struct dentry *dent = file->f_path.dentry;
int val;

+ if (dent == dfs_repair_fs) {
+ int ret, i;
+ size_t buf_size;
+ char *dev_name;
+
+ dev_name = vmalloc(PAGE_SIZE);
+ if (!dev_name)
+ return -ENOMEM;
+
+ buf_size = min_t(size_t, count, PAGE_SIZE - 1);
+ if (copy_from_user(dev_name, u, buf_size)) {
+ vfree(dev_name);
+ return -EFAULT;
+ }
+
+ /* Filter '\n' */
+ for (i = 0; i < buf_size; ++i)
+ if (dev_name[i] == '\n' || dev_name[i] == '\0')
+ break;
+ dev_name[i] = '\0';
+
+ ret = ubifs_repair(dev_name);
+ vfree(dev_name);
+
+ return ret < 0 ? ret : count;
+ }
+
val = interpret_user_input(u, count);
if (val < 0)
return val;
@@ -2965,6 +2994,10 @@ void dbg_debugfs_init(void)
fname = "tst_recovery";
dfs_tst_rcvry = debugfs_create_file(fname, S_IRUSR | S_IWUSR,
dfs_rootdir, NULL, &dfs_global_fops);
+
+ fname = "repair_fs";
+ dfs_repair_fs = debugfs_create_file(fname, S_IWUSR, dfs_rootdir,
+ NULL, &dfs_global_fops);
}

/**
diff --git a/fs/ubifs/repair.h b/fs/ubifs/repair.h
index f8d18f07e324..3dcf94787cbe 100644
--- a/fs/ubifs/repair.h
+++ b/fs/ubifs/repair.h
@@ -171,4 +171,6 @@ struct ubifs_repair_info {
bool need_update_lpt;
};

+int ubifs_repair(const char *dev_name);
+
#endif /* !__UBIFS_REPAIR_H__ */
--
2.31.1


2023-12-28 01:42:23

by Zhihao Cheng

[permalink] [raw]
Subject: [PATCH RFC 17/17] Documentation: ubifs: Add ubifs repair whitepaper

Add documentation for UBIFS repair.
Add 'ubifs' dir under 'Documentation/filesystems/', and move all docs
related to UBIFS into 'Documentation/filesystems/ubifs'.

Signed-off-by: Zhihao Cheng <[email protected]>
---
Documentation/filesystems/index.rst | 3 +-
.../authentication.rst} | 0
Documentation/filesystems/ubifs/index.rst | 11 +
.../filesystems/{ubifs.rst => ubifs/main.rst} | 0
Documentation/filesystems/ubifs/repair.rst | 235 ++++++++++++++++++
MAINTAINERS | 5 +-
6 files changed, 250 insertions(+), 4 deletions(-)
rename Documentation/filesystems/{ubifs-authentication.rst => ubifs/authentication.rst} (100%)
create mode 100644 Documentation/filesystems/ubifs/index.rst
rename Documentation/filesystems/{ubifs.rst => ubifs/main.rst} (100%)
create mode 100644 Documentation/filesystems/ubifs/repair.rst

diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst
index 09cade7eaefc..7a7e3c0a5289 100644
--- a/Documentation/filesystems/index.rst
+++ b/Documentation/filesystems/index.rst
@@ -116,8 +116,7 @@ Documentation for filesystem implementations.
sysfs
sysv-fs
tmpfs
- ubifs
- ubifs-authentication
+ ubifs/index
udf
virtiofs
vfat
diff --git a/Documentation/filesystems/ubifs-authentication.rst b/Documentation/filesystems/ubifs/authentication.rst
similarity index 100%
rename from Documentation/filesystems/ubifs-authentication.rst
rename to Documentation/filesystems/ubifs/authentication.rst
diff --git a/Documentation/filesystems/ubifs/index.rst b/Documentation/filesystems/ubifs/index.rst
new file mode 100644
index 000000000000..fba59a916e89
--- /dev/null
+++ b/Documentation/filesystems/ubifs/index.rst
@@ -0,0 +1,11 @@
+===============
+UBI File System
+===============
+
+
+.. toctree::
+ :maxdepth: 1
+
+ main
+ repair
+ authentication
diff --git a/Documentation/filesystems/ubifs.rst b/Documentation/filesystems/ubifs/main.rst
similarity index 100%
rename from Documentation/filesystems/ubifs.rst
rename to Documentation/filesystems/ubifs/main.rst
diff --git a/Documentation/filesystems/ubifs/repair.rst b/Documentation/filesystems/ubifs/repair.rst
new file mode 100644
index 000000000000..212fa886b1a1
--- /dev/null
+++ b/Documentation/filesystems/ubifs/repair.rst
@@ -0,0 +1,235 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. UBIFS Repairing
+.. sigma star gmbh
+.. 2023
+
+====================
+UBIFS Repair Support
+====================
+
+Introduction
+============
+
+UBIFS repair provides a way to fix inconsistent UBIFS image(which is corrupted
+by hardware exceptions or UBIFS realization bugs) and makes filesystem become
+consistent, just like fsck tools(eg. fsck.ext4, fsck.f2fs, fsck.fat, etc.) do.
+
+Why do we need UBIFS repair?
+============================
+
+The inconsistent UBIFS image is produced by mainly two aspects:
+
+- *Hardware exceptions*: Some of them are ecc uncorrectable errors(eg. [1]_
+ [2]_), some of them are caused by intermittent writing protection(unstable
+ voltage).
+- *UBIFS realization bugs*: Some of them are known bugs which are fixable (eg.
+ [3]_ [4]_ [5]_ [6]_ [7]_ [8]_ [9]_ [10]_ [11]_ [12]_, etc.), some of them are
+ unknown bugs(eg. [13]_), some of them are hard to fix(eg. [14]_).
+
+Once the UBIFS image becomes inconsistent, userspace applications won't work
+properly, as we all know, UBIFS is mainly applied in embedded system, which
+could affect many domains(eg. communications, IoT, family network, etc.). The
+only way to rescue device is formating UBI device and mkfs, which will lost all
+userdata, and it could be intolerable for some important situations.
+
+So, a filesystem repair tool is urgent for UBIFS, even it has been born for 15
+years, and it's not too late to do it for there will be more embedded devices in
+the future IOT world.
+
+Implementation
+==============
+
+Design
+------
+
+The direct idea of fixing an UBIFS image may be similar to mounting process:
+
+- Step 1: Read superblock.
+- Step 2: Read master node.
+- Step 3: Replay journal.
+- Step 4: Traverse TNC, check and drop bad znodes, scan files according to TNC.
+- Step 5: ...
+
+.. [LINK_1] This method has 3 disadvantages, and point 2 and 3 are unsolvable:
+
+- 1. It depends on too many areas, for example master, log. Repair will be
+ failed if each one of these areas becomes corrupted.
+- 2. The amount of files can be recovered is decided by the degree of corruption
+ in TNC. All files will be lost if the max level znode is corrupted.
+- 3. If we do step 3 before step 4 and TNC is corrupted, step 3 could be failed
+ while updating TNC, which makes repair failed. If we do step 4 before step 3
+ and gc occurred in last mounting, empty('0xFF') area could be scanned based on
+ TNC, the node corresponding to bad TNC branch could be a good one because the
+ empty area has been gced and journal replaying is not performed (TNC could be
+ updated after replaying jouranl); Or the node corresponding to bad TNC branch
+ could be a bad one, because hardware exceptions happened. So it is hard to
+ decide the order between journal replaying and TNC traversing. Similar order
+ problem exists between journal replaying and LPT traversing too.
+
+To sum up, UBIFS repair should depend on less meta areas and recover as much
+repairable data as possible.
+
+Since UBIFS assign sqnum for each node, it makes possible to distinguish new and
+old verions for same file, so the main idea is scanning all nodes and then
+rebuild TNC and recalculate space statistics informaion.
+
+Repair process
+--------------
+
+There are 13 steps for reparing an UBIFS image:
+
+- Step 1: Read superblock.
+- Step 2: Scan nodes(inode node/dentry node/data node/truncation node) from
+ all LEBs, corrupted nodes(eg. incorrect crc, bad inode size, bad dentry name
+ length, etc.) are dropped during scanning. Valid inode nodes(nlink > 0) and
+ dentry nodes(inum != 0) are put into two valid trees separately, also, deleted
+ inode nodes (nlink is 0) and deleted dentry nodes(inum is 0) are put into two
+ deleted trees separately. Other nodes are put into corresponding file. The
+ final recovered file(xattr is treated as a file) is organized as:
+
+::
+
+ (rbtree, inum indexed)
+ / \
+ file1 file2
+ / \
+ file3 file4
+
+ file {
+ inode node // each file has 1 inode node
+ dentry (sub rb_tree, sqnum indexed) // '/' has no dentries, otherwise
+ // at least 1 dentry is required.
+ trun node // the newest one truncation node
+ data (sub rb_tree, block number indexed) // Each file may have 0
+ // many data nodes
+ }
+
+- Step 3: Traverse nodes from deleted trees, remove inode nodes and dentry nodes
+ with smaller sqnum from valid trees. valid_inos - del_inos = *left_inos*,
+ valid_dents - del_dents = *left_dents*.
+
+.. [LINK_2] This step handles deleting case, for example, file A is deleted,
+ deleted inode node and deleted dentry node are written, if we ignore the
+ deleted nodes, file A can be recovered after repairing because undeleted
+ inode node and undeleted dentry node can be scanned. There's an exception,
+ if deleted inode node and deleted dentry node are reclaimed(by gc) after
+ deletion, file A is recovered.
+
+- Step 4: Traverse *left_inos* and *left_dents*, insert inode node and dentry
+ nodes into corresponding files.
+ Each file corresponds to a real file or xattr, it contains 1 inode node, multi
+ dentry nodes, multi data nodes, 1 newest truncation node(if exists).
+- Step 5: Drop invalid files. For example, nonconsistent file type between inode
+ node and dentry nodes, file has no dentry nodes(excepts '/'), encrypted file
+ has no xattr information, etc.
+- Step 6: Extract reachable directory entries tree. Make sure that all files can
+ be searched from '/', unreachable file is deleted.
+- Step 7: Correct the file information. Traverse all files and calculate
+ information (nlink, size, xattr_cnt, etc.) for each file just like
+ check_leaf() does, correct inode node based on calculated information.
+- Step 8: Record used LEBs. Traverse all files'(including effective nodes on
+ deleted trees in step 2) position, after this step UBIFS repair knows which
+ LEB is empty.
+- Step 9: Re-write data. Read data from LEB and write back data, make sure that
+ all LEB is ended with empty data(0xFF). It will prevent failed gc scanning in
+ next mounting.
+- Step 10: Build TNC. Construct TNC according to all files' nodes, just like
+ mkfs does, then write TNC on flash.
+- Step 11: Build LPT. Construct LPT according to all nodes' position, just like
+ mkfs does, then write TNC on flash.
+- Step 12: Clean up log area and orphan area. Recovery process is finished, log
+ area and orphan area can be erased.
+- Step 13: Write master node. Since all meta areas are ready, master node can
+ be updated.
+
+Evaluation
+==========
+
+Based on the implementation of ubifs repair, there are following advantages and
+limitations.
+
+Advantages
+----------
+
+- 1. Power-cut during repairing is tolerant, UBIFS repair always do full
+ scanning and final valid nodes won't be erased from flash, so UBIFS repair
+ can always rebuild filesystem based on scanning nodes even it is breaked by
+ power-cut.
+- 2. The UBIFS image can be fixed as long as the super block is not corrupted.
+ If there exists no valid nodes on flash, UBIFS repair will create a new '/'
+ just like create_default_filesystem() does.
+- 3. Encrypted UBIFS image is supported, because dentry name and data content of
+ file are not necessary for UBIFS repair process.
+
+Limitations
+-----------
+
+- 1. Deleted files could be recovered [LINK_2]_. Similar problem exists on data
+ nodes, for example, 8K-size file A(dn0, dn1) is truncated as 0, 4K data(dn2)
+ is then written from offset 4K in file A. The dn0 and dn2 is recovered after
+ repairing, but only dn2 is expected to be recovered. UBIFS repair cannot solve
+ it, because the real existence information of nodes depends on TNC, but TNC
+ should not be depended for UBIFS repair, see [LINK_1]_.
+- 2. All valid nodes are loaded in memory, if there are too many files, it could
+ trigger OOM while repairing. Similar problem exists in dbg_check_filesystem().
+- 3. Authenticated UBIFS image is not supported for now, it can be supported by
+ extending new fields in repair interface(used to passing authentication
+ information) and implementing parsing authenticated nodes.
+
+How to use?
+===========
+
+UBIFS repair is suggested to be invoked when UBIFS image becomes inconsistent,
+for example:
+
+- UBIFS image cannot be mounted caused by corrupted data(eg. bad crc, bad lpt,
+ bad tnc, bad master node, bad log area, etc.) or inconsistent meta data(eg.
+ inode size is smaller than data length, lost dentry node, lost inode node,
+ wrong nlink, etc.).
+- Errors occurr in file accessing/reading, and error messages are related to
+ inconsistent meta data.
+- UBIFS becomes readonly caused by inconsistent meta data(eg. assertion failure
+ on metadata checking).
+- Other problems related to inconsistent UBIFS image.
+
+Invoke UBIFS repair by: echo ``UBIFS_DEV`` > /sys/kernel/debug/ubifs/repair_fs,
+``UBIFS_DEV`` could be:
+
+- ubiX_Y: X means UBI device number and Y means UBI volume number.
+ For example: echo "ubi0_0" > /sys/kernel/debug/ubifs/repair_fs
+- /dev/ubiX_Y: X means UBI device number and Y means UBI volume number.
+ For example: echo "/dev/ubi0_0" > /sys/kernel/debug/ubifs/repair_fs
+- ubiX:NAME: X means UBI device number and NAME means UBI volume name.
+ For example: echo "ubi0:userdata" > /sys/kernel/debug/ubifs/repair_fs
+
+References
+==========
+
+.. [1] https://lore.kernel.org/linux-mtd/[email protected]/
+
+.. [2] https://lore.kernel.org/linux-mtd/CAMxq0fNSWrUFMmmTs8Ri9gFOvS+KQJvZN3-_KuiqXi9bbmCB0Q@mail.gmail.com/
+
+.. [3] https://lore.kernel.org/linux-mtd/[email protected]/
+
+.. [4] https://lore.kernel.org/linux-mtd/[email protected]/T/#u
+
+.. [5] https://lore.kernel.org/linux-mtd/[email protected]/
+
+.. [6] https://lore.kernel.org/linux-mtd/[email protected]/
+
+.. [7] https://lore.kernel.org/linux-mtd/[email protected]/
+
+.. [8] https://lore.kernel.org/linux-mtd/[email protected]/
+
+.. [9] https://lore.kernel.org/linux-mtd/[email protected]/
+
+.. [10] https://lore.kernel.org/linux-mtd/[email protected]/
+
+.. [11] https://lore.kernel.org/linux-mtd/[email protected]/
+
+.. [12] https://lore.kernel.org/linux-mtd/[email protected]/
+
+.. [13] https://linux-mtd.infradead.narkive.com/bfcHzD0j/ubi-ubifs-corruptions-during-random-power-cuts
+
+.. [14] https://bugzilla.kernel.org/show_bug.cgi?id=218309
diff --git a/MAINTAINERS b/MAINTAINERS
index 7cef2d2ef8d7..6703dfb10369 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -22161,8 +22161,9 @@ W: http://www.linux-mtd.infradead.org/doc/ubifs.html
T: git git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs.git next
T: git git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs.git fixes
F: Documentation/ABI/testing/sysfs-fs-ubifs
-F: Documentation/filesystems/ubifs-authentication.rst
-F: Documentation/filesystems/ubifs.rst
+F: Documentation/filesystems/ubifs/authentication.rst
+F: Documentation/filesystems/ubifs/main.rst
+F: Documentation/filesystems/ubifs/repair.rst
F: fs/ubifs/

UBLK USERSPACE BLOCK DRIVER
--
2.31.1


2023-12-29 10:06:15

by Richard Weinberger

[permalink] [raw]
Subject: Re: [PATCH RFC 00/17] ubifs: Add filesystem repair support

----- Ursprüngliche Mail -----
> Von: "chengzhihao1" <[email protected]>
> An: "david oberhollenzer" <[email protected]>, "richard" <[email protected]>, "Miquel Raynal"
> <[email protected]>, "Sascha Hauer" <[email protected]>, "Tudor Ambarus" <[email protected]>
> CC: "linux-kernel" <[email protected]>, "linux-mtd" <[email protected]>
> Gesendet: Donnerstag, 28. Dezember 2023 02:40:55
> Betreff: [PATCH RFC 00/17] ubifs: Add filesystem repair support

Thanks a lot for sharing this.
Below you find some thoughts that came into my mind while skimming over the
patch series.

> UBIFS repair provides a way to fix inconsistent UBIFS image(which is
> corrupted by hardware exceptions or UBIFS realization bugs) and makes
> filesystem become consistent, just like fsck tools(eg. fsck.ext4,
> fsck.f2fs, fsck.fat, etc.) do.

I don't fully agree. The tool makes UBIFS mount again but you still have lost data
and later userspace might fail because file no longer contain what the application
expected.
So my fear is that we're just shifting the problem one layer up.

UBIFS never had a fsck for reasons. UBIFS tries hard to not become inconsistent,
by maintaining a data journal for example.
It can fail of course by hardware issues. e.g. if the underlying MTD loses bits,
but there is nothing UBIFS can do except something like storing duplicates
of data like BTRFS does.

And finally, the biggest pain point, it can fail due to bugs in UBIFS itself.
In my opinion bugs should get addressed by improving our test infrastructure
instead of working around.

> About why do we need it, how it works, what it can fix or it can not
> fix, when and how to use it, see more details in
> Documentation/filesystems/ubifs/repair.rst (Patch 17).

This needs to go into the cover letter.

> Testing on UBIFS repair refers to
> https://bugzilla.kernel.org/show_bug.cgi?id=218327
>
> Whatever, we finally have a way to fix inconsistent UBFIS image instead
> of formatting UBI when UBIFS becomes inconsistent.

Fix in terms of making mount work again, I fear? As I said, most likely
the problem is now one layer above. UBIFS thinks everything is good but
userspace suddenly will see old/incomplete files.

What I can think of is a tool (in userspace like other fscks) which
can recover certain UBIFS structures but makes very clear to the user what
the data is lost. e.g. that inode XY now misses some blocks or an old version
of something will be used.
But this isl nothing you can run blindly in production.

Thanks,
//richard

2023-12-29 13:10:17

by Zhihao Cheng

[permalink] [raw]
Subject: Re: [PATCH RFC 00/17] ubifs: Add filesystem repair support

在 2023/12/29 18:06, Richard Weinberger 写道:
> ----- Ursprüngliche Mail -----
>> Von: "chengzhihao1" <[email protected]>
>> An: "david oberhollenzer" <[email protected]>, "richard" <[email protected]>, "Miquel Raynal"
>> <[email protected]>, "Sascha Hauer" <[email protected]>, "Tudor Ambarus" <[email protected]>
>> CC: "linux-kernel" <[email protected]>, "linux-mtd" <[email protected]>
>> Gesendet: Donnerstag, 28. Dezember 2023 02:40:55
>> Betreff: [PATCH RFC 00/17] ubifs: Add filesystem repair support
> Thanks a lot for sharing this.
> Below you find some thoughts that came into my mind while skimming over the
> patch series.
>
>> UBIFS repair provides a way to fix inconsistent UBIFS image(which is
>> corrupted by hardware exceptions or UBIFS realization bugs) and makes
>> filesystem become consistent, just like fsck tools(eg. fsck.ext4,
>> fsck.f2fs, fsck.fat, etc.) do.
> I don't fully agree. The tool makes UBIFS mount again but you still have lost data
> and later userspace might fail because file no longer contain what the application
> expected.
> So my fear is that we're just shifting the problem one layer up.
>
> UBIFS never had a fsck for reasons. UBIFS tries hard to not become inconsistent,
> by maintaining a data journal for example.
> It can fail of course by hardware issues. e.g. if the underlying MTD loses bits,
> but there is nothing UBIFS can do except something like storing duplicates
> of data like BTRFS does.
>
> And finally, the biggest pain point, it can fail due to bugs in UBIFS itself.
> In my opinion bugs should get addressed by improving our test infrastructure
> instead of working around.

I make UBIFS repair for two reasons:

1. There have been many inconsistent problems happened in our
products(40+ per year), and reasons for most of them are unknown, I even
can't judge the problem is caused by UBIFS bug or hardware exception.
The consistent problems, for example, TNC points to an empty space, TNC
points to an unexpected node, bad key order in znodes, dirty space of
pnode becomes greater than LEB size, huge number in
master->total_dead(looks like overflow), etc. I cannot send these bad
images to find help, because the corporate policy. Our kernel version is
new, and latest bugfixs in linux-mainline are picked in time. I have
looked though journal/recovery UBIFS subsystem dozens of times, the
implementation looks good, except one problem[2]. And we have do many
powercut/faul-injection tests for ubifs, and Zhaolong has published our
fault-injection implementation in [3], the result is that
journal/recovery UBIFS subsystem does look sturdy.

2. If there exists a fsck tool, user have one more option to deal with
inconsistent UBIFS image. UBIFS is mainly applied in embeded system,
making filesystem available is most important when filesystem becomes
inconsistent in some situations.

[1]
https://linux-mtd.infradead.narkive.com/bfcHzD0j/ubi-ubifs-corruptions-during-random-power-cuts

[2] https://bugzilla.kernel.org/show_bug.cgi?id=218309

[3] https://patchwork.ozlabs.org/project/linux-mtd/list/?series=388034

I'm not sure whether you prefer a fsck tool, in my opinion, fsck just
provide a way for userspace to fix filesystem, user can choose invoke it
or not according to the tool's limitations based on specific situation.
But according to your following reply, I guess you can accept that UBIFS
can have a fsck, and fsck should let user known which file is recovered
incomplete, which file is old, rather than just make filesystem become
consistent.

>
>> About why do we need it, how it works, what it can fix or it can not
>> fix, when and how to use it, see more details in
>> Documentation/filesystems/ubifs/repair.rst (Patch 17).
> This needs to go into the cover letter.
OK, thanks for reminding.
>
>> Testing on UBIFS repair refers to
>> https://bugzilla.kernel.org/show_bug.cgi?id=218327
>>
>> Whatever, we finally have a way to fix inconsistent UBFIS image instead
>> of formatting UBI when UBIFS becomes inconsistent.
> Fix in terms of making mount work again, I fear? As I said, most likely
> the problem is now one layer above. UBIFS thinks everything is good but
> userspace suddenly will see old/incomplete files.
>
> What I can think of is a tool (in userspace like other fscks) which
> can recover certain UBIFS structures but makes very clear to the user what
> the data is lost. e.g. that inode XY now misses some blocks or an old version
> of something will be used.
> But this isl nothing you can run blindly in production.

Let me see.

First, we have a common view, fsck tool is valuable for UBIFS, it just
provide a way for user application to make UBIFS be consistent and
available. Right?

Second, you concern odd/incomplete files are recovered just like I
metioned in documentation(Limitations section), which still make
application failed because the recovered file lost data or deleted file
is recovered. So you suggested me that make a userspace fsck tool, and
fsck can telll user which file is data lost, which file is recovered
after deletion.

The difficulty comes from second point,  how does fsck know a file is
recovered incomplete or old. Whether the node is existing, it is judged
by TNC, but TNC could be damaged like I descibed in above. Do you have
any ideas?

Besides, we get incomplete file because some data nodes are corrupted,
the corrupted data is printed in dbg msg when it is dropped.


2023-12-29 21:09:22

by Richard Weinberger

[permalink] [raw]
Subject: Re: [PATCH RFC 00/17] ubifs: Add filesystem repair support

----- Ursprüngliche Mail -----
> Von: "chengzhihao1" <[email protected]>
> I make UBIFS repair for two reasons:
>
> 1. There have been many inconsistent problems happened in our
> products(40+ per year), and reasons for most of them are unknown, I even
> can't judge the problem is caused by UBIFS bug or hardware exception.
> The consistent problems, for example, TNC points to an empty space, TNC
> points to an unexpected node, bad key order in znodes, dirty space of
> pnode becomes greater than LEB size, huge number in
> master->total_dead(looks like overflow), etc. I cannot send these bad
> images to find help, because the corporate policy. Our kernel version is
> new, and latest bugfixs in linux-mainline are picked in time. I have

Regarding company policy, we could implement a tool which dumps just UBIFS'
meta data (no data node content nor filenames). ext4 has such a tool to
exchange faulty filesystems.
Another option is, in case you want some else looking into the issue,
asking a contractor like me. Usually signing a NDA is not a big deal.

In any case, I'm keen to look into this issues. But I fear
we need more testing to find the root cause, if they are caused by UBIFS bugs.

> looked though journal/recovery UBIFS subsystem dozens of times, the
> implementation looks good, except one problem[2]. And we have do many
> powercut/faul-injection tests for ubifs, and Zhaolong has published our
> fault-injection implementation in [3], the result is that
> journal/recovery UBIFS subsystem does look sturdy.

I came to the same conclusion after digging through the code more than once. :-)

> 2. If there exists a fsck tool, user have one more option to deal with
> inconsistent UBIFS image. UBIFS is mainly applied in embeded system,
> making filesystem available is most important when filesystem becomes
> inconsistent in some situations.

This is the point where I'm more sceptical.
Please see my comments below.

> [1]
> https://linux-mtd.infradead.narkive.com/bfcHzD0j/ubi-ubifs-corruptions-during-random-power-cuts
>
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=218309
>
> [3] https://patchwork.ozlabs.org/project/linux-mtd/list/?series=388034
>
> I'm not sure whether you prefer a fsck tool, in my opinion, fsck just
> provide a way for userspace to fix filesystem, user can choose invoke it
> or not according to the tool's limitations based on specific situation.
> But according to your following reply, I guess you can accept that UBIFS
> can have a fsck, and fsck should let user known which file is recovered
> incomplete, which file is old, rather than just make filesystem become
> consistent.

I see three different functions:

1. Online scrubbing

A feature which can check all UBIFS structures while UBIFS is mounted
and tell what's wrong. We have this already more or less ready, the chk_fs
debugfs knob.

2. Online repair

Like XFS online repair, this feature allows UBIFS to fix data structures by
*re-computing* them from other structures without loosing data nor violating
file contents consistency.
E.g. if a data node vanished, it can do nothing. Fixing the index tree
will make UBIFS no longer fail but userspace will be unhappy if a file
has suddenly a hole or is truncated.
On the other hand, a disconnected inode could be linked into a lost+found
folder or re-computing the LPT tree (I'm still not sure about the LPT).
Same for updating link counters, etc...

3. Offline repair

This is the classical fsck. It can do everything what 1) and 2) can do plus
dangerous operations like re-building the index tree by scanning for
UBIFS nodes on the media.

Re-building the index tree is dangerous because file *contents* can be
inconsistent later. If for example a whole LEB is lost, a file can
contain a mixture of old and new data blocks. For a text file this is
not always fatal. For a database it is.

But UBIFS itself will be consistent again, will mount and not render
read-only all of a sudden.

>>
>>> About why do we need it, how it works, what it can fix or it can not
>>> fix, when and how to use it, see more details in
>>> Documentation/filesystems/ubifs/repair.rst (Patch 17).
>> This needs to go into the cover letter.
> OK, thanks for reminding.
>>
>>> Testing on UBIFS repair refers to
>>> https://bugzilla.kernel.org/show_bug.cgi?id=218327
>>>
>>> Whatever, we finally have a way to fix inconsistent UBFIS image instead
>>> of formatting UBI when UBIFS becomes inconsistent.
>> Fix in terms of making mount work again, I fear? As I said, most likely
>> the problem is now one layer above. UBIFS thinks everything is good but
>> userspace suddenly will see old/incomplete files.
>>
>> What I can think of is a tool (in userspace like other fscks) which
>> can recover certain UBIFS structures but makes very clear to the user what
>> the data is lost. e.g. that inode XY now misses some blocks or an old version
>> of something will be used.
>> But this isl nothing you can run blindly in production.
>
> Let me see.
>
> First, we have a common view, fsck tool is valuable for UBIFS, it just
> provide a way for user application to make UBIFS be consistent and
> available. Right?

Yes. David Oberhollenzer and I will happily help with implementing, testing and
reviewing code.

> Second, you concern odd/incomplete files are recovered just like I
> metioned in documentation(Limitations section), which still make
> application failed because the recovered file lost data or deleted file
> is recovered. So you suggested me that make a userspace fsck tool, and
> fsck can telll user which file is data lost, which file is recovered
> after deletion.
>
> The difficulty comes from second point,  how does fsck know a file is
> recovered incomplete or old. Whether the node is existing, it is judged
> by TNC, but TNC could be damaged like I descibed in above. Do you have
> any ideas?

That's the problem what all fsck tools have in common.
The best we can do is offering safe and dangerous repair modes
plus a good repair report.

Long story short, I'm not opposed to the idea, I just want to make
sure that this new tool or feature is not used blindly, since
it cannot do magic.

Thanks,
//richard

2024-01-02 10:08:36

by Zhihao Cheng

[permalink] [raw]
Subject: Re: [PATCH RFC 00/17] ubifs: Add filesystem repair support

在 2023/12/30 5:08, Richard Weinberger 写道:
>> Second, you concern odd/incomplete files are recovered just like I
>> metioned in documentation(Limitations section), which still make
>> application failed because the recovered file lost data or deleted file
>> is recovered. So you suggested me that make a userspace fsck tool, and
>> fsck can telll user which file is data lost, which file is recovered
>> after deletion.
>>
>> The difficulty comes from second point,  how does fsck know a file is
>> recovered incomplete or old. Whether the node is existing, it is judged
>> by TNC, but TNC could be damaged like I descibed in above. Do you have
>> any ideas?
> That's the problem what all fsck tools have in common.
> The best we can do is offering safe and dangerous repair modes
> plus a good repair report.
>

I come up with another way to implement fsck.ubifs:

There are three modes:

1. common mode(no options): Ask user whether to fix as long as a problem
is detected.

2. safe mode(-a option): Auto repair as long as no data/files lost(eg.
nlink, isize, xattr_cnt, which can be corrected without dropping nodes),
otherwise returns error code.

3. dangerous mode(-y option): Fix is always successful, unless
superblock is corrupted. There are 2 situations:

a) TNC is valid: fsck will print which file is dropped and which
file's data is dropped

b) TNC is invalid: fsck will scan all nodes without referencing TNC,
same as this patchset does


Q1: How do you think of this method?

Q2: Mode 1, 2 and 3(a) depend on journal replaying, I found
xfs_repair[1] should be used after mounting/unmounting xfs (Let kernel
replay journal), if UBIFS does so, there is no need to copy recovery
subsystem into userspace, but user has to mount/unmount before doing
fsck. I found e2fsck has copied recovery code into userspace, so it can
do fsck without mounting/unmounting. If UBIFS does so, journal replaying
will update TNC and LPT, please reference Q3(1). Which method do you
suggest?

Q3: If fsck drops or updates a node(eg. dentry lost inode, corrected
inode) in mode 1,2 and 3(a), TNC/LPT should be updated. There are two
ways updating TNC and LPT:

1) Like kernel does, which means that mark dirty TNC/LPT for
corresponding branches and do_commit(). It will copy much code into
userspace, eg. tnc.c, lpt.c, tnc_commit.c,
lpt_commit.c. I fear there exists risks. For example, there is no space
left for new index nodes, gc should be performed? If so, gc/lpt gc code
should be copied too.

2) Rebuild new TNC/LPT based on valid nodes. This way is simple, but
old good TNC could be corrupted, it means that powercut during fsck may
let UBIFS image must be repaired in mode 3(b) but it could be repaired
in mode 2\3(a) before invoking fsck.

Which way is better?


[1]
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_file_systems/checking-and-repairing-a-file-system__managing-file-systems#proc_repairing-an-xfs-file-system-with-xfs_repair_checking-and-repairing-a-file-system

> Long story short, I'm not opposed to the idea, I just want to make
> sure that this new tool or feature is not used blindly, since
> it cannot do magic.


2024-01-02 20:45:46

by Richard Weinberger

[permalink] [raw]
Subject: Re: [PATCH RFC 00/17] ubifs: Add filesystem repair support

----- Ursprüngliche Mail -----
> Von: "chengzhihao1" <[email protected]>
> I come up with another way to implement fsck.ubifs:
>
> There are three modes:
>
> 1. common mode(no options): Ask user whether to fix as long as a problem
> is detected.

Makes sense.

> 2. safe mode(-a option): Auto repair as long as no data/files lost(eg.
> nlink, isize, xattr_cnt, which can be corrected without dropping nodes),
> otherwise returns error code.

Makes sense.

> 3. dangerous mode(-y option): Fix is always successful, unless
> superblock is corrupted. There are 2 situations:

Please not use "-y". Usually "-y" stands for "answer yes to all questions".
But selecting names for command line flags is currently my least concern.

> a) TNC is valid: fsck will print which file is dropped and which
> file's data is dropped

Sounds also good.

> b) TNC is invalid: fsck will scan all nodes without referencing TNC,
> same as this patchset does

I'd make this a distinct mode.
It can be used to rebuild index and LEB property trees.
This is basically a "drop trees and rebuild" mode.

>
> Q1: How do you think of this method?

Sounds good so far.

> Q2: Mode 1, 2 and 3(a) depend on journal replaying, I found
> xfs_repair[1] should be used after mounting/unmounting xfs (Let kernel
> replay journal), if UBIFS does so, there is no need to copy recovery
> subsystem into userspace, but user has to mount/unmount before doing
> fsck. I found e2fsck has copied recovery code into userspace, so it can
> do fsck without mounting/unmounting. If UBIFS does so, journal replaying
> will update TNC and LPT, please reference Q3(1). Which method do you
> suggest?

UBIFS is a little special regarding the journal.

1. The journal is not an add-on like it is on many other file systems,
you have to replay it to get a consistent file system.
2. Journal replay is also needed after a clean umount. The reason is that
UBIFS does no commit at umount time.

So IMHO you need to have journal replay code in your tool in any case.
You can copy it from the kernel implementation and add more checks.
While the kernel code also tries to be fast, fsck should be more paranoid.
Ideally the outcome is a libubifs or such which can be derived from the
kernel source while building mtd-utils.

> Q3: If fsck drops or updates a node(eg. dentry lost inode, corrected
> inode) in mode 1,2 and 3(a), TNC/LPT should be updated. There are two
> ways updating TNC and LPT:
>
> 1) Like kernel does, which means that mark dirty TNC/LPT for
> corresponding branches and do_commit(). It will copy much code into
> userspace, eg. tnc.c, lpt.c, tnc_commit.c,
> lpt_commit.c. I fear there exists risks. For example, there is no space
> left for new index nodes, gc should be performed? If so, gc/lpt gc code
> should be copied too.
>
> 2) Rebuild new TNC/LPT based on valid nodes. This way is simple, but
> old good TNC could be corrupted, it means that powercut during fsck may
> let UBIFS image must be repaired in mode 3(b) but it could be repaired
> in mode 2\3(a) before invoking fsck.
>
> Which way is better?

Since you need to do a full journal replay anyway and power-cut awareness
is one of your requirements, I fear the only option is 1).

Thanks,
//richard

2024-01-03 03:18:59

by Zhihao Cheng

[permalink] [raw]
Subject: Re: [PATCH RFC 00/17] ubifs: Add filesystem repair support

在 2024/1/3 4:45, Richard Weinberger 写道:
> ----- Ursprüngliche Mail -----
>> Von: "chengzhihao1" <[email protected]>
>> I come up with another way to implement fsck.ubifs:
>>
>> There are three modes:
>>
>> 1. common mode(no options): Ask user whether to fix as long as a problem
>> is detected.
>
> Makes sense.
>
>> 2. safe mode(-a option): Auto repair as long as no data/files lost(eg.
>> nlink, isize, xattr_cnt, which can be corrected without dropping nodes),
>> otherwise returns error code.
>
> Makes sense.
>
>> 3. dangerous mode(-y option): Fix is always successful, unless
>> superblock is corrupted. There are 2 situations:
>
> Please not use "-y". Usually "-y" stands for "answer yes to all questions".
> But selecting names for command line flags is currently my least concern.
>
>> a) TNC is valid: fsck will print which file is dropped and which
>> file's data is dropped
>
> Sounds also good.
>
>> b) TNC is invalid: fsck will scan all nodes without referencing TNC,
>> same as this patchset does
>
> I'd make this a distinct mode.
> It can be used to rebuild index and LEB property trees.
> This is basically a "drop trees and rebuild" mode.
>

OK, then fsck will have four modes.

>>
>> Q1: How do you think of this method?
>
> Sounds good so far.
>
>> Q2: Mode 1, 2 and 3(a) depend on journal replaying, I found
>> xfs_repair[1] should be used after mounting/unmounting xfs (Let kernel
>> replay journal), if UBIFS does so, there is no need to copy recovery
>> subsystem into userspace, but user has to mount/unmount before doing
>> fsck. I found e2fsck has copied recovery code into userspace, so it can
>> do fsck without mounting/unmounting. If UBIFS does so, journal replaying
>> will update TNC and LPT, please reference Q3(1). Which method do you
>> suggest?
>
> UBIFS is a little special regarding the journal.
>
> 1. The journal is not an add-on like it is on many other file systems,
> you have to replay it to get a consistent file system.
> 2. Journal replay is also needed after a clean umount. The reason is that
> UBIFS does no commit at umount time.

I agree, there exists one situation that UBIFS replays journal even
after clean umount.
P1 ubifs_bgt umount
mkdir
run_bg_commit
c->cmt_state = COMMIT_RUNNING_BACKGROUND
do_commit
ubifs_log_start_commit(c, &new_ltail_lnum) // log start
up_write(&c->commit_sem)
touch
ubifs_jnl_update // new buds added
cleanup_mnt
deactivate_super
fs->kill_sb
generic_shutdown_super
sync_filesystem
ubifs_sync_fs
ubifs_run_commit
wait_for_commit // wait bg commit,
'touch' won't be commited, it will be replayed in next mount

>
> So IMHO you need to have journal replay code in your tool in any case.
> You can copy it from the kernel implementation and add more checks.
> While the kernel code also tries to be fast, fsck should be more paranoid.
> Ideally the outcome is a libubifs or such which can be derived from the
> kernel source while building mtd-utils.

All right, sounds like a huge copy work.

>
>> Q3: If fsck drops or updates a node(eg. dentry lost inode, corrected
>> inode) in mode 1,2 and 3(a), TNC/LPT should be updated. There are two
>> ways updating TNC and LPT:
>>
>> 1) Like kernel does, which means that mark dirty TNC/LPT for
>> corresponding branches and do_commit(). It will copy much code into
>> userspace, eg. tnc.c, lpt.c, tnc_commit.c,
>> lpt_commit.c. I fear there exists risks. For example, there is no space
>> left for new index nodes, gc should be performed? If so, gc/lpt gc code
>> should be copied too.
>>
>> 2) Rebuild new TNC/LPT based on valid nodes. This way is simple, but
>> old good TNC could be corrupted, it means that powercut during fsck may
>> let UBIFS image must be repaired in mode 3(b) but it could be repaired
>> in mode 2\3(a) before invoking fsck.
>>
>> Which way is better?
>
> Since you need to do a full journal replay anyway and power-cut awareness
> is one of your requirements, I fear the only option is 1). >
> Thanks,
> //richard
> .
>


2024-01-03 12:45:10

by Zhihao Cheng

[permalink] [raw]
Subject: Re: [PATCH RFC 00/17] ubifs: Add filesystem repair support

在 2024/1/3 11:18, Zhihao Cheng 写道:
> 在 2024/1/3 4:45, Richard Weinberger 写道:
>> ----- Ursprüngliche Mail -----
>>> Von: "chengzhihao1" <[email protected]>
>>> I come up with another way to implement fsck.ubifs:
>>>
>>> There are three modes:
>>>
>>> 1. common mode(no options): Ask user whether to fix as long as a problem
>>> is detected.
>>
>> Makes sense.
>>
>>> 2. safe mode(-a option): Auto repair as long as no data/files lost(eg.
>>> nlink, isize, xattr_cnt, which can be corrected without dropping nodes),
>>> otherwise returns error code.
>>
>> Makes sense.
>>> 3. dangerous mode(-y option): Fix is always successful, unless
>>> superblock is corrupted. There are 2 situations:
>>
>> Please not use "-y". Usually "-y" stands for "answer yes to all
>> questions".
>> But selecting names for command line flags is currently my least concern.
>>>    a) TNC is valid: fsck will print which file is dropped and which
>>> file's data is dropped
>>
>> Sounds also good.
>>>    b) TNC is invalid: fsck will scan all nodes without referencing TNC,
>>> same as this patchset does
>>
>> I'd make this a distinct mode.
>> It can be used to rebuild index and LEB property trees.
>> This is basically a "drop trees and rebuild" mode.
>>
>
> OK, then fsck will have four modes.

How about merging 3(a) and 3(b) as one mode(dangerous mode)? If fsck can
get a good TNC(all non-leaf index nodes are valid), fsck executes as
3(a) describes. If fsck cannot find a good TNC, fsck executes as 3(b)
and reminds user that "TNC is damaged, nodes dropping is not awared".


2024-01-03 12:56:16

by Richard Weinberger

[permalink] [raw]
Subject: Re: [PATCH RFC 00/17] ubifs: Add filesystem repair support

----- Ursprüngliche Mail -----
> Von: "chengzhihao1" <[email protected]>
> How about merging 3(a) and 3(b) as one mode(dangerous mode)? If fsck can
> get a good TNC(all non-leaf index nodes are valid), fsck executes as
> 3(a) describes. If fsck cannot find a good TNC, fsck executes as 3(b)
> and reminds user that "TNC is damaged, nodes dropping is not awared".

Well, you can make all modes combinable.
Right now I don't care much about the user interface.
But offering much flexibility is a worthwhile goal.

At the end it should be crystal clear to the user of fsck.ubifs whether
it fixed the file system by applying dangerous methods or not.

Want I want to avoid by all means is a tool which blindly alters
the filesystem just to stop UBIFS complaining about it.

Thanks,
//richard

2024-01-03 13:34:02

by Richard Weinberger

[permalink] [raw]
Subject: Re: [PATCH RFC 00/17] ubifs: Add filesystem repair support

----- Ursprüngliche Mail -----
> Von: "chengzhihao1" <[email protected]>
>> 2. Journal replay is also needed after a clean umount. The reason is that
>> UBIFS does no commit at umount time.
>
> I agree, there exists one situation that UBIFS replays journal even
> after clean umount.
> P1 ubifs_bgt umount
> mkdir
> run_bg_commit
> c->cmt_state = COMMIT_RUNNING_BACKGROUND
> do_commit
> ubifs_log_start_commit(c, &new_ltail_lnum) // log start
> up_write(&c->commit_sem)
> touch
> ubifs_jnl_update // new buds added
> cleanup_mnt
> deactivate_super
> fs->kill_sb
> generic_shutdown_super
> sync_filesystem
> ubifs_sync_fs
> ubifs_run_commit
> wait_for_commit // wait bg commit,
> 'touch' won't be commited, it will be replayed in next mount

BTW: I was imprecise, sorry for that.
The issue is that even after a commit you need to replay the journal.

Thanks,
//richard