Subject: Re: [PATCH RFC 00/17] ubifs: Add filesystem repair support
From: Zhihao Cheng
To: Richard Weinberger
CC: david oberhollenzer, Miquel Raynal, Sascha Hauer, Tudor Ambarus,
    linux-kernel, linux-mtd
Date: Fri, 29 Dec 2023 21:09:50 +0800
Message-ID: <13b259ca-b32f-a8d6-5e11-8bb38df72f5c@huawei.com>
In-Reply-To: <1145531757.175508.1703844362355.JavaMail.zimbra@nod.at>
References: <20231228014112.2836317-1-chengzhihao1@huawei.com>
    <1145531757.175508.1703844362355.JavaMail.zimbra@nod.at>

On 2023/12/29 18:06, Richard Weinberger wrote:
> ----- Original Message -----
>> From: "chengzhihao1"
>> To: "david oberhollenzer", "richard", "Miquel Raynal",
>> "Sascha Hauer", "Tudor Ambarus"
>> CC: "linux-kernel", "linux-mtd"
>> Sent: Thursday, 28 December 2023 02:40:55
>> Subject: [PATCH RFC 00/17] ubifs: Add filesystem repair support
> Thanks a lot for sharing this.
> Below you find some thoughts that came into my mind while skimming over the
> patch series.
>
>> UBIFS repair provides a way to fix inconsistent UBIFS image (which is
>> corrupted by hardware exceptions or UBIFS realization bugs) and makes
>> filesystem become consistent, just like fsck tools (e.g. fsck.ext4,
>> fsck.f2fs, fsck.fat, etc.) do.
> I don't fully agree.
> The tool makes UBIFS mount again but you still have lost data
> and later userspace might fail because files no longer contain what the
> application expected.
> So my fear is that we're just shifting the problem one layer up.
>
> UBIFS never had a fsck for reasons. UBIFS tries hard to not become inconsistent,
> by maintaining a data journal for example.
> It can fail of course by hardware issues, e.g. if the underlying MTD loses bits,
> but there is nothing UBIFS can do except something like storing duplicates
> of data like BTRFS does.
>
> And finally, the biggest pain point, it can fail due to bugs in UBIFS itself.
> In my opinion bugs should get addressed by improving our test infrastructure
> instead of working around.

I wrote UBIFS repair for two reasons:

1. We have seen many inconsistency problems in our products (40+ per
   year), and the cause of most of them is unknown. I cannot even tell
   whether a given problem was caused by a UBIFS bug or by a hardware
   fault. Examples of the inconsistencies: TNC points to an empty space,
   TNC points to an unexpected node, bad key order in znodes, dirty
   space of a pnode becomes greater than the LEB size, a huge number in
   master->total_dead (looks like an overflow), etc. I cannot send these
   bad images out to ask for help, because of corporate policy. Our
   kernel version is new, and the latest bugfixes from linux-mainline
   are picked up in time. I have looked through the UBIFS
   journal/recovery subsystem dozens of times, and the implementation
   looks good, except for one problem[2]. We have also done many
   powercut/fault-injection tests on UBIFS, and Zhaolong has published
   our fault-injection implementation in [3]. The result is that the
   journal/recovery subsystem does look sturdy.

2. If a fsck tool exists, users have one more option for dealing with an
   inconsistent UBIFS image. UBIFS is mainly used in embedded systems,
   where in some situations the most important thing is to make the
   filesystem available again once it has become inconsistent.
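To make two of the symptoms from point 1 concrete (bad key order in a
znode, and dirty space larger than the LEB size), here is a minimal
sketch in C of what such checks could look like. Everything in it is an
assumption for illustration: struct sketch_key and the sketch_*
functions are hypothetical, simplified stand-ins, not the kernel's UBIFS
definitions and not code from the patch series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified key: NOT the kernel's union ubifs_key. */
struct sketch_key {
	uint32_t inum; /* inode number */
	uint32_t rest; /* stands in for node type plus block number or hash */
};

/* Returns <0, 0 or >0, ordering by inode number first, then by 'rest'. */
int sketch_key_cmp(const struct sketch_key *a, const struct sketch_key *b)
{
	if (a->inum != b->inum)
		return a->inum < b->inum ? -1 : 1;
	if (a->rest != b->rest)
		return a->rest < b->rest ? -1 : 1;
	return 0;
}

/*
 * Returns the index of the first key that sorts before its predecessor,
 * or -1 if the array is in non-decreasing order. Equal neighbours are
 * tolerated here, on the assumption that hash collisions can produce them.
 */
int sketch_first_bad_key(const struct sketch_key *keys, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (sketch_key_cmp(&keys[i - 1], &keys[i]) > 0)
			return (int)i;
	return -1;
}

/*
 * Per-LEB space accounting: free and dirty bytes must each fit in the
 * LEB, and so must their sum. Written to avoid unsigned overflow.
 */
bool sketch_lprops_sane(uint32_t free_bytes, uint32_t dirty_bytes,
			uint32_t leb_size)
{
	return free_bytes <= leb_size && dirty_bytes <= leb_size - free_bytes;
}
```

Checks like these only detect an inconsistency; they say nothing about
whether it came from a UBIFS bug or from the hardware, which is the open
question above.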
[1] https://linux-mtd.infradead.narkive.com/bfcHzD0j/ubi-ubifs-corruptions-during-random-power-cuts
[2] https://bugzilla.kernel.org/show_bug.cgi?id=218309
[3] https://patchwork.ozlabs.org/project/linux-mtd/list/?series=388034

I'm not sure whether you would prefer to have a fsck tool. In my
opinion, fsck just provides a way for userspace to fix the filesystem;
the user can choose whether to invoke it, based on the tool's
limitations and the specific situation. But judging from your reply
below, I guess you can accept that UBIFS has a fsck, provided that fsck
lets the user know which files were recovered incomplete and which files
are old, rather than just making the filesystem consistent.

>
>> About why do we need it, how it works, what it can fix or it can not
>> fix, when and how to use it, see more details in
>> Documentation/filesystems/ubifs/repair.rst (Patch 17).
> This needs to go into the cover letter.

OK, thanks for the reminder.

>
>> Testing on UBIFS repair refers to
>> https://bugzilla.kernel.org/show_bug.cgi?id=218327
>>
>> Whatever, we finally have a way to fix inconsistent UBIFS image instead
>> of formatting UBI when UBIFS becomes inconsistent.
> Fix in terms of making mount work again, I fear? As I said, most likely
> the problem is now one layer above. UBIFS thinks everything is good but
> userspace suddenly will see old/incomplete files.
>
> What I can think of is a tool (in userspace like other fscks) which
> can recover certain UBIFS structures but makes very clear to the user what
> data is lost, e.g. that inode XY now misses some blocks or an old version
> of something will be used.
> But this is nothing you can run blindly in production.

Let me see. First, we share a common view: a fsck tool is valuable for
UBIFS, as it provides a way for user applications to make UBIFS
consistent and available again. Right?
Second, you are concerned that old/incomplete files get recovered, as I
mentioned in the documentation (Limitations section), which can still
make applications fail because a recovered file has lost data or a
deleted file has been recovered. So you suggest that I make a userspace
fsck tool that can tell the user which files have lost data and which
files were recovered after deletion.

The difficulty comes from the second point: how does fsck know that a
file was recovered incomplete or old? Whether a node exists is judged by
the TNC, but the TNC itself could be damaged, as I described above. Do
you have any ideas?

Besides, we get incomplete files because some data nodes are corrupted;
the corrupted data is printed in a dbg msg when it is dropped.
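One possible starting point for the "incomplete file" question, offered
only as a sketch under stated assumptions and not as what the patch
series does: compare the inode size with the data blocks that were
actually found, and report the blocks that have no data node. The
4096-byte block size, the found[] bitmap and the function name are
assumptions made for illustration. A missing block may also be a
legitimate hole in a sparse file, and the inode size itself has to be
trusted, so the result is a hint for the user rather than proof of data
loss.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE 4096u /* assumed data block size */

/*
 * found[i] is true when a data node for block i of the file was located.
 * Returns how many blocks below inode_size have no data node. Blocks at
 * or beyond found_len are treated as not found.
 */
size_t sketch_count_missing_blocks(uint64_t inode_size, const bool *found,
				   size_t found_len)
{
	/* Round up without risking overflow on very large sizes. */
	uint64_t nblocks = inode_size / SKETCH_BLOCK_SIZE +
			   (inode_size % SKETCH_BLOCK_SIZE != 0);
	size_t missing = 0;

	for (uint64_t i = 0; i < nblocks; i++)
		if (i >= found_len || !found[i])
			missing++;
	return missing;
}
```

A non-zero count could be printed per inode so that the user sees which
files are suspect. It does not help with the "old file" case, and it is
only as reliable as the index used to fill found[].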