Subject: Re: [PATCH] e2fsprogs: Try again to solve unreliable io case
To: Theodore Ts'o
CC: Haotian Li, Ext4 Developers List, "harshad shirwadkar,", linfeilong
From: Zhiqiang Liu
Message-ID: <6bc8c1c2-9fff-bef9-c6f3-b2256a4888e1@huawei.com>
Date: Sat, 24 Apr 2021 12:46:17 +0800
X-Mailing-List: linux-ext4@vger.kernel.org

On 2021/4/23 23:46, Theodore Ts'o wrote:
> On Fri, Apr 23, 2021 at 10:18:09AM +0800, Zhiqiang Liu wrote:
>> Thanks for your reply.
>> Actually, we have met the problem in ipsan situation.
>> When exec 'fsck -a ', short-term fluctuations or
>> abnormalities may occur on the network. Despite the driver has
>> do the best effort, some IO errors may occur. So add retrying in
>> e2fsprogs can further improve the reliability of the repair
>> process.
>
> But why doesn't this happen when the file system is mounted, and why
> is that acceptable?  And why not change the driver to do more retries?
>
>      - Ted
>

Actually, this can also happen while the filesystem is mounted. The
difference is that a mounted filesystem can rely on the journal to
ensure consistency.

For example, if an IO error occurs while io_channel_write_byte() is
being called to update the superblock, the checksum may not be written
to the disk successfully. A checksum error will then occur, and the
filesystem cannot be repaired with 'fsck -y|a|f'.

The probability of this situation is very low. Still, to improve the
reliability of the repair process, retries in e2fsprogs may be
necessary.

Regards
Zhiqiang Liu.
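
For illustration only, the kind of bounded retry discussed in this
thread could look roughly like the sketch below. It is not the posted
patch and not e2fsprogs code: write_full_retry(), RETRY_DELAY_NS and
the choice of which errno values count as transient (EIO, EINTR,
EAGAIN) are all assumptions made for the example. It only uses the
POSIX calls pwrite() and nanosleep().

```c
#define _POSIX_C_SOURCE 200809L  /* for pwrite() and nanosleep() */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical pause between attempts (assumption: 100 ms). */
#define RETRY_DELAY_NS 100000000L

/*
 * Hypothetical helper, not an e2fsprogs function.
 *
 * Write all of buf at offset off, retrying when pwrite() fails with an
 * errno this sketch treats as transient, or when it writes nothing.
 * At most max_tries consecutive attempts are made without progress.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int write_full_retry(int fd, const void *buf, size_t len,
			    off_t off, int max_tries)
{
	const char *p = buf;
	int tries_left = max_tries;

	assert(max_tries > 0);
	while (len > 0) {
		ssize_t n = pwrite(fd, p, len, off);

		if (n > 0) {
			/* Progress (possibly a short write): advance and
			 * start a fresh retry budget. */
			p += n;
			len -= (size_t)n;
			off += n;
			tries_left = max_tries;
			continue;
		}
		/* Assumption: only these errors are worth retrying. */
		if (n < 0 && errno != EIO && errno != EINTR && errno != EAGAIN)
			return -1;
		if (--tries_left == 0) {
			if (n == 0)
				errno = EIO;	/* no progress, no errno set */
			return -1;
		}
		struct timespec delay = { 0, RETRY_DELAY_NS };
		nanosleep(&delay, NULL);
	}
	return 0;
}
```

A caller would use it as write_full_retry(fd, sb_buf, sb_len, sb_off, 3),
where the buffer, length and offset names are placeholders. Whether
retrying on EIO in userspace is the right layer for this, rather than
in the driver, is exactly the open question in this thread.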