Subject: Re: [f2fs-dev] [PATCH] Revert "f2fs: give a warning only for readonly partition"
To: Jaegeuk Kim
References: <20210323064155.12582-1-yuchao0@huawei.com>
 <107e671d-68ea-1a74-521e-ab2b6fe36416@huawei.com>
 <8b0b0782-a667-9edc-5ee9-98ac9f67b7b7@huawei.com>
 <84688aac-75da-1226-df4d-47ac97087c51@huawei.com>
From: Chao Yu
Message-ID: <4b64099b-064d-43a8-461d-b54007f2c16c@huawei.com>
Date: Sat, 27 Mar 2021 09:52:28 +0800
X-Mailing-List:
linux-kernel@vger.kernel.org

On 2021/3/27 1:30, Jaegeuk Kim wrote:
> On 03/26, Chao Yu wrote:
>> On 2021/3/26 9:19, Jaegeuk Kim wrote:
>>> On 03/26, Chao Yu wrote:
>>>> On 2021/3/25 9:59, Chao Yu wrote:
>>>>> On 2021/3/25 6:44, Jaegeuk Kim wrote:
>>>>>> On 03/24, Chao Yu wrote:
>>>>>>> On 2021/3/24 12:22, Jaegeuk Kim wrote:
>>>>>>>> On 03/24, Chao Yu wrote:
>>>>>>>>> On 2021/3/24 2:39, Jaegeuk Kim wrote:
>>>>>>>>>> On 03/23, Chao Yu wrote:
>>>>>>>>>>> This reverts commit 938a184265d75ea474f1c6fe1da96a5196163789.
>>>>>>>>>>>
>>>>>>>>>>> Because that commit fails generic/050 testcase which expect failure
>>>>>>>>>>> during mount a recoverable readonly partition.
>>>>>>>>>>
>>>>>>>>>> I think we need to change generic/050, since f2fs can recover this partition,
>>>>>>>>>
>>>>>>>>> Well, not sure we can change that testcase, since it restricts all generic
>>>>>>>>> filesystems behavior. At least, ext4's behavior makes sense to me:
>>>>>>>>>
>>>>>>>>> 	journal_dev_ro = bdev_read_only(journal->j_dev);
>>>>>>>>> 	really_read_only = bdev_read_only(sb->s_bdev) | journal_dev_ro;
>>>>>>>>>
>>>>>>>>> 	if (journal_dev_ro && !sb_rdonly(sb)) {
>>>>>>>>> 		ext4_msg(sb, KERN_ERR,
>>>>>>>>> 			 "journal device read-only, try mounting with '-o ro'");
>>>>>>>>> 		err = -EROFS;
>>>>>>>>> 		goto err_out;
>>>>>>>>> 	}
>>>>>>>>>
>>>>>>>>> 	if (ext4_has_feature_journal_needs_recovery(sb)) {
>>>>>>>>> 		if (sb_rdonly(sb)) {
>>>>>>>>> 			ext4_msg(sb, KERN_INFO, "INFO: recovery "
>>>>>>>>> 					"required on readonly filesystem");
>>>>>>>>> 			if (really_read_only) {
>>>>>>>>> 				ext4_msg(sb, KERN_ERR, "write access "
>>>>>>>>> 					"unavailable, cannot proceed "
>>>>>>>>> 					"(try mounting with noload)");
>>>>>>>>> 				err = -EROFS;
>>>>>>>>> 				goto err_out;
>>>>>>>>> 			}
>>>>>>>>> 			ext4_msg(sb, KERN_INFO, "write access will "
>>>>>>>>> 					"be enabled during recovery");
>>>>>>>>> 		}
>>>>>>>>> 	}
>>>>>>>>>
>>>>>>>>>> even though using it as readonly. And, valid checkpoint can allow for user to
>>>>>>>>>> read all the data without problem.
>>>>>>>>>
>>>>>>>>>>> 		if (f2fs_hw_is_readonly(sbi)) {
>>>>>>>>>
>>>>>>>>> Since device is readonly now, all write to the device will fail, checkpoint can
>>>>>>>>> not persist recovered data, after page cache is expired, user can see stale data.
>>>>>>>>
>>>>>>>> My point is, after mount with ro, there'll be no data write which preserves the
>>>>>>>> current status. So, in the next time, we can recover fsync'ed data later, if
>>>>>>>> user succeeds to mount as rw. Another point is, with the current checkpoint, we
>>>>>>>> should not have any corrupted metadata. So, why not giving a chance to show what
>>>>>>>> data remained to user? I think this can be doable only with CoW filesystems.
>>>>>>>
>>>>>>> I guess we're talking about the different things...
>>>>>>>
>>>>>>> Let me declare two different readonly status:
>>>>>>>
>>>>>>> 1. filesystem readonly: file system is mount with ro mount option, and
>>>>>>> app from userspace can not modify any thing of filesystem, but filesystem
>>>>>>> itself can modify data on device since device may be writable.
>>>>>>>
>>>>>>> 2. device readonly: device is set to readonly status via 'blockdev --setro'
>>>>>>> command, and then filesystem should never issue any write IO to the device.
>>>>>>>
>>>>>>> So, what I mean is, *when device is readonly*, rather than f2fs mountpoint
>>>>>>> is readonly (f2fs_hw_is_readonly() returns true as below code, instead of
>>>>>>> f2fs_readonly() returns true), in this condition, we should not issue any
>>>>>>> write IO to device anyway, because, AFAIK, write IO will fail due to
>>>>>>> bio_check_ro() check.
>>>>>>
>>>>>> In that case, mount(2) will try readonly, no?
>>>>>
>>>>> Yes, if device is readonly, mount (2) can not mount/remount device to rw
>>>>> mountpoint.
>>>>
>>>> Any other concern about this patch?
>>>
>>> Indeed we're talking about different things. :)
>>>
>>> This case is mount(ro) with device(ro) having some data to recover.
>>> My point is why not giving a chance to mount(ro) to show the current data
>>> covered by a valid checkpoint. This doesn't change anything in the disk,
>> Got your idea.
>>
>> IMO, it has potential issue in above condition:
>>
>>>>>>>>> Since device is readonly now, all write to the device will fail, checkpoint can
>>>>>>>>> not persist recovered data, after page cache is expired, user can see stale data.
>>
>> e.g.
>>
>> Recovery writes one inode and then triggers a checkpoint, all writes fail
>
> I'm confused. Currently we don't trigger the roll-forward recovery.

Oh, my mistake, sorry. :-P

My point is that in this condition we can return an error and tell the
user to mount with the disable_roll_forward or norecovery option; then at
least the user knows not to expect the last fsynced data in the newly
mounted image.

Or we can use f2fs_recover_fsync_data() to check whether there is any
fsynced data and, if there is none, let mount() succeed.

Thanks,

>
>> due to device is readonly, once inode cache is reclaimed by vm, user will see
>> old inode when reloading it, or even see corrupted fs if partial meta inode's
>> cache is expired.
>>
>> Thoughts?
>>
>> Thanks,
>>
>>> and in the next time, it allows mount(rw|ro) with device(rw) to recover
>>> the data seamlessly.
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> # blockdev --setro /dev/vdb
>>>>>> # mount -t f2fs /dev/vdb /mnt/test/
>>>>>> mount: /mnt/test: WARNING: source write-protected, mounted read-only.
>>>>>>
>>>>>>>
>>>>>>>  		if (f2fs_hw_is_readonly(sbi)) {
>>>>>>> -			if (!is_set_ckpt_flags(sbi, CP_UMOUNT_FLAG)) {
>>>>>>> -				err = -EROFS;
>>>>>>> +			if (!is_set_ckpt_flags(sbi, CP_UMOUNT_FLAG))
>>>>>>>  				f2fs_err(sbi, "Need to recover fsync data, but write access unavailable");
>>>>>>> -				goto free_meta;
>>>>>>> -			}
>>>>>>> -			f2fs_info(sbi, "write access unavailable, skipping recovery");
>>>>>>> +			else
>>>>>>> +				f2fs_info(sbi, "write access unavailable, skipping recovery");
>>>>>>>  			goto reset_checkpoint;
>>>>>>>  		}
>>>>>>>
>>>>>>> For the case of filesystem is readonly and device is writable, it's fine
>>>>>>> to do recovery in order to let user to see fsynced data.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Am I missing something?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Fixes: 938a184265d7 ("f2fs: give a warning only for readonly partition")
>>>>>>>>>>> Signed-off-by: Chao Yu
>>>>>>>>>>> ---
>>>>>>>>>>>  fs/f2fs/super.c | 8 +++++---
>>>>>>>>>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>>>>>>>>> index b48281642e98..2b78ee11f093 100644
>>>>>>>>>>> --- a/fs/f2fs/super.c
>>>>>>>>>>> +++ b/fs/f2fs/super.c
>>>>>>>>>>> @@ -3952,10 +3952,12 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>>>>>>>>>  		 * previous checkpoint was not done by clean system shutdown.
>>>>>>>>>>>  		 */
>>>>>>>>>>>  		if (f2fs_hw_is_readonly(sbi)) {
>>>>>>>>>>> -			if (!is_set_ckpt_flags(sbi, CP_UMOUNT_FLAG))
>>>>>>>>>>> +			if (!is_set_ckpt_flags(sbi, CP_UMOUNT_FLAG)) {
>>>>>>>>>>> +				err = -EROFS;
>>>>>>>>>>>  				f2fs_err(sbi, "Need to recover fsync data, but write access unavailable");
>>>>>>>>>>> -			else
>>>>>>>>>>> -				f2fs_info(sbi, "write access unavailable, skipping recovery");
>>>>>>>>>>> +				goto free_meta;
>>>>>>>>>>> +			}
>>>>>>>>>>> +			f2fs_info(sbi, "write access unavailable, skipping recovery");
>>>>>>>>>>>  			goto reset_checkpoint;
>>>>>>>>>>>  		}
>>>>>>>>>>> --
>>>>>>>>>>> 2.29.2
>>>>>>>>>> .
>>>>>>>>>>
>>>>>>>> .
>>>>>>>>
>>>>>> .
>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Linux-f2fs-devel mailing list
>>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>> .
>>>>>
>>> .
>>>
> .
>
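
The two options raised in the reply (fail the mount with -EROFS and point
the user at disable_roll_forward/norecovery, or let the mount succeed when
there is no fsynced data to recover) can be sketched as one small decision
function. This is only a userspace sketch under stated assumptions, not f2fs
code: struct mount_state, enum mount_action and decide_mount() are
hypothetical names, and the boolean fields stand in for what
f2fs_hw_is_readonly(), the CP_UMOUNT_FLAG checkpoint flag and a check-only
pass of the recovery scan would report.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical snapshot of what mount would know before recovery. */
struct mount_state {
	bool dev_readonly;   /* device write-protected, e.g. blockdev --setro */
	bool clean_umount;   /* last checkpoint came from a clean unmount */
	bool has_fsync_data; /* fsynced data exists beyond that checkpoint */
	bool norecovery;     /* disable_roll_forward or norecovery was given */
};

enum mount_action {
	MOUNT_SKIP_RECOVERY, /* mount from the last valid checkpoint only */
	MOUNT_RECOVER,       /* roll forward fsynced data, then mount */
};

/*
 * Returns 0 and stores the chosen action in *action, or returns -EROFS
 * when recovery would be needed but the device cannot be written.
 */
static int decide_mount(const struct mount_state *st, enum mount_action *action)
{
	bool need_recovery;

	assert(st && action);

	/* Recovery is only needed if the user did not opt out, the last
	 * shutdown was unclean, and there is actually something to replay. */
	need_recovery = !st->norecovery && !st->clean_umount &&
			st->has_fsync_data;

	if (!need_recovery) {
		*action = MOUNT_SKIP_RECOVERY;
		return 0;
	}

	/* No write IO may be issued to a read-only device, so refuse the
	 * mount; the caller would suggest norecovery at this point. */
	if (st->dev_readonly)
		return -EROFS;

	*action = MOUNT_RECOVER;
	return 0;
}
```

In this model a read-only device with pending fsynced data fails with
-EROFS, the same state with norecovery set mounts without recovery, and a
writable device rolls forward as usual.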