Subject: Re: [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit
To: Yunlong Song, Chao Yu
References: <1517626068-49739-1-git-send-email-yunlong.song@huawei.com> <312d70f3-b1ae-9ced-44cb-fde83de362ff@huawei.com> <3182ade9-4153-9e47-f8a5-5c87371a3900@huawei.com>
From: Chao Yu
Message-ID: <6716d2f9-ee89-0f30-2332-5aee48530a12@huawei.com>
Date: Mon, 5 Feb 2018 19:10:55 +0800
In-Reply-To: <3182ade9-4153-9e47-f8a5-5c87371a3900@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2018/2/5 17:37, Yunlong Song wrote:
>
>> OK, details as I explained before:
>>
>> atomic_commit                      GC
>> - file_write_and_wait_range
>>                                    - move_data_block
>>                                     - f2fs_submit_page_write
>>                                     - f2fs_update_data_blkaddr
>>                                     - set_page_dirty
>> - fsync_node_pages
>>
>> 1. atomic writes data page #1 & update node #1
>> 2. GC data page #2 & update node #2
>> 3. page #1 & node #1 & #2 can be committed into nand flash before page #2 is
>> committed.
>>
>> After a sudden power-cut, database transaction will be inconsistent. So I think
>> it will be better to exclude gc/atomic_write from each other, with a lock
>> instead of flag checking.
>>
>
> I do not understand why this transaction is inconsistent, is it a
> problem that page #2 is not committed into nand flash? Since normal

Yes, node #2 contains the newly updated LBAx of page #2, but if page #2 is not
committed to LBAx, then after recovery page #2's block address in node #2 will
point to LBAx, which contains random data, resulting in a corrupted db file.

> gc also has this problem:
>
> Suppose that there is db file A, f2fs_gc moves data page #1 of db file
> A. But if write checkpoint only commits node page #1 and then a sudden

f2fs ensures that GCed data is persisted during checkpoint, so migrated
page #1 and updated node #1 will both be committed in this checkpoint. Please
check the WB_DATA_TYPE macro to see how we define the data types that cp
guarantees to write back.

> power-cut happens. Data page #1 is not committed to nand flash, but
> node page #1 is committed. Is the db transaction broken and
> inconsistent?
>
> Come back to your example, I think data page 2 of atomic file does not
> belong to this transaction, so even if node page 2 is committed, it is just

If only node #2 is committed, it will be harmful to the db transaction for the
reason I gave above.

Thanks,

> the same problem as what I have listed above (db file A), and it does not
> break this transaction.
>
>> Thanks,
>>
>>>>>>
>>>>>> So how about just using dio_rwsem[WRITE] during atomic committing to exclude
>>>>>> GCing data block of atomic opened file?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>> Signed-off-by: Yunlong Song
>>>>>>> ---
>>>>>>>  fs/f2fs/data.c | 5 ++---
>>>>>>>  fs/f2fs/gc.c   | 6 ++++--
>>>>>>>  2 files changed, 6 insertions(+), 5 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>>>>>> index 7435830..edafcb6 100644
>>>>>>> --- a/fs/f2fs/data.c
>>>>>>> +++ b/fs/f2fs/data.c
>>>>>>> @@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct f2fs_io_info *fio)
>>>>>>>  		return true;
>>>>>>>  	if (S_ISDIR(inode->i_mode))
>>>>>>>  		return true;
>>>>>>> -	if (f2fs_is_atomic_file(inode))
>>>>>>> -		return true;
>>>>>>>  	if (fio) {
>>>>>>>  		if (is_cold_data(fio->page))
>>>>>>>  			return true;
>>>>>>>  		if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
>>>>>>>  			return true;
>>>>>>> -	}
>>>>>>> +	} else if (f2fs_is_atomic_file(inode))
>>>>>>> +		return true;
>>>>>>>  	return false;
>>>>>>>  }
>>>>>>>
>>>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>>>>>> index b9d93fd..84ab3ff 100644
>>>>>>> --- a/fs/f2fs/gc.c
>>>>>>> +++ b/fs/f2fs/gc.c
>>>>>>> @@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t bidx,
>>>>>>>  	if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>>>>>>>  		goto out;
>>>>>>>
>>>>>>> -	if (f2fs_is_atomic_file(inode))
>>>>>>> +	if (f2fs_is_atomic_file(inode) &&
>>>>>>> +			!f2fs_is_commit_atomic_write(inode))
>>>>>>>  		goto out;
>>>>>>>
>>>>>>>  	if (f2fs_is_pinned_file(inode)) {
>>>>>>> @@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
>>>>>>>  	if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>>>>>>>  		goto out;
>>>>>>>
>>>>>>> -	if (f2fs_is_atomic_file(inode))
>>>>>>> +	if (f2fs_is_atomic_file(inode) &&
>>>>>>> +			!f2fs_is_commit_atomic_write(inode))
>>>>>>>  		goto out;
>>>>>>>  	if (f2fs_is_pinned_file(inode)) {
>>>>>>>  		if (gc_type == FG_GC)
>>>>>>>
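The ordering problem discussed in this thread (node #2 reaching flash before the migrated data page #2) can be replayed with a small userspace toy model. This is only a sketch under stated assumptions: `struct flash`, `page2_after_power_cut()`, the LBA numbers and the block values are all hypothetical illustrations, not f2fs code, and the model ignores checkpointing and roll-forward recovery details entirely.

```c
#include <stdbool.h>

#define NR_BLOCKS 4
#define GARBAGE 0xDEADu /* stands in for whatever an unwritten LBA holds */

/* Hypothetical toy model: only what has actually reached flash. */
struct flash {
	unsigned int block[NR_BLOCKS]; /* data blocks, indexed by LBA */
	unsigned int node[2];          /* node #1 and node #2: LBA of their data page */
};

/*
 * Replays the three steps from the thread and returns what recovery
 * would read for page #2. If gc_interleaves is false, GC is kept out
 * for the whole commit (the effect an exclusive lock would have).
 */
unsigned int page2_after_power_cut(bool gc_interleaves)
{
	struct flash f = {
		/* page #1 lives at LBA 0, page #2 at LBA 1 */
		.block = { 100, 200, GARBAGE, GARBAGE },
		.node = { 0, 1 },
	};
	unsigned int mem_node[2] = { 0, 1 }; /* in-memory (dirty) node entries */

	/* 1. atomic commit writes data page #1 out of place, updates node #1 */
	f.block[2] = 101;
	mem_node[0] = 2;

	/*
	 * 2. GC picks LBA 3 for page #2 and updates node #2 in memory;
	 *    the data write to LBA 3 is still pending, not on flash.
	 */
	if (gc_interleaves)
		mem_node[1] = 3;

	/* 3. atomic commit syncs every dirty node, node #2 included */
	f.node[0] = mem_node[0];
	f.node[1] = mem_node[1];

	/* power cut here: the pending write of page #2 to LBA 3 is lost */

	/* recovery follows the persisted node #2 to find page #2 */
	return f.block[f.node[1]];
}
```

In this model, calling `page2_after_power_cut(true)` makes recovery follow node #2 to an LBA that was never written, while `page2_after_power_cut(false)` leaves node #2 pointing at the old, valid copy of page #2. That is the reason for preferring exclusion by a lock over a flag check.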