Subject: Re: [f2fs-dev] [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages
To: Yunlong Song, jaegeuk@kernel.org, yuchao0@huawei.com, yunlong.song@icloud.com
Cc: linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, heyunlei@huawei.com, miaoxie@huawei.com
References: <1517626068-49739-1-git-send-email-yunlong.song@huawei.com>
 <1517888990-96478-1-git-send-email-yunlong.song@huawei.com>
 <4493cbf2-6f37-6c04-a012-4b2516b3b4e7@kernel.org>
 <134848f9-2dd6-0efa-3ccf-3c29eeaf5534@huawei.com>
 <33b262b8-7701-a150-09e5-9ae1b4733835@huawei.com>
From: Chao Yu
Message-ID: <0b3d2792-b3cf-3302-8313-404a25c86854@kernel.org>
Date: Fri, 9 Feb 2018 21:38:33 +0800
In-Reply-To: <33b262b8-7701-a150-09e5-9ae1b4733835@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2018/2/9 21:29, Yunlong Song wrote:
> Back to the problem, if we skip out, then the f2fs_gc will go
> into dead loop if the apps only atomic start but never atomic

That's another issue, for which I have suggested setting a threshold time to
release atomic/volatile pages by balance_fs_bg.

Thanks,

> commit. The main aim of my two patches is to remove the skip
> action to avoid the dead loop.
>
> On 2018/2/9 21:26, Chao Yu wrote:
>> On 2018/2/9 20:56, Yunlong Song wrote:
>>> As I pointed out in the last mail, if the atomic file is not committed
>>> yet, gc_data_segment will register_inmem_page the GCed data pages.
>>
>> We will skip GCing that page because of the check below:
>>
>> - move_data_{page,block}
>>   - f2fs_is_atomic_file()
>>     skip out;
>>
>> No?
>>
>> Thanks,
>>
>>> This will cause these data pages to be written twice: the first write
>>> happens in move_data_page->do_write_data_page, and the second
>>> write happens later in __commit_inmem_pages->do_write_data_page.
>>>
>>> On 2018/2/9 20:44, Chao Yu wrote:
>>>> On 2018/2/8 11:11, Yunlong Song wrote:
>>>>> Then the GCed data pages are totally mixed with the inmem atomic pages,
>>>>
>>>> If we add dio_rwsem, the GC flow is exclusive with the atomic write flow.
>>>> There will be no race case that mixes GCed pages into atomic pages.
>>>>
>>>> Or you mean:
>>>>
>>>>                      - gc_data_segment
>>>>                       - move_data_page
>>>>                        - f2fs_is_atomic_file
>>>> - f2fs_ioc_start_atomic_write
>>>>    - set_inode_flag(inode, FI_ATOMIC_FILE);
>>>>                        - f2fs_set_data_page_dirty
>>>>                         - register_inmem_page
>>>>
>>>> In this case, a GCed page can be mixed into the database transaction, but
>>>> could it cause any problem other than breaking the isolation rule for
>>>> transactions?
>>>>
>>>>> this will cause the atomic commit ops to write the GCed data pages twice
>>>>> (the first write happens in GC).
>>>>>
>>>>> How about using the early two patches to separate the inmem data pages
>>>>> and GCed data pages, and use dio_rwsem instead of this patch to fix the
>>>>> dnode page problem (dnode page committed but data pages are not committed
>>>>> for the GCed page)?
>>>>
>>>> Could we fix the race case first and then, based on that fix, find the
>>>> places that we can improve?
>>>>
>>>>>
>>>>>
>>>>> On 2018/2/7 20:16, Chao Yu wrote:
>>>>>> On 2018/2/6 11:49, Yunlong Song wrote:
>>>>>>> This patch adds fi->commit_lock to avoid the case that GCed node pages
>>>>>>> are committed but GCed data pages are not committed. This can keep the
>>>>>>> db file from running into an inconsistent state when sudden-power-off
>>>>>>> happens if data pages of an atomic file are allowed to be GCed before.
>>>>>>
>>>>>> do_fsync:                GC:
>>>>>> - mutex_lock(&fi->commit_lock);
>>>>>>                       - lock_page()
>>>>>>                        - mutex_lock(&fi->commit_lock);
>>>>>>     - lock_page()
>>>>>>
>>>>>>
>>>>>> Well, please consider lock dependency & code complexity, IMO, reuse
>>>>>> fi->dio_rwsem[WRITE] will be enough as below:
>>>>>>
>>>>>> ---
>>>>>>     fs/f2fs/file.c | 3 +++
>>>>>>     fs/f2fs/gc.c   | 5 -----
>>>>>>     2 files changed, 3 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>>>>>> index 672a542e5464..1bdc11feb8d0 100644
>>>>>> --- a/fs/f2fs/file.c
>>>>>> +++ b/fs/f2fs/file.c
>>>>>> @@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>>>>>>
>>>>>>         inode_lock(inode);
>>>>>>
>>>>>> +    down_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
>>>>>> +
>>>>>>         if (f2fs_is_volatile_file(inode))
>>>>>>             goto err_out;
>>>>>>
>>>>>> @@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>>>>>>             ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
>>>>>>         }
>>>>>>     err_out:
>>>>>> +    up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
>>>>>>         inode_unlock(inode);
>>>>>>         mnt_drop_write_file(filp);
>>>>>>         return ret;
>>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>>>>> index b9d93fd532a9..e49416283563 100644
>>>>>> --- a/fs/f2fs/gc.c
>>>>>> +++ b/fs/f2fs/gc.c
>>>>>> @@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t bidx,
>>>>>>         if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>>>>>>             goto out;
>>>>>>
>>>>>> -    if (f2fs_is_atomic_file(inode))
>>>>>> -        goto out;
>>>>
>>>> Seems that we need this check.
>>>>
>>>>>> -
>>>>>>         if (f2fs_is_pinned_file(inode)) {
>>>>>>             f2fs_pin_file_control(inode, true);
>>>>>>             goto out;
>>>>>> @@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
>>>>>>         if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>>>>>>             goto out;
>>>>>>
>>>>>> -    if (f2fs_is_atomic_file(inode))
>>>>>> -        goto out;
>>>>
>>>> Ditto.
>>>>
>>>> Thanks,
>>>>
>>>>>>         if (f2fs_is_pinned_file(inode)) {
>>>>>>             if (gc_type == FG_GC)
>>>>>>                 f2fs_pin_file_control(inode, true);
>>>>>>
>>>>>
>>>>
>>>> .
>>>>
>>>
>>
>> .
>>
>
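
For readers following the lock-dependency concern in the quoted do_fsync/GC diagram: the two paths take fi->commit_lock and the page lock in opposite orders, which is the classic ABBA pattern. Below is a minimal user-space sketch of how such an inversion could be flagged. It is not f2fs or lockdep code; order_graph, record_path and the two lock IDs are hypothetical names used only for illustration, and the sketch models acquisition order only (it takes no real locks).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for fi->commit_lock and the page lock. */
enum lock_id { COMMIT_LOCK, PAGE_LOCK, NR_LOCKS };

/* seen[a][b] is true once some path has taken lock b while holding lock a. */
struct order_graph {
	bool seen[NR_LOCKS][NR_LOCKS];
};

/*
 * Record the order in which one code path acquires its locks.
 * Returns false if the path takes any pair of locks in the opposite
 * order to a previously recorded path (a potential ABBA deadlock).
 */
bool record_path(struct order_graph *g, const enum lock_id *path, int n)
{
	bool ok = true;

	/* Reject lock IDs outside the table before indexing with them. */
	for (int i = 0; i < n; i++)
		assert((unsigned int)path[i] < NR_LOCKS);

	for (int i = 0; i < n; i++) {
		for (int j = i + 1; j < n; j++) {
			if (g->seen[path[j]][path[i]])
				ok = false;	/* reverse order already recorded */
			g->seen[path[i]][path[j]] = true;
		}
	}
	return ok;
}
```

Starting from a zeroed order_graph, recording the do_fsync order (commit_lock, then page lock) succeeds, and recording the GC order (page lock, then commit_lock) afterwards returns false, which corresponds to the dependency problem pointed out in the diagram.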