Subject: Re: [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages
To: Chao Yu
From: Yunlong Song
Date: Fri, 9 Feb 2018 21:29:48 +0800
Message-ID: <33b262b8-7701-a150-09e5-9ae1b4733835@huawei.com>
References: <1517626068-49739-1-git-send-email-yunlong.song@huawei.com>
 <1517888990-96478-1-git-send-email-yunlong.song@huawei.com>
 <4493cbf2-6f37-6c04-a012-4b2516b3b4e7@kernel.org>
 <134848f9-2dd6-0efa-3ccf-3c29eeaf5534@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Back to the problem: if we skip out, then f2fs_gc will go into a dead
loop if the apps only start an atomic write but never commit it. The
main aim of my two patches is to remove the skip action to avoid the
dead loop.

On 2018/2/9 21:26, Chao Yu wrote:
> On 2018/2/9 20:56, Yunlong Song wrote:
>> As what I point in last mail, if the atomic file is not committed
>> yet, gc_data_segment will register_inmem_page the GCed data pages.
>
> We will skip GCing that page as below check:
>
> - move_data_{page,block}
>  - f2fs_is_atomic_file()
>    skip out;
>
> No?
>
> Thanks,
>
>> This will cause these data pages written twice, the first write
>> happens in move_data_page->do_write_data_page, and the second
>> write happens in later __commit_inmem_pages->do_write_data_page.
>>
>> On 2018/2/9 20:44, Chao Yu wrote:
>>> On 2018/2/8 11:11, Yunlong Song wrote:
>>>> Then the GCed data pages are totally mixed with the inmem atomic pages,
>>>
>>> If we add dio_rwsem, GC flow is exclude with atomic write flow. There
>>> will be not race case to mix GCed page into atomic pages.
>>>
>>> Or you mean:
>>>
>>> - gc_data_segment
>>>  - move_data_page
>>>   - f2fs_is_atomic_file
>>>                                - f2fs_ioc_start_atomic_write
>>>                                 - set_inode_flag(inode, FI_ATOMIC_FILE);
>>>   - f2fs_set_data_page_dirty
>>>    - register_inmem_page
>>>
>>> In this case, GCed page can be mixed into database transaction, but could
>>> it cause any problem except break rule of isolation for transaction.
>>>
>>>> this will cause the atomic commit ops write the GCed data pages twice
>>>> (the first write happens in GC).
>>>>
>>>> How about using the early two patches to separate the inmem data pages
>>>> and GCed data pages, and use dio_rwsem instead of this patch to fix the
>>>> dnode page problem (dnode page committed but data page are not committed
>>>> for the GCed page)?
>>>
>>> Could we fix the race case first, based on that fixing, and then find the
>>> place that we can improve?
>>> >>>> >>>> >>>> On 2018/2/7 20:16, Chao Yu wrote: >>>>> On 2018/2/6 11:49, Yunlong Song wrote: >>>>>> This patch adds fi->commit_lock to avoid the case that GCed node pages >>>>>> are committed but GCed data pages are not committed. This can avoid the >>>>>> db file run into inconsistent state when sudden-power-off happens if >>>>>> data pages of atomic file is allowed to be GCed before. >>>>> >>>>> do_fsync: GC: >>>>> - mutex_lock(&fi->commit_lock); >>>>> - lock_page() >>>>> - mutex_lock(&fi->commit_lock); >>>>> - lock_page() >>>>> >>>>> >>>>> Well, please consider lock dependency & code complexity, IMO, reuse >>>>> fi->dio_rwsem[WRITE] will be enough as below: >>>>> >>>>> --- >>>>> fs/f2fs/file.c | 3 +++ >>>>> fs/f2fs/gc.c | 5 ----- >>>>> 2 files changed, 3 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c >>>>> index 672a542e5464..1bdc11feb8d0 100644 >>>>> --- a/fs/f2fs/file.c >>>>> +++ b/fs/f2fs/file.c >>>>> @@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp) >>>>> >>>>> inode_lock(inode); >>>>> >>>>> + down_write(&F2FS_I(inode)->dio_rwsem[WRITE]); >>>>> + >>>>> if (f2fs_is_volatile_file(inode)) >>>>> goto err_out; >>>>> >>>>> @@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp) >>>>> ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false); >>>>> } >>>>> err_out: >>>>> + up_write(&F2FS_I(inode)->dio_rwsem[WRITE]); >>>>> inode_unlock(inode); >>>>> mnt_drop_write_file(filp); >>>>> return ret; >>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c >>>>> index b9d93fd532a9..e49416283563 100644 >>>>> --- a/fs/f2fs/gc.c >>>>> +++ b/fs/f2fs/gc.c >>>>> @@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t bidx, >>>>> if (!check_valid_map(F2FS_I_SB(inode), segno, off)) >>>>> goto out; >>>>> >>>>> - if (f2fs_is_atomic_file(inode)) >>>>> - goto out; >>> >>> Seems that we need this check. 
>>> >>>>> - >>>>> if (f2fs_is_pinned_file(inode)) { >>>>> f2fs_pin_file_control(inode, true); >>>>> goto out; >>>>> @@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type, >>>>> if (!check_valid_map(F2FS_I_SB(inode), segno, off)) >>>>> goto out; >>>>> >>>>> - if (f2fs_is_atomic_file(inode)) >>>>> - goto out; >>> >>> Ditto. >>> >>> Thanks, >>> >>>>> if (f2fs_is_pinned_file(inode)) { >>>>> if (gc_type == FG_GC) >>>>> f2fs_pin_file_control(inode, true); >>>>> >>>> >>> >>> . >>> >> > > . > -- Thanks, Yunlong Song