Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEFF7C636CC for ; Tue, 31 Jan 2023 21:57:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230221AbjAaV5Z (ORCPT ); Tue, 31 Jan 2023 16:57:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34438 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229907AbjAaV5X (ORCPT ); Tue, 31 Jan 2023 16:57:23 -0500 Received: from mail-yb1-xb34.google.com (mail-yb1-xb34.google.com [IPv6:2607:f8b0:4864:20::b34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3F3A147401 for ; Tue, 31 Jan 2023 13:57:22 -0800 (PST) Received: by mail-yb1-xb34.google.com with SMTP id o187so3494845ybg.3 for ; Tue, 31 Jan 2023 13:57:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=gmJtAREyo8ivJ4Kqeq2bHJyT3OX19Q3/5Agieb/eCv4=; b=FzV5ad/w9oI7JUyKjIOehmdS6FJzcwhMWrQLXXTbMOvALvL5lkZBmWdBxTPlEf5qX2 bs2nbOtWaXdg+pvv24rRWzsueuY43fa+LaIY2eADp0ic+7sgntGa/OA6A/X+oxleng59 In0F6AXJyJPvHlbeDvk8MVSXfmfJqv5xyg8JyJh9RyXI253ZGWCLnygVvvqo0+98saNK r0okEEu0L+gI81aeeFUYzHUajHLaxA7DKZN/7GLfyLYyxl66vClaT/MYDT6Rfbl88nYJ /ryyp/XgSaf0CNXtWFbxC1iJYqWq7/+wNNjzZPWThW7k67WRRxTJCMBNVdC/JbV+nwei jKLw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=gmJtAREyo8ivJ4Kqeq2bHJyT3OX19Q3/5Agieb/eCv4=; b=I4m+9V7kgAjWK68A/vOkutXBSB+I6sst9/AeCNITNk7rw444gObJSTfUKyYYXmXuBu aavG7XA+znTMoi8O9Bl83wA0EW+cFzNaRwK5SQaoXPERhooYnBu0z5gp9f07LOZ9WvkU CBaMWW8Xa+NhrvmPpTDnqRDlsuxK0tnacSrNawJc1Jx4+hmYONd0ktXEOMML0ni7JEbI Awqxa463QwOna5lPyEZZb8tBB1amhfcSSY7dM+kZz18RhWg8W0HAbw9zaISuz//76PD+ a5BAf5CJW5d/8osxYWdiZwwRKELirBMdbwUWIDk9fbPisLG8ak/O4GsylsnK+hDgI+XM Zp/Q== X-Gm-Message-State: AO0yUKUfj/WtNA+IfoxdcvwU2txcANYM530RqhyfPHjrC/DqtcVNNCd9 O6D6EPvK7ZsjTQm8Ny+tCMN8f1w36EhCJeGtagQ= X-Google-Smtp-Source: AK7set+Ot0MQgVfuUTCbU8X1TkUfaJLbC3fNPaSKAiTBe4drEQA2+P0s8G/TDe4betMcz+jtgywV1axIxu/GtJ9GJw4= X-Received: by 2002:a25:7412:0:b0:802:f667:b222 with SMTP id p18-20020a257412000000b00802f667b222mr66794ybc.100.1675202241394; Tue, 31 Jan 2023 13:57:21 -0800 (PST) MIME-Version: 1.0 References: <20230113004933.2082072-1-daeho43@gmail.com> <8f1c15a3-d056-7709-af45-fe7cba56463f@kernel.org> In-Reply-To: From: Daeho Jeong Date: Tue, 31 Jan 2023 13:57:09 -0800 Message-ID: Subject: Re: [f2fs-dev] [PATCH] f2fs: synchronize atomic write aborts To: Chao Yu Cc: linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, kernel-team@android.com, Daeho Jeong , syzbot+823000d23b3400619f7c@syzkaller.appspotmail.com Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jan 31, 2023 at 1:38 PM Daeho Jeong wrote: > > On Tue, Jan 31, 2023 at 11:13 AM Daeho Jeong wrote: > > > > Hi Chao, > > > > On Tue, Jan 31, 2023 at 3:37 AM Chao Yu wrote: > > > > > > Hi Daeho, > > > > > > On 2023/1/31 0:34, Daeho Jeong wrote: > > > > Hi Chao, > > > > > > > > I read your patch series now and I like it. > > > > > > Thank you for checking the patches. :) > > > > > > > However, how about a race condition between start_atomic_write and > > > > abort_atomic_write? > > > > > > Yup, I noticed that issue, I guess we can avoid this race condition by > > > covering these two flows w/ i_atomic_sem. > > > > > > > abort_atomic_write is called without inode_lock in closing filp scenarios. > > > > What do you think about this? > > > > > > I'm fine w/ your change as it's more clean, but it's better to drop cow_inode's > > > page cache if atomic_write is committed or aborted to avoid caching obsolete page? > > > > It's better to put that part in f2fs_abort_atomic_write(). > > On top of that, maybe, we should move > > f2fs_do_truncate_blocks(fi->cow_inode, 0, true) part from > > f2fs_ioc_start_atomic_write() to f2fs_abort_atomic_write(), too. > > Oh, we shouldn't touch the f2fs_do_truncate_blocks() part, since there > might be some left writeback after aborting atomic write. > Plz. review it related to the timing of calling truncate_inode_pages_final(). Looks like the scenario becomes too complicated if I think about more than one writer's scenario. How about we check writecount in commit_atomic_write ioctl and return EBUSY when it's not only one writer? In that case, we can make the scenario simple and effective, and we can release all the resources in abort_atomic_write(). > > > > > Thanks, > > > > > > > > Thanks, > > > > > > > > > > > Thanks, > > > > > > > > > > > > On Fri, Jan 27, 2023 at 6:07 PM Chao Yu wrote: > > > >> > > > >> Hi Daeho, Jaegeuk, > > > >> > > > >> Please take a look at patchset in below link: > > > >> > > > >> https://lore.kernel.org/linux-f2fs-devel/20230109034453.490176-1-chao@kernel.org/T/#t > > > >> > > > >> In PATCH 4/5, I'm trying to fix the same issue w/ alternative way, let me > > > >> know your preference. :) > > > >> > > > >> One comment as below. > > > >> > > > >> On 2023/1/13 8:49, Daeho Jeong wrote: > > > >>> From: Daeho Jeong > > > >>> > > > >>> To fix a race condition between atomic write aborts, I use the inode > > > >>> lock and make COW inode to be re-usable thoroughout the whole > > > >>> atomic file inode lifetime. > > > >>> > > > >>> Reported-by: syzbot+823000d23b3400619f7c@syzkaller.appspotmail.com > > > >>> Fixes: 3db1de0e582c ("f2fs: change the current atomic write way") > > > >>> Signed-off-by: Daeho Jeong > > > >>> --- > > > >>> fs/f2fs/file.c | 43 ++++++++++++++++++++++++++++--------------- > > > >>> fs/f2fs/inode.c | 11 +++++++++-- > > > >>> fs/f2fs/segment.c | 3 --- > > > >>> fs/f2fs/super.c | 2 -- > > > >>> 4 files changed, 37 insertions(+), 22 deletions(-) > > > >>> > > > >>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c > > > >>> index ecbc8c135b49..ff072a9ed258 100644 > > > >>> --- a/fs/f2fs/file.c > > > >>> +++ b/fs/f2fs/file.c > > > >>> @@ -1866,7 +1866,10 @@ static int f2fs_release_file(struct inode *inode, struct file *filp) > > > >>> atomic_read(&inode->i_writecount) != 1) > > > >>> return 0; > > > >>> > > > >>> + inode_lock(inode); > > > >>> f2fs_abort_atomic_write(inode, true); > > > >>> + inode_unlock(inode); > > > >>> + > > > >>> return 0; > > > >>> } > > > >>> > > > >>> @@ -1880,8 +1883,11 @@ static int f2fs_file_flush(struct file *file, fl_owner_t id) > > > >>> * until all the writers close its file. Since this should be done > > > >>> * before dropping file lock, it needs to do in ->flush. > > > >>> */ > > > >>> - if (F2FS_I(inode)->atomic_write_task == current) > > > >>> + if (F2FS_I(inode)->atomic_write_task == current) { > > > >>> + inode_lock(inode); > > > >>> f2fs_abort_atomic_write(inode, true); > > > >>> + inode_unlock(inode); > > > >>> + } > > > >>> return 0; > > > >>> } > > > >>> > > > >>> @@ -2087,19 +2093,28 @@ static int f2fs_ioc_start_atomic_write(struct file *filp, bool truncate) > > > >>> goto out; > > > >>> } > > > >>> > > > >>> - /* Create a COW inode for atomic write */ > > > >>> - pinode = f2fs_iget(inode->i_sb, fi->i_pino); > > > >>> - if (IS_ERR(pinode)) { > > > >>> - f2fs_up_write(&fi->i_gc_rwsem[WRITE]); > > > >>> - ret = PTR_ERR(pinode); > > > >>> - goto out; > > > >>> - } > > > >>> + /* Check if the inode already has a COW inode */ > > > >>> + if (fi->cow_inode == NULL) { > > > >>> + /* Create a COW inode for atomic write */ > > > >>> + pinode = f2fs_iget(inode->i_sb, fi->i_pino); > > > >>> + if (IS_ERR(pinode)) { > > > >>> + f2fs_up_write(&fi->i_gc_rwsem[WRITE]); > > > >>> + ret = PTR_ERR(pinode); > > > >>> + goto out; > > > >>> + } > > > >>> > > > >>> - ret = f2fs_get_tmpfile(mnt_userns, pinode, &fi->cow_inode); > > > >>> - iput(pinode); > > > >>> - if (ret) { > > > >>> - f2fs_up_write(&fi->i_gc_rwsem[WRITE]); > > > >>> - goto out; > > > >>> + ret = f2fs_get_tmpfile(mnt_userns, pinode, &fi->cow_inode); > > > >>> + iput(pinode); > > > >>> + if (ret) { > > > >>> + f2fs_up_write(&fi->i_gc_rwsem[WRITE]); > > > >>> + goto out; > > > >>> + } > > > >>> + > > > >>> + set_inode_flag(fi->cow_inode, FI_COW_FILE); > > > >>> + clear_inode_flag(fi->cow_inode, FI_INLINE_DATA); > > > >>> + } else { > > > >>> + /* Reuse the already created COW inode */ > > > >>> + f2fs_do_truncate_blocks(fi->cow_inode, 0, true); > > > >>> } > > > >>> > > > >>> f2fs_write_inode(inode, NULL); > > > >>> @@ -2107,8 +2122,6 @@ static int f2fs_ioc_start_atomic_write(struct file *filp, bool truncate) > > > >>> stat_inc_atomic_inode(inode); > > > >>> > > > >>> set_inode_flag(inode, FI_ATOMIC_FILE); > > > >>> - set_inode_flag(fi->cow_inode, FI_COW_FILE); > > > >>> - clear_inode_flag(fi->cow_inode, FI_INLINE_DATA); > > > >>> > > > >>> isize = i_size_read(inode); > > > >>> fi->original_i_size = isize; > > > >>> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c > > > >>> index ff6cf66ed46b..4921f7209e28 100644 > > > >>> --- a/fs/f2fs/inode.c > > > >>> +++ b/fs/f2fs/inode.c > > > >>> @@ -766,11 +766,18 @@ int f2fs_write_inode(struct inode *inode, struct writeback_control *wbc) > > > >>> void f2fs_evict_inode(struct inode *inode) > > > >>> { > > > >>> struct f2fs_sb_info *sbi = F2FS_I_SB(inode); > > > >>> - nid_t xnid = F2FS_I(inode)->i_xattr_nid; > > > >>> + struct f2fs_inode_info *fi = F2FS_I(inode); > > > >>> + nid_t xnid = fi->i_xattr_nid; > > > >>> int err = 0; > > > >>> > > > >>> f2fs_abort_atomic_write(inode, true); > > > >>> > > > >>> + if (fi->cow_inode) { > > > >>> + clear_inode_flag(fi->cow_inode, FI_COW_FILE); > > > >>> + iput(fi->cow_inode); > > > >>> + fi->cow_inode = NULL; > > > >>> + } > > > >>> + > > > >>> trace_f2fs_evict_inode(inode); > > > >>> truncate_inode_pages_final(&inode->i_data); > > > >>> > > > >>> @@ -857,7 +864,7 @@ void f2fs_evict_inode(struct inode *inode) > > > >>> stat_dec_inline_inode(inode); > > > >>> stat_dec_compr_inode(inode); > > > >>> stat_sub_compr_blocks(inode, > > > >>> - atomic_read(&F2FS_I(inode)->i_compr_blocks)); > > > >>> + atomic_read(&fi->i_compr_blocks)); > > > >>> > > > >>> if (likely(!f2fs_cp_error(sbi) && > > > >>> !is_sbi_flag_set(sbi, SBI_CP_DISABLED))) > > > >>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > > >>> index ae3c4e5474ef..536d7c674b04 100644 > > > >>> --- a/fs/f2fs/segment.c > > > >>> +++ b/fs/f2fs/segment.c > > > >>> @@ -192,9 +192,6 @@ void f2fs_abort_atomic_write(struct inode *inode, bool clean) > > > >>> if (!f2fs_is_atomic_file(inode)) > > > >>> return; > > > >>> > > > >>> - clear_inode_flag(fi->cow_inode, FI_COW_FILE); > > > >>> - iput(fi->cow_inode); > > > >>> - fi->cow_inode = NULL; > > > >>> release_atomic_write_cnt(inode); > > > >>> clear_inode_flag(inode, FI_ATOMIC_COMMITTED); > > > >>> clear_inode_flag(inode, FI_ATOMIC_REPLACE); > > > >>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c > > > >>> index 1f812b9ce985..10463f084d30 100644 > > > >>> --- a/fs/f2fs/super.c > > > >>> +++ b/fs/f2fs/super.c > > > >>> @@ -1430,8 +1430,6 @@ static int f2fs_drop_inode(struct inode *inode) > > > >>> atomic_inc(&inode->i_count); > > > >>> spin_unlock(&inode->i_lock); > > > >>> > > > >>> - f2fs_abort_atomic_write(inode, true); > > > >> > > > >> In order to avoid caching obsolete page of cow_inode, how about truncating > > > >> them here? > > > >> > > > >> if (f2fs_is_atomic_file() && cow_inode) > > > >> truncate_inode_pages_final(&cow_inode->i_data); > > > >> > > > >> Thanks, > > > >> > > > >>> - > > > >>> /* should remain fi->extent_tree for writepage */ > > > >>> f2fs_destroy_extent_node(inode); > > > >>>