Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933132AbbHXJj3 (ORCPT ); Mon, 24 Aug 2015 05:39:29 -0400 Received: from mailout3.samsung.com ([203.254.224.33]:40268 "EHLO mailout3.samsung.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753041AbbHXJj1 convert rfc822-to-8bit (ORCPT ); Mon, 24 Aug 2015 05:39:27 -0400 X-AuditID: cbfee61a-f79a06d000005c6f-1a-55dae64df258 From: Chao Yu To: "'Jaegeuk Kim'" Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net References: <1439887589-35190-1-git-send-email-jaegeuk@kernel.org> <1439887589-35190-5-git-send-email-jaegeuk@kernel.org> <01bb01d0db28$61844ea0$248cebe0$@samsung.com> <20150820153523.GA42028@jaegeuk-mac02.mot-mobility.com> <021501d0dc0f$d365f900$7a31eb00$@samsung.com> In-reply-to: Subject: RE: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid Date: Mon, 24 Aug 2015 17:38:37 +0800 Message-id: <001901d0de50$c3b40810$4b1c1830$@samsung.com> MIME-version: 1.0 Content-type: text/plain; charset=UTF-8 Content-transfer-encoding: 8BIT X-Mailer: Microsoft Outlook 14.0 Thread-index: AQMbK+U27TEaBuzOkuL9EYWgZcyS+gL3ywKSAcOoX9wC3KaX2AGajc7WAan0ryKbLysSoA== Content-language: zh-cn X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFnrJLMWRmVeSWpSXmKPExsVy+t9jAV3fZ7dCDVa2m1g8WT+L2eLSIneL PXtPslhc3jWHzYHFY9OqTjaP3Qs+M3l83iQXwBzFZZOSmpNZllqkb5fAlfH9+USmgjeqFU+v tTE2MDbIdTFyckgImEjMX3aAEcIWk7hwbz1bFyMXh5DAUkaJ/42nmSGcV4wSi+7MA6tiE1CR WN7xnwnEFhFQk+jdNwXMZhbIlJjzejIrRMNJJom25XvZQBKcAnYS3/c0sHcxcnAIC8RLvL2g BBJmEVCVWPq3C2wmr4ClxJkvn9ghbEGJH5PvsUDMVJeYNG8RM4StLfHk3QVWiEsVJHacfc0I cUOExJFHh9kgasQlNh65xTKBUWgWklGzkIyahWTULCQtCxhZVjFKpBYkFxQnpeca5qWW6xUn 5haX5qXrJefnbmIER8AzqR2MB3e5H2IU4GBU4uFlsLwVKsSaWFZcmXuIUYKDWUmEV+nlzVAh 3pTEyqrUovz4otKc1OJDjNIcLErivLIbNocKCaQnlqRmp6YWpBbBZJk4OKUaGAuDIpViRSY8 mvvavUU83GPhtY+bk+78uv9+fcSmwqUPbay2rd8hkPxVcm+ZYOmkBcWMkZcuMX+94Xs7TGPV AqZduTxPGDYsjl47rX0T3+nE3acdDIOt78n/O/QuWGNzY/niT+8iozO1tubxV00s2WzClZXE uNBpXe8KNgUmDbtGv0Y+GZ6TO5RYijMSDbWYi4oTAdJPrSZ8AgAA Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4879 Lines: 124 Hi Jaegeuk, > -----Original Message----- > From: Chao Yu [mailto:yuchaochina@hotmail.com] > Sent: Friday, August 21, 2015 11:00 PM > To: Chao Yu > Cc: Jaegeuk Kim; linux-fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org; > linux-f2fs-devel@lists.sourceforge.net > Subject: Re: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid > > > On Aug 21, 2015, at 8:48 PM, Chao Yu wrote: > > > > Hi Jaegeuk, > > > >> -----Original Message----- > >> From: Jaegeuk Kim [mailto:jaegeuk@kernel.org] > >> Sent: Thursday, August 20, 2015 11:35 PM > >> To: Chao Yu > >> Cc: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org; > >> linux-f2fs-devel@lists.sourceforge.net > >> Subject: Re: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated > nid > >> > >> On Thu, Aug 20, 2015 at 05:12:03PM +0800, Chao Yu wrote: > >>> Hi Jaegeuk, > >>> > >>>> -----Original Message----- > >>>> From: Jaegeuk Kim [mailto:jaegeuk@kernel.org] > >>>> Sent: Tuesday, August 18, 2015 4:46 PM > >>>> To: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org; > >>>> linux-f2fs-devel@lists.sourceforge.net > >>>> Cc: Jaegeuk Kim > >>>> Subject: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid > >>>> > >>>> This patch adds a routine which checks the block address of newly allocated nid. > >>>> If an nid has already allocated by other thread due to subtle data races, it > >>>> will result in filesystem corruption. > >>>> So, it needs to check whether its block address was already allocated or not > >>>> in prior to nid allocation as the last chance. > >>>> > >>>> Signed-off-by: Jaegeuk Kim > >>>> --- > >>>> fs/f2fs/node.c | 9 +++++++++ > >>>> 1 file changed, 9 insertions(+) > >>>> > >>>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c > >>>> index 3cc32b8..6bef5a2 100644 > >>>> --- a/fs/f2fs/node.c > >>>> +++ b/fs/f2fs/node.c > >>>> @@ -1573,6 +1573,8 @@ retry: > >>>> > >>>> /* We should not use stale free nids created by build_free_nids */ > >>>> if (nm_i->fcnt && !on_build_free_nids(nm_i)) { > >>>> + struct node_info ni; > >>>> + > >>>> f2fs_bug_on(sbi, list_empty(&nm_i->free_nid_list)); > >>>> list_for_each_entry(i, &nm_i->free_nid_list, list) > >>>> if (i->state == NID_NEW) > >>>> @@ -1583,6 +1585,13 @@ retry: > >>>> i->state = NID_ALLOC; > >>>> nm_i->fcnt--; > >>>> spin_unlock(&nm_i->free_nid_list_lock); > >>>> + > >>>> + /* check nid is allocated already */ > >>>> + get_node_info(sbi, *nid, &ni); > >>>> + if (ni.blk_addr != NULL_ADDR) { > >>> > >>> I didn't get it, why free nid is with non-NULL blkaddr? > >>> Could you please explain more about this? > >> > >> As I wrote in the description, I've been suffering from wrongly added free nids > >> which results in fs corruption. I suspect somewhat race condition in > >> build_free_nids, but it is very subtle to figure out exactly. > >> So, I wrote this patch to fix that. > >> > >> The concern would be performance regarding to cold cache miss at an NAT entry. > >> However, I expect that it would be tolerable since get_node_info will be called > >> after alloc_nid later. > > > > After investigating, I think I can reproduce this bug: > > > > 1. touch a (nid = 4) & touch b (nid = 5) > > 2. sync > > 3. rm a & rm b > > a) rm a to make next_scan_nid = 4. > > b) I change the logical of f2fs code making remove_inode_page failed when > > file b is being removed, so file b's nat entry is not set dirty; > > 4. sync > > 5. touch 1815 files > > 6. echo 3 > /proc/sys/vm/drop_caches > > drop clean nat entry of inode (nid:5), it makes we can pass blkaddr > > verification in add_free_nid: > > if (build) { > > /* do not add allocated nids */ > > > > 7. touch c > > because there is no free nids in cache, we try to build cache by two steps: > > a) build nids by loading from nat pages; > > b) build nids by loading from curseg and try to unload nids which has valid > > blkaddr in curseg. > > > > unfortunately, our build operation is not atomic, so after step a), nid:5 > > After rethinking about this issue on the way coming back home, I find that > it seems not right here, because we will try to check build_lock status in > on_build_free_nids, allocation will not happen during building free nid > cache. I missed that previously. > > Sorry for my wrong conclusion, please ignore them. :( > > I’d like to reinvestigate this issue. I reinvestigate this issue and find one possible call path for reproducing this issue, and I wrote patches for fxing, can you please help to review the following patches? Thanks, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/