Received: by 2002:a25:7ec1:0:0:0:0:0 with SMTP id z184csp9044083ybc; Fri, 29 Nov 2019 23:23:54 -0800 (PST) X-Google-Smtp-Source: APXvYqzRQ5OiB33et7rwFVCr2HI3AW3cBG/iIYXa9NtkRoA05LuOPIlCoaEH7gVaYJ6yBWu8rUDt X-Received: by 2002:a05:6402:1601:: with SMTP id f1mr5103930edv.230.1575098634136; Fri, 29 Nov 2019 23:23:54 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1575098634; cv=none; d=google.com; s=arc-20160816; b=VpafZnM1UR1JeRqU9yOKR8J/zvlJjgrRuxT47B981U7Aej7SeuU7n31pZNbn70s2Ub /mOB+uaCS4Xa7UpsEXE4yxefTCeiIZc77a80I7Z1Y0p1NxgtRIrwj4F434EiMQO1qn0c m4Un63vUBmIVRHRufEJhwn+6KCA7lx4WH05CFsgHl1MVTfmGmB9RVHNUVH1c0mDLEKFX 1bdvR7+2lUb40NVU+Z7jtnVG1i7SD9qitsR8WwCcYeyhywkWScVwwrAyZNH18azJE0zM rVD89OrxBdoGeFMYM+YOzAbTxfBDT1h2J+GNmWN1sGnoQ1w9NSqNQvLm9MFVJ8oX4k5t hBoA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=sSAnIfpV5RUTLeHYiU3Ovk/e97+JCZSFeQyDsoAvFRA=; b=q2lFxYCL1Bb1wvk/UmzKtVBxjbJrS2kj7e1K9ZJiivACiwPkcYlXd5D6N5Psb05DdB kJItgyBj0pCqgzAKtemW0xIyLkTS6KDRwmy1waGbrwr5apCSb7WQXCIT9DJCMQmiwk6F SwP27Rq9Q/aEwwcIQX5uWObsLFITlUyeOImt23oCfs0GfZIrqiMc2z70Wxkk8J8+6MfI LNeUWE1KlaUkowxyNxMpRnncIFWnpQ6QR/geE19i21fXCKJo1KVWb+Hl1ntXsYdJcKoz DzrjrQNT55Q/BagmgvWi6trqtMxChVUNu9vRueJZoaRxxamhMJHKU/qVzGKZtG+zbPBQ QhZA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id r17si15763472eje.139.2019.11.29.23.23.30; Fri, 29 Nov 2019 23:23:54 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726924AbfK3HTu (ORCPT + 99 others); Sat, 30 Nov 2019 02:19:50 -0500 Received: from szxga06-in.huawei.com ([45.249.212.32]:42362 "EHLO huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725811AbfK3HTu (ORCPT ); Sat, 30 Nov 2019 02:19:50 -0500 Received: from DGGEMS409-HUB.china.huawei.com (unknown [172.30.72.60]) by Forcepoint Email with ESMTP id B6418840280E0BEBEE0C; Sat, 30 Nov 2019 15:19:47 +0800 (CST) Received: from [10.134.22.195] (10.134.22.195) by smtp.huawei.com (10.3.19.209) with Microsoft SMTP Server (TLS) id 14.3.439.0; Sat, 30 Nov 2019 15:19:45 +0800 Subject: Re: [PATCH] f2fs: Fix direct IO handling To: Damien Le Moal , Ritesh Harjani , "linux-f2fs-devel@lists.sourceforge.net" , "linux-kernel@vger.kernel.org" , Jaegeuk Kim CC: Javier Gonzalez , Shinichiro Kawasaki References: <20191126075719.1046485-1-damien.lemoal@wdc.com> <20191126083443.F1FD5A405B@b06wcsmtp001.portsmouth.uk.ibm.com> From: Chao Yu Message-ID: Date: Sat, 30 Nov 2019 15:19:44 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="windows-1252" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.134.22.195] X-CFilter-Loop: Reflected Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org -Cc fsdevel mailing list On 2019/11/28 10:10, Damien Le Moal wrote: > On 2019/11/26 17:34, Ritesh Harjani wrote: >> Hello Damien, >> >> IIUC, you are trying to fix a stale data read by DIO read for the case >> you explained in your patch w.r.t. DIO-write forced to write as buffIO. >> >> Coincidentally I was just looking at the same code path just now. >> So I do have a query to you/f2fs group. Below could be silly one, as I >> don't understand F2FS in great detail. >> >> How is the stale data by DIO read, is protected against a mmap >> writes via f2fs_vm_page_mkwrite? >> >> f2fs_vm_page_mkwrite() f2fs_direct_IO (read) >> filemap_write_and_wait_range() - writepages lock_page - writepage unlock_page lock_page >> -> f2fs_get_blocks() - f2fs_map_blocks >> -> submit_bio() >> >> -> set_page_dirty() unlock_page I guess lock range is as above, so the race can happen, however, 1) If mkwrite() creates data in hole, direct_IO->f2fs_map_blocks should return NEW_ADDR, which means that is a hole of file, so direct_IO should read all zeroed data. 2) If mkwrite() overwrite data in block, mkwrite->f2fs_get_blocks won't change old block address, then direct_IO->f2fs_map_blocks could get that block address, and won't read stale data. But I doubt could we read stale data with below race condition: kworker DIO reader - writepages - f2fs_map_blocks - get old block address - writepage trigger OPU, update old block address to new one someone trigger checkpoint, data in old block address becomes stale, then anyone else can write data into there. - submit_bio get stale data Or am I missing something that maybe vfs has did such synchronization. Thanks, >> >> Is above race possible with current f2fs code? >> i.e. f2fs_direct_IO could read the stale data from the blocks >> which were allocated due to mmap fault? > > The faulted page is locked until the fault is fully processed so direct > IO has to wait for that to complete first. > >> >> Am I missing something here? >> >> -ritesh >> >> On 11/26/19 1:27 PM, Damien Le Moal wrote: >>> f2fs_preallocate_blocks() identifies direct IOs using the IOCB_DIRECT >>> flag for a kiocb structure. However, the file system direct IO handler >>> function f2fs_direct_IO() may have decided that a direct IO has to be >>> exececuted as a buffered IO using the function f2fs_force_buffered_io(). >>> This is the case for instance for volumes including zoned block device >>> and for unaligned write IOs with LFS mode enabled. >>> >>> These 2 different methods of identifying direct IOs can result in >>> inconsistencies generating stale data access for direct reads after a >>> direct IO write that is treated as a buffered write. Fix this >>> inconsistency by combining the IOCB_DIRECT flag test with the result >>> of f2fs_force_buffered_io(). >>> >>> Reported-by: Javier Gonzalez >>> Signed-off-by: Damien Le Moal >>> --- >>> fs/f2fs/data.c | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c >>> index 5755e897a5f0..8ac2d3b70022 100644 >>> --- a/fs/f2fs/data.c >>> +++ b/fs/f2fs/data.c >>> @@ -1073,6 +1073,8 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from) >>> int flag; >>> int err = 0; >>> bool direct_io = iocb->ki_flags & IOCB_DIRECT; >>> + bool do_direct_io = direct_io && >>> + !f2fs_force_buffered_io(inode, iocb, from); >>> >>> /* convert inline data for Direct I/O*/ >>> if (direct_io) { >>> @@ -1081,7 +1083,7 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from) >>> return err; >>> } >>> >>> - if (direct_io && allow_outplace_dio(inode, iocb, from)) >>> + if (do_direct_io && allow_outplace_dio(inode, iocb, from)) >>> return 0; >>> >>> if (is_inode_flag_set(inode, FI_NO_PREALLOC)) >>> >> >> > >