From: Pádraig Brady
Subject: Re: Files full of zeros with coreutils-8.11 and xfs (FIEMAP related?)
Date: Fri, 15 Apr 2011 00:29:46 +0100
Message-ID: <4DA7836A.5040604@draigBrady.com>
In-Reply-To: <20110414225904.GK21395@dastard>
References: <20110414102608.GA1678@x4.trippels.de> <20110414120635.GB1678@x4.trippels.de> <20110414140222.GB1679@x4.trippels.de> <4DA70BD3.1070409@draigBrady.com> <4DA717B2.3020305@sandeen.net> <20110414225904.GK21395@dastard>
To: Dave Chinner
Cc: Eric Sandeen, linux-ext4@vger.kernel.org, coreutils@gnu.org, Markus Trippelsdorf, xfs-oss

On 14/04/11 23:59, Dave Chinner wrote:
> On Thu, Apr 14, 2011 at 10:50:10AM -0500, Eric Sandeen wrote:
>> On 4/14/11 9:59 AM, Pádraig Brady wrote:
>>> On 14/04/11 15:02, Markus Trippelsdorf wrote:
>>>>>> Hi Pádraig,
>>>>>>
>>>>>> here you go:
>>>>>> + filefrag -v unwritten.withdata
>>>>>> Filesystem type is: ef53
>>>>>> File size of unwritten.withdata is 5120 (2 blocks, blocksize 4096)
>>>>>>  ext logical physical expected length flags
>>>>>>    0       0   274432            2560 unwritten,eof
>>>>>> unwritten.withdata: 1 extent found
>>>>>>
>>>>>> Please notice that this also happens with ext4 on the same kernel.
>>>>>> Btrfs is fine.
>>>>>
>>>> `filefrag -vs` fixes the issue on both xfs and ext4.
>>>
>>> So in summary, currently on (2.6.39-rc3), the following
>>> will (usually?)
>>> report a single unwritten extent,
>>> on both ext4 and xfs
>>>
>>> fallocate -l 10MiB -n k
>>> dd count=10 if=/dev/urandom conv=notrunc iflag=fullblock of=k
>>> filefrag -v k # grep for an extent without unwritten || fail
>>
>> right, that's what I see too in testing.
>>
>> But would the coreutils install have done a preallocation of the destination file?
>>
>> Otherwise this looks like a different bug...
>>
>>> This particular issue has been discussed so far at:
>>> http://debbugs.gnu.org/cgi/bugreport.cgi?bug=8411
>>> Note there it was stated there that ext4 had this
>>> fixed as of 2.6.39-rc1, so maybe there is something lurking?
>>
>> ext4 got a fix, but not xfs, I guess. My poor brain can't remember, I think I started looking into it, but it's clearly still broken.
>>
>> Still, I don't know for sure what happened to Markus - did something preallocate, in his case?
>
> Unwritten extent mapping behaves in an unexpected way due to
> buffered writeback not occurring immediately. Extent conversion
> doesn't occur until the data is on disk, and for buffered IO you
> need an fdatasync to ensure that has occurred.
>
> That is:
>
> $ xfs_io -f -c "resvsp 0 10m" -c "pwrite 0 5120" -c "bmap -vp" /mnt/test/foo
> wrote 5120/5120 bytes at offset 0
> 5 KiB, 2 ops; 0.0000 sec (62.600 MiB/sec and 25641.0256 ops/sec)
> /mnt/test/foo:
>  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
>    0: [0..20479]:      268984..289463    0 (268984..289463) 20480 10000
>
> Data has not been written yet, so it is still unwritten.
> The same test with a fsync shows:
>
> $ sudo xfs_io -f -c "resvsp 0 10m" -c "pwrite 0 5120" -c fsync -c "bmap -vp" /mnt/test/foo
> wrote 5120/5120 bytes at offset 0
> 5 KiB, 2 ops; 0.0000 sec (87.193 MiB/sec and 35714.2857 ops/sec)
> /mnt/test/foo:
>  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
>    0: [0..15]:         268984..268999    0 (268984..268999)    16 00000
>    1: [16..20479]:     269000..289463    0 (269000..289463) 20464 10000
>
> Everything is fine.
>
> So this seems like an application error to me. If you are going to
> use fiemap to determine what ranges to copy, then you have to
> fdatasync the source file first to guarantee that preallocated
> extents have been converted to written state before mapping the
> file....

Well IMHO there should be a difference between knowing where
you are going to write, and actually writing to disk.
I.E. one shouldn't need to write the whole way to the device
before returning a valid fiemap.

If a particular file system implementation needs to sync to
return a valid fiemap, then it should be implicit.

cheers,
Pádraig.
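
For reference, the fallocate/dd/filefrag reproducer quoted in this thread can be rerun with the flush Dave describes added in. This is only a sketch under a few assumptions: util-linux fallocate, GNU dd and e2fsprogs filefrag are installed, and the filesystem supports preallocation. The function name sync_then_map is made up for illustration and does not come from the thread.

```shell
# Hypothetical helper, not from the thread: write into a preallocated
# file, flush the data, then print the extent map.
sync_then_map() {
    f=$1
    # Preallocate 10 MiB; -n keeps the apparent file size unchanged.
    fallocate -l 10MiB -n "$f" || return 1
    # Same write as the quoted reproducer, plus conv=fdatasync so that
    # dd flushes the written data to disk before it exits.
    dd count=10 if=/dev/urandom of="$f" conv=notrunc,fdatasync \
        iflag=fullblock 2>/dev/null || return 1
    # -s asks filefrag to sync the file before requesting the mapping.
    filefrag -vs "$f"
}
```

Calling `sync_then_map k` would be expected to show the first extent without the unwritten flag if the explanation above holds; that has not been checked here against 2.6.39-rc3. Either flush alone (dd's conv=fdatasync or filefrag's -s) should be enough; both are shown only to illustrate the two places a sync can go.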