From: Kay Diederichs
Subject: Re: ext4 performance regression 2.6.27-stable versus 2.6.32 and later
Date: Tue, 03 Aug 2010 15:31:16 +0200
Message-ID: <4C581A24.6090709@uni-konstanz.de>
In-Reply-To: <4C56EE67.4070905@redhat.com>
References: <4C508A54.7070002@uni-konstanz.de> <20100729232856.GP655@dastard> <4C56DBB0.9080405@uni-konstanz.de> <4C56EE67.4070905@redhat.com>
To: Eric Sandeen
Cc: Dave Chinner, linux, Ext4 Developers List, Karsten Schaefer, Ted Ts'o
List-Id: linux-ext4.vger.kernel.org

Eric Sandeen wrote:
> On 08/02/2010 09:52 AM, Kay Diederichs wrote:
>> Dave,
>>
>> as you suggested, we reverted "ext4: Avoid group preallocation for
>> closed files" and this indeed fixes a big part of the problem: after
>> booting the NFS server we get
>>
>> NFS-Server: turn5 2.6.32.16p i686
>> NFS-Client: turn10 2.6.18-194.8.1.el5 x86_64
>>
>> exported directory on the nfs-server:
>> /dev/md5 /mnt/md5 ext4
>> rw,seclabel,noatime,barrier=1,stripe=512,data=writeback 0 0
>>
>> 48 seconds for preparations
>> 28 seconds to rsync 100 frames with 597M from nfs directory
>> 57 seconds to rsync 100 frames with 595M to nfs directory
>> 70 seconds to untar 24353 kernel files with 323M to nfs directory
>> 57 seconds to rsync 24353 kernel files with 323M from nfs directory
>> 133 seconds to run xds_par in nfs directory
>> 425 seconds to run the script
>
> Interesting, I had found this commit to be a problem for small files
> which are constantly created & deleted; the commit had the effect of
> packing the newly created files in the first free space that could be
> found, rather than walking down the disk leaving potentially fragmented
> freespace behind (see seekwatcher graph attached). Reverting the patch
> sped things up for this test, but left the filesystem freespace in bad
> shape.
>
> But you seem to see one of the largest effects in here:
>
> 261 seconds to rsync 100 frames with 595M to nfs directory
> vs
> 57 seconds to rsync 100 frames with 595M to nfs directory
>
> with the patch reverted making things go faster. So you are doing 100
> 6MB writes to the server, correct? Is the filesystem mkfs'd fresh
> before each test, or is it aged? If not mkfs'd, is it at least
> completely empty prior to the test, or does data remain on it? I'm just
> wondering if fragmented freespace is contributing to this behavior as
> well. If there is fragmented freespace, then with the patch I think the
> allocator is more likely to hunt around for small discontiguous chunks
> of free space, rather than going further out in the disk looking for a
> large area to allocate from.
>
> It might be interesting to use seekwatcher on the server to visualize
> the allocation/IO patterns for the test running just this far?
>
> -Eric

Eric,

seekwatcher does not seem to understand the blktrace output of old
kernels, so I rolled my own primitive plotting, e.g.

blkparse -i md5.xds_par.2.6.32.16p_run1 > blkparse.out
grep flush blkparse.out | grep W > flush_W
grep flush blkparse.out | grep R > flush_R
grep nfsd blkparse.out | grep R > nfsd_R
grep nfsd blkparse.out | grep W > nfsd_W
grep sync blkparse.out | grep R > sync_R
grep sync blkparse.out | grep W > sync_W
gnuplot<
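
The grep pairs above select on the process name and on the letter R or W anywhere in the line, so an R or W elsewhere on the line (in a process name, for instance) also matches. Below is a minimal sketch of a stricter split. It assumes blkparse's default text layout, with the RWBS flags in field 7 and the process name in square brackets as the last field. The function name split_blkparse and its output-directory argument are made up for illustration and are not part of blktrace.

```shell
#!/bin/sh
# Split blkparse text output into one file per (process, direction) pair.
# Assumption: default blkparse line format, i.e. field 7 holds the RWBS
# flags and the last field is the process name in square brackets.
split_blkparse() {
    src=$1      # blkparse text output, e.g. blkparse.out
    outdir=$2   # directory that receives flush_W, nfsd_R, ...
    mkdir -p "$outdir"
    for proc in flush nfsd sync; do
        for rw in R W; do
            # Keep a line only if its last field is a bracketed process
            # name containing $proc and its RWBS field contains $rw.
            awk -v p="$proc" -v d="$rw" '
                $NF ~ /^\[.*\]$/ && index($NF, p) && index($7, d)
            ' "$src" > "$outdir/${proc}_${rw}"
        done
    done
}
```

Calling `split_blkparse blkparse.out .` would write the same six file names as the grep commands into the current directory. As with grep, the process match is a substring match, so "sync" also selects a process named rsync.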