Message-ID: <4C581A24.6090709@uni-konstanz.de>
Date: Tue, 03 Aug 2010 15:31:16 +0200
From: Kay Diederichs
User-Agent: Thunderbird 2.0.0.24 (X11/20100721)
MIME-Version: 1.0
To: Eric Sandeen
CC: Dave Chinner, linux, Ext4 Developers List, Karsten Schaefer, "Ted Ts'o"
Subject: Re: ext4 performance regression 2.6.27-stable versus 2.6.32 and later
References: <4C508A54.7070002@uni-konstanz.de> <20100729232856.GP655@dastard>
 <4C56DBB0.9080405@uni-konstanz.de> <4C56EE67.4070905@redhat.com>
In-Reply-To: <4C56EE67.4070905@redhat.com>
Content-Type: multipart/signed; protocol="application/x-pkcs7-signature";
 micalg=sha1; boundary="------------ms080903040607090303070102"
X-Mailing-List: linux-kernel@vger.kernel.org

This is a cryptographically signed message in MIME format.

--------------ms080903040607090303070102
Content-Type: multipart/mixed;
 boundary="------------080907010207070709010903"

This is a multi-part message in MIME format.
--------------080907010207070709010903
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Eric Sandeen wrote:
> On 08/02/2010 09:52 AM, Kay Diederichs wrote:
>> Dave,
>>
>> as you suggested, we reverted "ext4: Avoid group preallocation for
>> closed files" and this indeed fixes a big part of the problem: after
>> booting the NFS server we get
>>
>> NFS-Server: turn5 2.6.32.16p i686
>> NFS-Client: turn10 2.6.18-194.8.1.el5 x86_64
>>
>> exported directory on the nfs-server:
>> /dev/md5 /mnt/md5 ext4
>> rw,seclabel,noatime,barrier=1,stripe=512,data=writeback 0 0
>>
>> 48 seconds for preparations
>> 28 seconds to rsync 100 frames with 597M from nfs directory
>> 57 seconds to rsync 100 frames with 595M to nfs directory
>> 70 seconds to untar 24353 kernel files with 323M to nfs directory
>> 57 seconds to rsync 24353 kernel files with 323M from nfs directory
>> 133 seconds to run xds_par in nfs directory
>> 425 seconds to run the script
>
> Interesting, I had found this commit to be a problem for small files
> which are constantly created & deleted; the commit had the effect of
> packing the newly created files in the first free space that could be
> found, rather than walking down the disk leaving potentially fragmented
> freespace behind (see seekwatcher graph attached). Reverting the patch
> sped things up for this test, but left the filesystem freespace in bad
> shape.
>
> But you seem to see one of the largest effects in here:
>
> 261 seconds to rsync 100 frames with 595M to nfs directory
> vs
> 57 seconds to rsync 100 frames with 595M to nfs directory
>
> with the patch reverted making things go faster. So you are doing 100
> 6MB writes to the server, correct? Is the filesystem mkfs'd fresh
> before each test, or is it aged? If not mkfs'd, is it at least
> completely empty prior to the test, or does data remain on it? I'm just
> wondering if fragmented freespace is contributing to this behavior as
> well. If there is fragmented freespace, then with the patch I think the
> allocator is more likely to hunt around for small discontiguous chunks
> of free space, rather than going further out in the disk looking for a
> large area to allocate from.
>
> It might be interesting to use seekwatcher on the server to visualize
> the allocation/IO patterns for the test running just this far?
>
> -Eric
>
>
> ------------------------------------------------------------------------
>

Eric,

seekwatcher does not seem to understand the blktrace output of old
kernels, so I rolled my own primitive plotting, e.g.

blkparse -i md5.xds_par.2.6.32.16p_run1 > blkparse.out
grep flush blkparse.out | grep W > flush_W
grep flush blkparse.out | grep R > flush_R
grep nfsd blkparse.out | grep R > nfsd_R
grep nfsd blkparse.out | grep W > nfsd_W
grep sync blkparse.out | grep R > sync_R
grep sync blkparse.out | grep W > sync_W
gnuplot<