Date: Sat, 12 Feb 2011 19:25:58 +0100
Subject: Re: [performance bug] kernel building regression on 64 LCPUs machine
From: Corrado Zoccolo
To: "Alex,Shi"
Cc: "Li, Shaohua", Vivek Goyal, "jack@suse.cz", "tytso@mit.edu", "jaxboe@fusionio.com", "linux-kernel@vger.kernel.org", "Chen, Tim C"

On Sat, Feb 12, 2011 at 10:21 AM, Alex,Shi wrote:
> On Wed, 2011-01-26 at 16:15 +0800, Li, Shaohua wrote:
>> On Thu, Jan 20, 2011 at 11:16:56PM +0800, Vivek Goyal wrote:
>> > On Wed, Jan 19, 2011 at 10:03:26AM +0800, Shaohua Li wrote:
>> > > add Jan and Theodore to the loop.
>> > >
>> > > On Wed, 2011-01-19 at 09:55 +0800, Shi, Alex wrote:
>> > > > Shaohua and I tested kernel building performance on latest kernel.
>> > > > and found it is drop about 15% on our 64 LCPUs NHM-EX machine on ext4 file
>> > > > system. We find this performance dropping is due to commit
>> > > > 749ef9f8423054e326f. If we revert this patch or just change the
>> > > > WRITE_SYNC back to WRITE in jbd2/commit.c file. the performance can be
>> > > > recovered.
>> > > >
>> > > > iostat report show with the commit, read request merge number increased
>> > > > and write request merge dropped. The total request size increased and
>> > > > queue length dropped. So we tested another patch: only change WRITE_SYNC
>> > > > to WRITE_SYNC_PLUG in jbd2/commit.c, but nothing effected.
>> > > since WRITE_SYNC_PLUG doesn't work, this isn't a simple no-write-merge issue.
>> > >
>> >
>> > Yep, it does sound like reduce write merging. But moving journal commits
>> > back to WRITE, then fsync performance will drop as there will be idling
>> > introduced between fsync thread and journalling thread. So that does
>> > not sound like a good idea either.
>> >
>> > Secondly, in presence of mixed workload (some other sync read happening)
>> > WRITES can get less bandwidth and sync workload much more. So by
>> > marking journal commits as WRITES you might increase the delay there
>> > in completion in presence of other sync workload.
>> >
>> > So Jan Kara's approach makes sense that if somebody is waiting on
>> > commit then make it WRITE_SYNC otherwise make it WRITE. Not sure why
>> > did it not work for you. Is it possible to run some traces and do
>> > more debugging that figure out what's happening.
>> Sorry for the long delay.
>>
>> Looks fedora enables ccache by default. While our kbuild test is on ext4 disk
>> but rootfs is on ext3 where ccache cache files live. Jan's patch only covers
>> ext4, maybe this is the reason.
>> I changed jbd to use WRITE for journal_commit_transaction. With the change and
>> Jan's patch, the test seems fine.
> Let me clarify the bug situation again.
> With the following scenarios, the regression is clear.
> 1, ccache_dir setup at rootfs that format is ext3 on /dev/sda1; 2,
> kbuild on /dev/sdb1 with ext4.
> but if we disable the ccache, only do kbuild on sdb1 with ext4. There is
> no regressions whenever with or without Jan's patch.
> So, problem focus on the ccache scenario, (from fedora 11, ccache is
> default setting).
>
> If we compare the vmstat output with or without ccache, there is too
> many write when ccache enabled. According the result, it should to do
> some tunning on ext3 fs.

Is ext3 configured with data ordered or writeback?
I think ccache might be performing fsyncs, and this is a bad workload
for ext3, especially in ordered mode.
It might be that my patch introduced a regression in ext3 fsync
performance, but I don't understand how reverting only the change in
jbd2 (that is the ext4 specific journaling daemon) could restore it.
The two partitions are on different disks, so each one should be
isolated from the I/O perspective (do they share a single
controller?). The only interaction I see happens at the VM level,
since changing performance of any of the two changes the rate at which
pages can be cleaned.

Corrado
>
>
> vmstat average output per 10 seconds, without ccache
> --procs-- -------------memory------------- -swap-- -----io------ ----system---- ---------cpu----------
>    r    b swpd       free    buff    cache  si  so     bi     bo      in     cs   us  sy   id   wa  st
> 26.8  0.5  0.0 63930192.3  9677.0  96544.9 0.0 0.0 2486.9  337.9 17729.9 4496.4 17.5 2.5 79.8  0.2 0.0
>
> vmstat average output per 10 seconds, with ccache
> --procs-- -------------memory------------- -swap-- -----io------ ----system---- ---------cpu----------
>    r    b swpd       free    buff    cache  si  so     bi     bo      in     cs   us  sy   id   wa  st
>  2.4 40.7  0.0 64316231.0 17260.6 119533.8 0.0 0.0 2477.6 1493.1  8606.4 3565.2  2.5 1.1 83.0 13.5 0.0
>
>
>>
>> Jan,
>> can you send a patch with similar change for ext3? So we can do more tests.
>>
>> Thanks,
>> Shaohua
>

--
__________________________________________________________________________

dott. Corrado Zoccolo                          mailto:czoccolo@gmail.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
                               Tales of Power - C. Castaneda
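
For anyone who wants to answer the "data ordered or writeback" question above on their own machine, here is a minimal sketch in Python. It only parses text in the /proc/mounts layout (device, mount point, fs type, options). The helper name ext3_data_modes is made up for illustration and is not from this thread. If a mount lists no data= option, the function returns None for it instead of guessing which mode the kernel defaulted to.

```python
def ext3_data_modes(mounts_text):
    """Map each ext3 mount point to its data= option (None if not listed).

    mounts_text is expected to be in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not ext3.
        if len(fields) < 4 or fields[2] != "ext3":
            continue
        mode = None
        for opt in fields[3].split(","):
            if opt.startswith("data="):
                mode = opt[len("data="):]
        modes[fields[1]] = mode
    return modes
```

On the test box this would be called as ext3_data_modes(open("/proc/mounts").read()); an entry such as "/" mapped to "ordered" would mean the rootfs holding the ccache directory runs in ordered mode.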