Date: Tue, 2 Oct 2007 14:21:18 +0200
From: Jens Axboe
To: "Alan D. Brunelle"
Cc: Mathieu Desnoyers, linux-kernel@vger.kernel.org, btrace
Subject: Re: Linux Kernel Markers - performance characterization with large IO load on large-ish system
Message-ID: <20071002122118.GL5236@kernel.dk>
References: <46F92219.9020406@hp.com> <20070925171349.GA6057@Krystal> <46FA7A86.6090804@hp.com>
In-Reply-To: <46FA7A86.6090804@hp.com>

On Wed, Sep 26 2007, Alan D. Brunelle wrote:
> Mathieu Desnoyers wrote:
>> * Alan D. Brunelle (Alan.Brunelle@hp.com) wrote:
>>> Taking Linux 2.6.23-rc6 + 2.6.23-rc6-mm1 as a basis, I took some sample
>>> runs of the following on both it and on it with Mathieu Desnoyers'
>>> 11-patch marker series (19 September 2007) applied.
>>>
>>> * 32-way IA64 + 132GiB + 10 FC adapters + 10 HP MSA 1000s (one 72GiB
>>>   volume per MSA used)
>>>
>>> * 10 runs with each configuration, averages shown below
>>>   o 2.6.23-rc6 + 2.6.23-rc6-mm1 without blktrace running
>>>   o 2.6.23-rc6 + 2.6.23-rc6-mm1 with blktrace running
>>>   o 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers without blktrace running
>>>   o 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers with blktrace running
>>>
>>> * A run consists of doing the following in parallel:
>>>   o Make an ext3 FS on each of the 10 volumes
>>>   o Mount & unmount each volume
>>>     + The unmounting generates a tremendous amount of writes to the
>>>       disks, thus stressing the intended storage devices (10 volumes)
>>>       plus the separate volume holding all the blktrace data (when blk
>>>       tracing is enabled).
>>>     + Note that the times reported below cover only the
>>>       make/mount/unmount time - the actual blktrace runs extended
>>>       beyond the times measured (it took quite a while for the blk
>>>       trace data to be output). We're only concerned with the impact
>>>       on the "application" performance in this instance.
>>>
>>> Results are:
>>>
>>> Kernel                                  w/out BT  STDDEV      w/ BT  STDDEV
>>> --------------------------------------  --------- ------  ---------  ------
>>> 2.6.23-rc6 + 2.6.23-rc6-mm1             14.679982   0.34  27.754796    2.09
>>> 2.6.23-rc6 + 2.6.23-rc6-mm1 + markers   14.993041   0.59  26.694993    3.23
>>>
>> Interesting results, although given the std dev we cannot say that either
>> solution has much impact.
>>
>> Also, it could be interesting to add the "blktrace compiled out" case as a
>> baseline.
>>
>> Thanks for running those tests,
>>
>> Mathieu
>
> Mathieu:
>
> Here are the results from 6 different kernels (including ones with
> blktrace not configured in), now performing 40 runs per kernel.
>
> o All kernels start off with Linux 2.6.23-rc6 + 2.6.23-rc6-mm1.
>
> o '- bt cfg' or '+ bt cfg' means a kernel without or with blktrace
>   configured in, respectively.
>
> o '- markers' or '+ markers' means a kernel without or with the 11-patch
>   marker series applied, respectively.
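
[Illustrative aside, not part of the original thread: a minimal sketch of the
kind of run described above. The device paths, the mount points, and the use
of Python are assumptions made purely for illustration; Alan's actual test
harness is not shown in this thread.]

import subprocess
import time
from concurrent.futures import ThreadPoolExecutor

# Ten hypothetical volumes, one per MSA; substitute the real device paths.
VOLUMES = [f"/dev/sd{c}1" for c in "bcdefghijk"]

def exercise(dev: str) -> None:
    """mkfs + mount + unmount one volume; the unmount flushes the writes."""
    mnt = "/mnt/" + dev.rsplit("/", 1)[-1]
    subprocess.run(["mkfs.ext3", "-q", dev], check=True)
    subprocess.run(["mkdir", "-p", mnt], check=True)
    subprocess.run(["mount", dev, mnt], check=True)
    subprocess.run(["umount", mnt], check=True)

start = time.time()
with ThreadPoolExecutor(max_workers=len(VOLUMES)) as pool:
    list(pool.map(exercise, VOLUMES))    # all ten volumes in parallel
print(f"make/mount/unmount time: {time.time() - start:.6f} s")

[Only the make/mount/unmount phase is timed, matching the note above that the
blktrace data is allowed to drain after the measured interval.]
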
>
> 38 runs without blk traces being captured (dropped hi/lo value from 40
> runs):
>
> Kernel Options        Min val    Avg val    Max val    Std Dev
> ------------------  ---------  ---------  ---------  ---------
> - markers - bt cfg  15.349127  16.169459  16.372980   0.184417
> + markers - bt cfg  15.280382  16.202398  16.409257   0.191861
>
> - markers + bt cfg  14.464366  14.754347  16.052306   0.463665
> + markers + bt cfg  14.421765  14.644406  15.690871   0.233885
>
> 38 runs with blk traces being captured (dropped hi/lo value from 40 runs):
>
> Kernel Options        Min val    Avg val    Max val    Std Dev
> ------------------  ---------  ---------  ---------  ---------
> - markers + bt cfg  24.675859  28.480446  32.571484   1.713603
> + markers + bt cfg  18.713280  27.054927  31.684325   2.857186
>
> o It is not at all clear why running without blktrace configured into the
>   kernel is slower than running with it configured in (a 9.6 to 10.6%
>   reduction).
>
> o The data is still not conclusive on whether the marker patches change
>   performance when we're not gathering traces; any change appears minimal
>   at worst for this test.
>
> o The data so far still doesn't conclusively show a win for the markers
>   even when we are capturing traces, although the averages certainly seem
>   to be in their favor.
>
> One concern that I should be able to deal with easily is the choice of IO
> scheduler, both for the volumes the test is performed on and for the one
> used to store the blk traces (when enabled). Right now I was using the
> default CFQ, when perhaps NOOP or DEADLINE would be a better choice. If
> there is enough interest in seeing how that changes things, I could try to
> get some runs in later this week.

Alan,

Thanks for running these numbers as well. I don't think you have to bother
with it any more. My main concern was a performance regression increasing
the overhead of running blktrace, so while we (well, you :-)) could run more
tests, I'd say the above is Good Enough for me.

Mathieu, you can add my Acked-by: Jens Axboe to the blktrace part of your
marker series.

I do wonder about that performance _increase_ with blktrace enabled. I
remember that we have seen and discussed something like this before; it's
still a puzzle to me...

-- 
Jens Axboe
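
[Illustrative aside: a minimal sketch of the reduction behind the tables
above - drop the single highest and lowest of the 40 timings, then report
min/avg/max/std dev over the remaining 38. The input format (one elapsed
time per line on stdin) and the script/file names are assumptions, and the
thread does not say whether the sample or population standard deviation was
used; the sample form is shown here.]

import statistics
import sys

# Usage (hypothetical names): python3 trim_stats.py < runs.txt
# where runs.txt holds one elapsed time per line, 40 lines per kernel.
times = sorted(float(line) for line in sys.stdin if line.strip())
trimmed = times[1:-1]    # drop the lo and hi value (40 -> 38 samples)

print("Min val %9.6f  Avg val %9.6f  Max val %9.6f  Std Dev %9.6f" % (
      min(trimmed), statistics.mean(trimmed),
      max(trimmed), statistics.stdev(trimmed)))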