Date: Sat, 27 Jun 2009 08:44:04 +0200
From: Ingo Molnar
To: Vince Weaver
Cc: Peter Zijlstra, Paul Mackerras, linux-kernel@vger.kernel.org,
    Mike Galbraith
Subject: [numbers] perfmon/pfmon overhead of 17%-94%
Message-ID: <20090627064404.GA19368@elte.hu>
In-Reply-To: <20090627060432.GB16200@elte.hu>

* Ingo Molnar wrote:

> Besides, you compare perfcounters to perfmon (to which you seem to
> be a contributor), while in reality perfmon has much, much worse
> (and unfixable, because designed-in) measurement overhead.
>
> So why are you criticising perfcounters for a 5000-cycle
> measurement overhead, while perfmon has a huge measurement
> overhead of _hundreds of millions_ of cycles (per second) for
> various realistic workloads? [ In fact, in one of the scheduler
> tests perfmon has a whopping measurement overhead of _nine
> billion_ cycles: it increased the total runtime of the workload
> from 3.3 seconds to 6.6 seconds. (!) ]

Here are the more detailed perfmon/pfmon measurement overhead
numbers. The test system is an "Intel Core2 E6800 @ 2.93GHz" with
1 GB of RAM, running a default Fedora install.

I measured two workloads:

  hackbench.c       # messaging server benchmark
  pipe-test-1m.c    # does 1 million pipe ops, similar to lat_pipe

v2.6.28 + perfmon patches (v3, full):

  ./hackbench 10

     0.496400985  seconds time elapsed   ( +-  1.699% )

  pfmon --follow-fork --aggregate-results ./hackbench 10

     0.580812999  seconds time elapsed   ( +-  2.233% )

I.e. this workload runs 17% slower under pfmon; the measurement
overhead is about 1.45 billion cycles.

Furthermore, when running a 'pipe latency benchmark', an app that
does one million pipe reads and writes between two tasks (source
code attached below), I measured the following perfmon/pfmon
overhead:

  ./pipe-test-1m

     3.344280347  seconds time elapsed   ( +-  0.361% )

  pfmon --follow-fork --aggregate-results ./pipe-test-1m

     6.508737983  seconds time elapsed   ( +-  0.243% )

That is a measurement overhead of about 94%, or about 9.2 _billion_
cycles, on this test system.

These perfmon/pfmon overhead figures are consistently reproducible;
they also show up on other test systems and with other workloads.
Basically, for any app that involves task creation or
context-switching, perfmon adds considerable runtime overhead, well
beyond the overhead of perfcounters.
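As a cross-check, the slowdown percentages follow directly from the
elapsed-time deltas above. Below is a minimal sketch of that
arithmetic (my own, not part of the original measurements; the
mail's 1.45 billion cycle figure for hackbench is larger than
delta times clock, presumably because pfmon aggregates cycles over
the benchmark's many tasks):

  /*
   * Back-of-the-envelope check of the slowdown figures quoted above.
   * For the (essentially serial) pipe test, wall-clock delta * clock
   * reproduces the ~9.2 billion cycle figure; for hackbench it only
   * gives a lower bound, since that workload runs many tasks.
   */
  #include <stdio.h>

  #define CPU_HZ 2.93e9   /* nominal clock of the Core2 E6800 test box */

  static void slowdown(const char *name, double plain, double pfmon)
  {
          double delta = pfmon - plain;

          printf("%-14s: %.1f%% slower, +%.3fs wall-clock"
                 " (~%.2f billion cycles)\n",
                 name, delta / plain * 100.0, delta,
                 delta * CPU_HZ / 1e9);
  }

  int main(void)
  {
          slowdown("hackbench 10", 0.496400985, 0.580812999);
          slowdown("pipe-test-1m", 3.344280347, 6.508737983);

          return 0;
  }

This prints ~17.0% for hackbench and ~94.6% for pipe-test-1m,
matching the figures quoted above.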
	Ingo

-----------------{ pipe-test-1m.c }-------------------->

#include <unistd.h>	/* pipe(), fork(), read(), write() */

#define LOOPS 1000000

int main(void)
{
	int pipe_1[2], pipe_2[2];
	int m = 0, i;

	pipe(pipe_1);
	pipe(pipe_2);

	if (!fork()) {
		/* child: read a token, echo it back to the parent */
		for (i = 0; i < LOOPS; i++) {
			read(pipe_1[0], &m, sizeof(int));
			write(pipe_2[1], &m, sizeof(int));
		}
	} else {
		/* parent: send a token, wait for the echo */
		for (i = 0; i < LOOPS; i++) {
			write(pipe_1[1], &m, sizeof(int));
			read(pipe_2[0], &m, sizeof(int));
		}
	}

	return 0;
}
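( For completeness: the attachment builds standalone with any C
  compiler; something along the lines of

    gcc -O2 -o pipe-test-1m pipe-test-1m.c
    time ./pipe-test-1m

  reproduces the un-instrumented baseline run above. )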