Message-ID: <18884.6416.293431.357565@cargo.ozlabs.ibm.com>
Date: Sat, 21 Mar 2009 09:30:40 +1100
From: Paul Mackerras <paulus@samba.org>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH/RFC] perfcounters: record time running and time enabled
 for each counter
In-Reply-To: <1237560631.24626.103.camel@twins>
References: <18883.34555.748843.35920@cargo.ozlabs.ibm.com>
	<1237560631.24626.103.camel@twins>

Peter Zijlstra writes:

> On Fri, 2009-03-20 at 23:07 +1100, Paul Mackerras wrote:
> > Impact: new functionality
> > 
> > Currently, if there are more counters enabled than can fit on the CPU,
> > the kernel will multiplex the counters on to the hardware using
> > round-robin scheduling.  That isn't too bad for sampling counters, but
> > for counting counters it means that the value read from a counter
> > represents some unknown fraction of the true count of events that
> > occurred while the counter was enabled.
> > 
> > This remedies the situation by keeping track of how long each counter
> > is enabled for, and how long it is actually on the cpu and counting
> > events.  These times are recorded in nanoseconds using the task clock
> > for per-task counters and the cpu clock for per-cpu counters.
> 
> Can't we do this by simply adding some software counters? A task local
> to each group and a task local on its own.
> 
> Since the solo task local will always be scheduled, you can get the
> sampling fraction by comparing the group task local one to the solo one.

You can get time_running that way but you can't really get
time_enabled accurately in general, since there's no way to enable or
disable two groups simultaneously.  (Yes, in the special case where
you are doing self-monitoring and you have no other counters, you can
use prctl(PR_TASK_PERF_COUNTERS_{EN,DIS}ABLE), but that's not a
general solution.)
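
To illustrate how the two times are meant to be used, here is a minimal
sketch of the scaling a reader of a counting counter could apply. It is
not code from the patch: the function name scale_count and the rounding
are assumptions made for illustration, and it assumes both times are in
nanoseconds from the same clock.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate the count that would have been seen had the counter been on
 * the hardware for the whole time it was enabled.
 *
 * count        - raw value read from the counter
 * time_enabled - ns the counter was enabled
 * time_running - ns the counter was actually on the cpu and counting
 *
 * Hypothetical helper, not part of the patch.  Uses double arithmetic,
 * so results lose precision once values exceed 2^53.
 */
uint64_t scale_count(uint64_t count, uint64_t time_enabled,
		     uint64_t time_running)
{
	/* A counter cannot run for longer than it was enabled. */
	assert(time_running <= time_enabled);

	/* Never scheduled: nothing was counted, nothing to extrapolate. */
	if (time_running == 0)
		return 0;

	/* Never multiplexed out: the raw count is already exact. */
	if (time_running == time_enabled)
		return count;

	return (uint64_t)((double)count * (double)time_enabled /
			  (double)time_running + 0.5);
}
```

For example, a counter that was enabled for 4 ms but on the hardware for
only 1 ms gets its raw count multiplied by 4.  The result is an
estimate: it assumes the event rate while the counter was off the cpu
matched the rate while it was on.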

The other thing is that the two extra software counters will add
overhead, whereas my patch will add negligible overhead - just some
integer arithmetic and accesses to cache lines which will already be
in exclusive state due to the stores we're already doing to things
such as counter->state.
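
The kind of bookkeeping involved can be sketched roughly as follows.
This is an illustration only, not the patch itself: the struct, the
state names and the function names are all hypothetical, and the caller
is assumed to pass in the current task-clock or cpu-clock value in
nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counter states for this sketch. */
enum sketch_state {
	SKETCH_OFF,		/* disabled */
	SKETCH_INACTIVE,	/* enabled, but not on the hardware */
	SKETCH_ACTIVE,		/* enabled and counting on the hardware */
};

struct sketch_counter {
	enum sketch_state state;
	uint64_t stamp;		/* clock value at the last update */
	uint64_t time_enabled;	/* total ns spent enabled */
	uint64_t time_running;	/* total ns spent counting */
};

/* Fold the time elapsed since the last update into the totals. */
void sketch_update_times(struct sketch_counter *c, uint64_t now)
{
	uint64_t delta;

	/* The clock is assumed never to go backwards. */
	assert(now >= c->stamp);
	delta = now - c->stamp;

	if (c->state != SKETCH_OFF)
		c->time_enabled += delta;
	if (c->state == SKETCH_ACTIVE)
		c->time_running += delta;
	c->stamp = now;
}

/* Account for the old state up to 'now', then switch to the new one. */
void sketch_set_state(struct sketch_counter *c, enum sketch_state next,
		      uint64_t now)
{
	sketch_update_times(c, now);
	c->state = next;
}
```

Each state change costs one subtraction and at most two additions on
fields that sit next to the state being written anyway.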

Paul.