Date: Thu, 14 Sep 2006 20:15:57 +0200
From: Ingo Molnar <mingo@elte.hu>
To: Roman Zippel <zippel@linux-m68k.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
       linux-kernel@vger.kernel.org, Christoph Hellwig <hch@infradead.org>,
       Andrew Morton <akpm@osdl.org>, Ingo Molnar <mingo@redhat.com>,
       Greg Kroah-Hartman <gregkh@suse.de>,
       Thomas Gleixner <tglx@linutronix.de>, Tom Zanussi <zanussi@us.ibm.com>,
       ltt-dev@shafik.org, Michel Dagenais <michel.dagenais@polymtl.ca>
Subject: Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
Message-ID: <20060914181557.GA22469@elte.hu>
References: <20060914033826.GA2194@Krystal> <20060914112718.GA7065@elte.hu> <Pine.LNX.4.64.0609141537120.6762@scrub.home> <20060914135548.GA24393@elte.hu> <Pine.LNX.4.64.0609141623570.6761@scrub.home> <20060914171320.GB1105@elte.hu> <Pine.LNX.4.64.0609141935080.6761@scrub.home>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <Pine.LNX.4.64.0609141935080.6761@scrub.home>
User-Agent: Mutt/1.4.2.1i
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 4732
Lines: 103


* Roman Zippel <zippel@linux-m68k.org> wrote:

> > for me these are all _independent_ grounds for rejection, as a generic 
> > kernel infrastructure.
> 
> Tracepoints of course need to be managed, but that's true for both 
> dynamic and static tracepoints. [...]

that's not true, and this is the important thing that i believe you are
missing. A dynamic tracepoint is _detached_ from the normal source code
and thus has zero maintenance overhead. You don't have to maintain it
during normal development - only if you need it. You don't see the
dynamic tracepoints in the source code.

a static tracepoint, once it's in the mainline kernel, is a nonzero
maintenance overhead _until eternity_. It is a constant visual
hindrance and a constant build-correctness and boot-correctness problem
if you happen to change the code that is being traced by a static
tracepoint. Again, I am speaking from actual experience with static
tracepoints: i frequently break my kernel via static tracepoints and i
have a constant maintenance cost from them. So what i do is that i try to
minimize the number of static tracepoints to _zero_. I.e. i only add
them when i need them for a given bug.

static tracepoints are inferior to dynamic tracepoints in almost every 
way.

> [...]  Both have their advantages and disadvantages and just hammering 
> on the possible problems of static ones [...]

how about giving a line by line rebuttal to the very real problems of 
static tracepoints i listed (twice already), instead of calling them 
"possible problems"?

i am giving a line by line rebuttal of all arguments that come up. 
Please be fair and do the same. Here are the arguments again, for a 
third time. Thanks!

> > also, the other disadvantages i listed very much count too. Static 
> > tracepoints are fundamentally limited because:
> > 
> >   - they can only be added at the source code level
> > 
> >   - modifying them requires a reboot which is not practical in a
> >     production environment
> > 
> >   - there can only be a limited set of them, while many problems need
> >     finegrained tracepoints tailored to the problem at hand
> > 
> >   - conditional tracepoints are typically either nonexistent or very
> >     limited.


> > the kprobes infrastructure, despite being fairly young, is widely 
> > available: powerpc, i386, x86_64, ia64 and sparc64. The other 
> > architectures are free to implement them too, there's nothing 
> > hardware-specific about kprobes and the "porting overhead" is in 
> > essence a one-time cost - while for static tracepoints the 
> > maintenance overhead goes on forever and scales linearly with the
> > number of tracepoints added.
> 
> kprobes are not trivial to implement [...]

nor are smp-alternatives, which were suggested as a solution to reduce
the overhead of static tracepoints. So what's the point? It's a one-off
development overhead that has already been paid for all the major
arches. If another arch needs it they can certainly implement it.

it's like arguing against ptrace on the grounds of: "application 
developers can add printf if they want to debug their apps, or they can 
add static tracepoints too, and besides, ptrace is hard to implement".

> I also think you highly exaggerate the maintenance overhead of static
> tracepoints, once added they hardly need any maintenance, most of the
> time you can just ignore them. [...]

hundreds (or possibly thousands) of tracepoints? Have you ever tried to 
maintain that? I have and it's a nightmare.

Even assuming a rich set of hundreds of static tracepoints, it doesn't
even solve the problems at hand: people want to do much more when they
probe the kernel - and today, with DTrace under Solaris people _know_
that much better tracing _can be done_, and they _demand_ that Linux
adopt an intelligent solution. The clock is ticking for dinosaurs like
static printks and static tracepoints to debug the kernel...

> [...] The kernel is full of debug prints, do you seriously suggest to
> throw them out because of their "high maintenance"?

oh yes, these days i frequently throw them out when i find them in code
i modify. (my most recent such zap was rwsemtrace()). Also, obviously
when most of them were added we didn't have good kernel debugging
infrastructure (in fact we didn't have any kernel debugging
infrastructure besides printk), so _something_ had to be used back then.
But today there's little reason to keep them. Welcome to 2006 :-)

	Ingo