2009-04-14 17:26:58

by Steven Rostedt

[permalink] [raw]
Subject: [PATCH 2/8] tracing: create automated trace defines

From: Steven Rostedt <[email protected]>

This patch lowers the number of places a developer must modify to add
new tracepoints. The current method to add a new tracepoint
into an existing system is to write the trace point macro in the
trace header with one of the macros TRACE_EVENT, TRACE_FORMAT or
DECLARE_TRACE, then they must add the same named item into the C file
with the macro DEFINE_TRACE(name) and then add the trace point.

This change cuts out the needing to add the DEFINE_TRACE(name).
Every file that uses the tracepoint must still include the trace/<type>.h
file, but the one C file must also add a define before the including
of that file.

#define CREATE_TRACE_POINTS
#include <trace/mytrace.h>

This will cause the trace/mytrace.h file to also produce the C code
necessary to implement the trace point.

Note, if more than one trace/<type>.h is used to create the C code
it is best to list them all together.

#define CREATE_TRACE_POINTS
#include <trace/foo.h>
#include <trace/bar.h>
#include <trace/fido.h>

Thanks to Mathieu Desnoyers and Christoph Hellwig for coming up with
the cleaner solution of the define above the includes over my first
design to have the C code include a "special" header.

This patch converts sched, irq and lockdep and skb to use this new
method.

Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Neil Horman <[email protected]>
Cc: Zhao Lei <[email protected]>
Cc: Eduard - Gabriel Munteanu <[email protected]>
Cc: Pekka Enberg <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
---
include/trace/define_trace.h | 75 ++++++++++++++++++++++++++++++++++++++++++
include/trace/irq.h | 5 ++-
include/trace/kmem.h | 4 ++-
include/trace/lockdep.h | 3 ++
include/trace/sched.h | 3 ++
include/trace/skb.h | 3 ++
kernel/exit.c | 4 --
kernel/fork.c | 2 -
kernel/irq/handle.c | 7 ++--
kernel/kthread.c | 3 --
kernel/lockdep.c | 12 ++-----
kernel/sched.c | 10 ++----
kernel/signal.c | 2 -
kernel/softirq.c | 3 --
mm/util.c | 11 ++----
net/core/net-traces.c | 4 +-
16 files changed, 105 insertions(+), 46 deletions(-)
create mode 100644 include/trace/define_trace.h

diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h
new file mode 100644
index 0000000..de9dc7d
--- /dev/null
+++ b/include/trace/define_trace.h
@@ -0,0 +1,75 @@
+/*
+ * Trace files that want to automate creationg of all tracepoints defined
+ * in their file should include this file. The following are macros that the
+ * trace file may define:
+ *
+ * TRACE_SYSTEM defines the system the tracepoint is for
+ *
+ * TRACE_INCLUDE_FILE if the file name is something other than TRACE_SYSTEM.h
+ * This macro may be defined to tell define_trace.h what file to include.
+ * Note, leave off the ".h".
+ *
+ * TRACE_INCLUDE_PATH if the path is something other than core kernel include/trace
+ * then this macro can define the path to use. Note, the path is relative to
+ * define_trace.h, not the file including it. Full path names for out of tree
+ * modules must be used.
+ */
+
+#ifdef CREATE_TRACE_POINTS
+
+/* Prevent recursion */
+#undef CREATE_TRACE_POINTS
+
+#include <linux/stringify.h>
+
+#undef TRACE_EVENT
+#define TRACE_EVENT(name, proto, args, tstruct, assign, print) \
+ DEFINE_TRACE(name)
+
+#undef TRACE_FORMAT
+#define TRACE_FORMAT(name, proto, args, print) \
+ DEFINE_TRACE(name)
+
+#undef DECLARE_TRACE
+#define DECLARE_TRACE(name, proto, args) \
+ DEFINE_TRACE(name)
+
+#undef TRACE_INCLUDE
+#undef __TRACE_INCLUDE
+
+#ifndef TRACE_INCLUDE_FILE
+# define TRACE_INCLUDE_FILE TRACE_SYSTEM
+# define UNDEF_TRACE_INCLUDE_FILE
+#endif
+
+#ifndef TRACE_INCLUDE_PATH
+# define __TRACE_INCLUDE(system) <trace/system.h>
+# define UNDEF_TRACE_INCLUDE_FILE
+#else
+# define __TRACE_INCLUDE(system) __stringify(TRACE_INCLUDE_PATH/system.h)
+#endif
+
+# define TRACE_INCLUDE(system) __TRACE_INCLUDE(system)
+
+/* Let the trace headers be reread */
+#define TRACE_HEADER_MULTI_READ
+
+#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
+
+#undef TRACE_HEADER_MULTI_READ
+
+/* Only undef what we defined in this file */
+#ifdef UNDEF_TRACE_INCLUDE_FILE
+# undef TRACE_INCLUDE_PATH
+# undef UNDEF_TRACE_INCLUDE_FILE
+#endif
+
+#ifdef UNDEF_TRACE_INCLUDE_FILE
+# undef TRACE_INCLUDE_PATH
+# undef UNDEF_TRACE_INCLUDE_FILE
+#endif
+
+/* We may be processing more files */
+#define CREATE_TRACE_POINTS
+
+#endif /* CREATE_TRACE_POINTS */
diff --git a/include/trace/irq.h b/include/trace/irq.h
index 04ab4c6..75e3468 100644
--- a/include/trace/irq.h
+++ b/include/trace/irq.h
@@ -51,4 +51,7 @@ TRACE_FORMAT(softirq_exit,
TP_FMT("softirq=%d action=%s", (int)(h - vec), softirq_to_name[h-vec])
);

-#endif
+#endif /* _TRACE_IRQ_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/trace/kmem.h b/include/trace/kmem.h
index d7d1218..c22c42f 100644
--- a/include/trace/kmem.h
+++ b/include/trace/kmem.h
@@ -188,5 +188,7 @@ TRACE_EVENT(kmem_cache_free,

TP_printk("call_site=%lx ptr=%p", __entry->call_site, __entry->ptr)
);
+#endif /* _TRACE_KMEM_H */

-#endif
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/trace/lockdep.h b/include/trace/lockdep.h
index 8ee7900..4d301e7 100644
--- a/include/trace/lockdep.h
+++ b/include/trace/lockdep.h
@@ -55,3 +55,6 @@ TRACE_EVENT(lock_acquired,
#endif

#endif /* _TRACE_LOCKDEP_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/trace/sched.h b/include/trace/sched.h
index 5b1cf4a..ffa1cab 100644
--- a/include/trace/sched.h
+++ b/include/trace/sched.h
@@ -334,3 +334,6 @@ TRACE_EVENT(sched_signal_send,
);

#endif /* _TRACE_SCHED_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/trace/skb.h b/include/trace/skb.h
index e6fd281..1e8fabb 100644
--- a/include/trace/skb.h
+++ b/include/trace/skb.h
@@ -35,3 +35,6 @@ TRACE_EVENT(kfree_skb,
);

#endif /* _TRACE_SKB_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/kernel/exit.c b/kernel/exit.c
index abf9cf3..2fe9d2c 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -56,10 +56,6 @@
#include <asm/mmu_context.h>
#include "cred-internals.h"

-DEFINE_TRACE(sched_process_free);
-DEFINE_TRACE(sched_process_exit);
-DEFINE_TRACE(sched_process_wait);
-
static void exit_mm(struct task_struct * tsk);

static void __unhash_process(struct task_struct *p)
diff --git a/kernel/fork.c b/kernel/fork.c
index b9e2edd..4bebf26 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -83,8 +83,6 @@ DEFINE_PER_CPU(unsigned long, process_counts) = 0;

__cacheline_aligned DEFINE_RWLOCK(tasklist_lock); /* outer */

-DEFINE_TRACE(sched_process_fork);
-
int nr_processes(void)
{
int cpu;
diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
index d82142b..983d8be 100644
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -17,9 +17,11 @@
#include <linux/kernel_stat.h>
#include <linux/rculist.h>
#include <linux/hash.h>
-#include <trace/irq.h>
#include <linux/bootmem.h>

+#define CREATE_TRACE_POINTS
+#include <trace/irq.h>
+
#include "internals.h"

/*
@@ -348,9 +350,6 @@ static void warn_no_thread(unsigned int irq, struct irqaction *action)
"but no thread function available.", irq, action->name);
}

-DEFINE_TRACE(irq_handler_entry);
-DEFINE_TRACE(irq_handler_exit);
-
/**
* handle_IRQ_event - irq action chain handler
* @irq: the interrupt number
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 4ebaf85..e1c7692 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -21,9 +21,6 @@ static DEFINE_SPINLOCK(kthread_create_lock);
static LIST_HEAD(kthread_create_list);
struct task_struct *kthreadd_task;

-DEFINE_TRACE(sched_kthread_stop);
-DEFINE_TRACE(sched_kthread_stop_ret);
-
struct kthread_create_info
{
/* Information passed to kthread() from kthreadd. */
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index c4582a6..257f21a 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -42,12 +42,14 @@
#include <linux/hash.h>
#include <linux/ftrace.h>
#include <linux/stringify.h>
-#include <trace/lockdep.h>

#include <asm/sections.h>

#include "lockdep_internals.h"

+#define CREATE_TRACE_POINTS
+#include <trace/lockdep.h>
+
#ifdef CONFIG_PROVE_LOCKING
int prove_locking = 1;
module_param(prove_locking, int, 0644);
@@ -2929,8 +2931,6 @@ void lock_set_class(struct lockdep_map *lock, const char *name,
}
EXPORT_SYMBOL_GPL(lock_set_class);

-DEFINE_TRACE(lock_acquire);
-
/*
* We are not always called with irqs disabled - do that here,
* and also avoid lockdep recursion:
@@ -2957,8 +2957,6 @@ void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
}
EXPORT_SYMBOL_GPL(lock_acquire);

-DEFINE_TRACE(lock_release);
-
void lock_release(struct lockdep_map *lock, int nested,
unsigned long ip)
{
@@ -3061,8 +3059,6 @@ found_it:
put_lock_stats(stats);
}

-DEFINE_TRACE(lock_acquired);
-
static void
__lock_acquired(struct lockdep_map *lock, unsigned long ip)
{
@@ -3118,8 +3114,6 @@ found_it:
lock->ip = ip;
}

-DEFINE_TRACE(lock_contended);
-
void lock_contended(struct lockdep_map *lock, unsigned long ip)
{
unsigned long flags;
diff --git a/kernel/sched.c b/kernel/sched.c
index 5724508..e6d4518 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -72,13 +72,15 @@
#include <linux/debugfs.h>
#include <linux/ctype.h>
#include <linux/ftrace.h>
-#include <trace/sched.h>

#include <asm/tlb.h>
#include <asm/irq_regs.h>

#include "sched_cpupri.h"

+#define CREATE_TRACE_POINTS
+#include <trace/sched.h>
+
/*
* Convert user-nice values [ -20 ... 0 ... 19 ]
* to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
@@ -118,12 +120,6 @@
*/
#define RUNTIME_INF ((u64)~0ULL)

-DEFINE_TRACE(sched_wait_task);
-DEFINE_TRACE(sched_wakeup);
-DEFINE_TRACE(sched_wakeup_new);
-DEFINE_TRACE(sched_switch);
-DEFINE_TRACE(sched_migrate_task);
-
#ifdef CONFIG_SMP

static void double_rq_lock(struct rq *rq1, struct rq *rq2);
diff --git a/kernel/signal.c b/kernel/signal.c
index d803473..1d5703f 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -41,8 +41,6 @@

static struct kmem_cache *sigqueue_cachep;

-DEFINE_TRACE(sched_signal_send);
-
static void __user *sig_handler(struct task_struct *t, int sig)
{
return t->sighand->action[sig - 1].sa.sa_handler;
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 2fecefa..a2d9b45 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -186,9 +186,6 @@ EXPORT_SYMBOL(local_bh_enable_ip);
*/
#define MAX_SOFTIRQ_RESTART 10

-DEFINE_TRACE(softirq_entry);
-DEFINE_TRACE(softirq_exit);
-
asmlinkage void __do_softirq(void)
{
struct softirq_action *h;
diff --git a/mm/util.c b/mm/util.c
index 2599e83..0e74a22 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -4,9 +4,11 @@
#include <linux/module.h>
#include <linux/err.h>
#include <linux/sched.h>
-#include <linux/tracepoint.h>
#include <asm/uaccess.h>

+#define CREATE_TRACE_POINTS
+#include <trace/kmem.h>
+
/**
* kstrdup - allocate space for and copy an existing string
* @s: the string to duplicate
@@ -239,13 +241,6 @@ int __attribute__((weak)) get_user_pages_fast(unsigned long start,
EXPORT_SYMBOL_GPL(get_user_pages_fast);

/* Tracepoints definitions. */
-DEFINE_TRACE(kmalloc);
-DEFINE_TRACE(kmem_cache_alloc);
-DEFINE_TRACE(kmalloc_node);
-DEFINE_TRACE(kmem_cache_alloc_node);
-DEFINE_TRACE(kfree);
-DEFINE_TRACE(kmem_cache_free);
-
EXPORT_TRACEPOINT_SYMBOL(kmalloc);
EXPORT_TRACEPOINT_SYMBOL(kmem_cache_alloc);
EXPORT_TRACEPOINT_SYMBOL(kmalloc_node);
diff --git a/net/core/net-traces.c b/net/core/net-traces.c
index c8fb456..8017720 100644
--- a/net/core/net-traces.c
+++ b/net/core/net-traces.c
@@ -19,11 +19,11 @@
#include <linux/workqueue.h>
#include <linux/netlink.h>
#include <linux/net_dropmon.h>
-#include <trace/skb.h>

#include <asm/unaligned.h>
#include <asm/bitops.h>

+#define CREATE_TRACE_POINTS
+#include <trace/skb.h>

-DEFINE_TRACE(kfree_skb);
EXPORT_TRACEPOINT_SYMBOL_GPL(kfree_skb);
--
1.6.2.1

--


2009-04-14 23:44:22

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

How about something like this:

>From f7fa71c046bc383706bf58503c4b75e05e252152 Mon Sep 17 00:00:00 2001
From: Jeremy Fitzhardinge <[email protected]>
Date: Tue, 14 Apr 2009 16:41:18 -0700
Subject: [PATCH] tracing: move __DO_TRACE out of line

Mainly simplify linux/tracepoint.h's include dependencies (removes
rcupdate.h), but it can't help with icache locality, since it
definitely moves the code out of line, rather than relying on gcc
to do it.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 4353f3f..1052e33 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -15,7 +15,6 @@
*/

#include <linux/types.h>
-#include <linux/rcupdate.h>

struct module;
struct tracepoint;
@@ -42,19 +41,20 @@ struct tracepoint {
* it_func[0] is never NULL because there is at least one element in the array
* when the array itself is non NULL.
*/
-#define __DO_TRACE(tp, proto, args) \
- do { \
+#define DEFINE_DO_TRACE(name, proto, args) \
+ void __do_trace_##name(struct tracepoint *tp, TP_PROTO(proto)) \
+ { \
void **it_func; \
\
rcu_read_lock_sched_notrace(); \
- it_func = rcu_dereference((tp)->funcs); \
+ it_func = rcu_dereference(tp->funcs); \
if (it_func) { \
do { \
((void(*)(proto))(*it_func))(args); \
} while (*(++it_func)); \
} \
rcu_read_unlock_sched_notrace(); \
- } while (0)
+ }

/*
* Make sure the alignment of the structure in the __tracepoints section will
@@ -63,11 +63,13 @@ struct tracepoint {
*/
#define DECLARE_TRACE(name, proto, args) \
extern struct tracepoint __tracepoint_##name; \
+ extern void __do_trace_##name(struct tracepoint *tp, \
+ TP_PROTO(proto)); \
static inline void trace_##name(proto) \
{ \
if (unlikely(__tracepoint_##name.state)) \
- __DO_TRACE(&__tracepoint_##name, \
- TP_PROTO(proto), TP_ARGS(args)); \
+ __do_trace_##name(&__tracepoint_##name, \
+ TP_ARGS(args)); \
} \
static inline int register_trace_##name(void (*probe)(proto)) \
{ \
@@ -151,10 +153,7 @@ extern int tracepoint_get_iter_range(struct tracepoint **tracepoint,
* probe unregistration and the end of module exit to make sure there is no
* caller executing a probe when it is freed.
*/
-static inline void tracepoint_synchronize_unregister(void)
-{
- synchronize_sched();
-}
+extern void tracepoint_synchronize_unregister(void);

#define PARAMS(args...) args

diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h
index 980eb66..1a0502e 100644
--- a/include/trace/define_trace.h
+++ b/include/trace/define_trace.h
@@ -24,14 +24,17 @@

#undef TRACE_EVENT
#define TRACE_EVENT(name, proto, args, tstruct, assign, print) \
+ DEFINE_DO_TRACE(name, TP_PROTO(proto), TP_ARGS(args)) \
DEFINE_TRACE(name)

#undef TRACE_FORMAT
-#define TRACE_FORMAT(name, proto, args, print) \
+#define TRACE_FORMAT(name, proto, args, print) \
+ DEFINE_DO_TRACE(name, TP_PROTO(proto), TP_ARGS(args)) \
DEFINE_TRACE(name)

#undef DECLARE_TRACE
-#define DECLARE_TRACE(name, proto, args) \
+#define DECLARE_TRACE(name, proto, args) \
+ DEFINE_DO_TRACE(name, TP_PROTO(proto), TP_ARGS(args)) \
DEFINE_TRACE(name)

#undef TRACE_INCLUDE
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 1ef5d3a..6ac1f48 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -545,6 +545,12 @@ void tracepoint_iter_reset(struct tracepoint_iter *iter)
}
EXPORT_SYMBOL_GPL(tracepoint_iter_reset);

+void tracepoint_synchronize_unregister(void)
+{
+ synchronize_sched();
+}
+EXPORT_SYMBOL_GPL(tracepoint_synchronize_unregister);
+
#ifdef CONFIG_MODULES

int tracepoint_module_notify(struct notifier_block *self,

2009-04-15 01:46:21

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

* Jeremy Fitzhardinge ([email protected]) wrote:
> How about something like this:
>
> From f7fa71c046bc383706bf58503c4b75e05e252152 Mon Sep 17 00:00:00 2001
> From: Jeremy Fitzhardinge <[email protected]>
> Date: Tue, 14 Apr 2009 16:41:18 -0700
> Subject: [PATCH] tracing: move __DO_TRACE out of line
>
> Mainly simplify linux/tracepoint.h's include dependencies (removes
> rcupdate.h), but it can't help with icache locality, since it
> definitely moves the code out of line, rather than relying on gcc
> to do it.
>
> Signed-off-by: Jeremy Fitzhardinge <[email protected]>
>
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index 4353f3f..1052e33 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -15,7 +15,6 @@
> */
>
> #include <linux/types.h>
> -#include <linux/rcupdate.h>
>
> struct module;
> struct tracepoint;
> @@ -42,19 +41,20 @@ struct tracepoint {
> * it_func[0] is never NULL because there is at least one element in the array
> * when the array itself is non NULL.
> */
> -#define __DO_TRACE(tp, proto, args) \
> - do { \
> +#define DEFINE_DO_TRACE(name, proto, args) \
> + void __do_trace_##name(struct tracepoint *tp, TP_PROTO(proto)) \

I fear that won't work with "void" prototype. If we need this kind of
flexibility, we will need to create a special case for empty prototype.

Mathieu

> + { \
> void **it_func; \
> \
> rcu_read_lock_sched_notrace(); \
> - it_func = rcu_dereference((tp)->funcs); \
> + it_func = rcu_dereference(tp->funcs); \
> if (it_func) { \
> do { \
> ((void(*)(proto))(*it_func))(args); \
> } while (*(++it_func)); \
> } \
> rcu_read_unlock_sched_notrace(); \
> - } while (0)
> + }
>
> /*
> * Make sure the alignment of the structure in the __tracepoints section will
> @@ -63,11 +63,13 @@ struct tracepoint {
> */
> #define DECLARE_TRACE(name, proto, args) \
> extern struct tracepoint __tracepoint_##name; \
> + extern void __do_trace_##name(struct tracepoint *tp, \
> + TP_PROTO(proto)); \
> static inline void trace_##name(proto) \
> { \
> if (unlikely(__tracepoint_##name.state)) \
> - __DO_TRACE(&__tracepoint_##name, \
> - TP_PROTO(proto), TP_ARGS(args)); \
> + __do_trace_##name(&__tracepoint_##name, \
> + TP_ARGS(args)); \
> } \
> static inline int register_trace_##name(void (*probe)(proto)) \
> { \
> @@ -151,10 +153,7 @@ extern int tracepoint_get_iter_range(struct tracepoint **tracepoint,
> * probe unregistration and the end of module exit to make sure there is no
> * caller executing a probe when it is freed.
> */
> -static inline void tracepoint_synchronize_unregister(void)
> -{
> - synchronize_sched();
> -}
> +extern void tracepoint_synchronize_unregister(void);
>
> #define PARAMS(args...) args
>
> diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h
> index 980eb66..1a0502e 100644
> --- a/include/trace/define_trace.h
> +++ b/include/trace/define_trace.h
> @@ -24,14 +24,17 @@
>
> #undef TRACE_EVENT
> #define TRACE_EVENT(name, proto, args, tstruct, assign, print) \
> + DEFINE_DO_TRACE(name, TP_PROTO(proto), TP_ARGS(args)) \
> DEFINE_TRACE(name)
>
> #undef TRACE_FORMAT
> -#define TRACE_FORMAT(name, proto, args, print) \
> +#define TRACE_FORMAT(name, proto, args, print) \
> + DEFINE_DO_TRACE(name, TP_PROTO(proto), TP_ARGS(args)) \
> DEFINE_TRACE(name)
>
> #undef DECLARE_TRACE
> -#define DECLARE_TRACE(name, proto, args) \
> +#define DECLARE_TRACE(name, proto, args) \
> + DEFINE_DO_TRACE(name, TP_PROTO(proto), TP_ARGS(args)) \
> DEFINE_TRACE(name)
>
> #undef TRACE_INCLUDE
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index 1ef5d3a..6ac1f48 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -545,6 +545,12 @@ void tracepoint_iter_reset(struct tracepoint_iter *iter)
> }
> EXPORT_SYMBOL_GPL(tracepoint_iter_reset);
>
> +void tracepoint_synchronize_unregister(void)
> +{
> + synchronize_sched();
> +}
> +EXPORT_SYMBOL_GPL(tracepoint_synchronize_unregister);
> +
> #ifdef CONFIG_MODULES
>
> int tracepoint_module_notify(struct notifier_block *self,
>
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-04-15 07:05:01

by Zhao Lei

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

Steven Rostedt wrote:
> From: Steven Rostedt <[email protected]>
>
> This patch lowers the number of places a developer must modify to add
> new tracepoints. The current method to add a new tracepoint
> into an existing system is to write the trace point macro in the
> trace header with one of the macros TRACE_EVENT, TRACE_FORMAT or
> DECLARE_TRACE, then they must add the same named item into the C file
> with the macro DEFINE_TRACE(name) and then add the trace point.
>
> This change cuts out the needing to add the DEFINE_TRACE(name).
> Every file that uses the tracepoint must still include the trace/<type>.h
> file, but the one C file must also add a define before the including
> of that file.
>
> #define CREATE_TRACE_POINTS
> #include <trace/mytrace.h>
>
> This will cause the trace/mytrace.h file to also produce the C code
> necessary to implement the trace point.
>
> Note, if more than one trace/<type>.h is used to create the C code
> it is best to list them all together.
>
> #define CREATE_TRACE_POINTS
> #include <trace/foo.h>
> #include <trace/bar.h>
> #include <trace/fido.h>
>
> Thanks to Mathieu Desnoyers and Christoph Hellwig for coming up with
> the cleaner solution of the define above the includes over my first
> design to have the C code include a "special" header.
>
> This patch converts sched, irq and lockdep and skb to use this new
> method.
>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Neil Horman <[email protected]>
> Cc: Zhao Lei <[email protected]>
> Cc: Eduard - Gabriel Munteanu <[email protected]>
> Cc: Pekka Enberg <[email protected]>
> Signed-off-by: Steven Rostedt <[email protected]>
> ---
> include/trace/define_trace.h | 75 ++++++++++++++++++++++++++++++++++++++++++
> include/trace/irq.h | 5 ++-
> include/trace/kmem.h | 4 ++-
> include/trace/lockdep.h | 3 ++
> include/trace/sched.h | 3 ++
> include/trace/skb.h | 3 ++
> kernel/exit.c | 4 --
> kernel/fork.c | 2 -
> kernel/irq/handle.c | 7 ++--
> kernel/kthread.c | 3 --
> kernel/lockdep.c | 12 ++-----
> kernel/sched.c | 10 ++----
> kernel/signal.c | 2 -
> kernel/softirq.c | 3 --
> mm/util.c | 11 ++----
> net/core/net-traces.c | 4 +-
> 16 files changed, 105 insertions(+), 46 deletions(-)
> create mode 100644 include/trace/define_trace.h
>
> diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h
> new file mode 100644
> index 0000000..de9dc7d
> --- /dev/null
> +++ b/include/trace/define_trace.h
> @@ -0,0 +1,75 @@
> +/*
> + * Trace files that want to automate creationg of all tracepoints defined
> + * in their file should include this file. The following are macros that the
> + * trace file may define:
> + *
> + * TRACE_SYSTEM defines the system the tracepoint is for
> + *
> + * TRACE_INCLUDE_FILE if the file name is something other than TRACE_SYSTEM.h
> + * This macro may be defined to tell define_trace.h what file to include.
> + * Note, leave off the ".h".
> + *
> + * TRACE_INCLUDE_PATH if the path is something other than core kernel include/trace
> + * then this macro can define the path to use. Note, the path is relative to
> + * define_trace.h, not the file including it. Full path names for out of tree
> + * modules must be used.
> + */
> +
> +#ifdef CREATE_TRACE_POINTS
> +
> +/* Prevent recursion */
> +#undef CREATE_TRACE_POINTS
> +
> +#include <linux/stringify.h>
> +
> +#undef TRACE_EVENT
> +#define TRACE_EVENT(name, proto, args, tstruct, assign, print) \
> + DEFINE_TRACE(name)
> +
> +#undef TRACE_FORMAT
> +#define TRACE_FORMAT(name, proto, args, print) \
> + DEFINE_TRACE(name)
> +
> +#undef DECLARE_TRACE
> +#define DECLARE_TRACE(name, proto, args) \
> + DEFINE_TRACE(name)
> +
> +#undef TRACE_INCLUDE
> +#undef __TRACE_INCLUDE
> +
> +#ifndef TRACE_INCLUDE_FILE
> +# define TRACE_INCLUDE_FILE TRACE_SYSTEM
> +# define UNDEF_TRACE_INCLUDE_FILE
> +#endif
> +
> +#ifndef TRACE_INCLUDE_PATH
> +# define __TRACE_INCLUDE(system) <trace/system.h>
> +# define UNDEF_TRACE_INCLUDE_FILE
> +#else
> +# define __TRACE_INCLUDE(system) __stringify(TRACE_INCLUDE_PATH/system.h)
> +#endif
> +
> +# define TRACE_INCLUDE(system) __TRACE_INCLUDE(system)
> +
> +/* Let the trace headers be reread */
> +#define TRACE_HEADER_MULTI_READ
> +
> +#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
Hello, Steven

Include header file again is a power trap full of imagination.
It is great... but maybe we can choose a stupid way if we have.
And, TRACE_SYSTEM must be same as header file name if we use
current way.

How about just
#define TRACE_EVENT(name, proto, args, fmt) \
DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) \
DEFINE_TRACE(name)
in tracepoint.h?

It means we always define both whether for tracepoint provider or for
tracepoint user.(and make compiler a bit slow)

Thanks
Zhaolei

> +
> +#undef TRACE_HEADER_MULTI_READ
> +
> +/* Only undef what we defined in this file */
> +#ifdef UNDEF_TRACE_INCLUDE_FILE
> +# undef TRACE_INCLUDE_PATH
> +# undef UNDEF_TRACE_INCLUDE_FILE
> +#endif
> +
> +#ifdef UNDEF_TRACE_INCLUDE_FILE
> +# undef TRACE_INCLUDE_PATH
> +# undef UNDEF_TRACE_INCLUDE_FILE
> +#endif
> +
> +/* We may be processing more files */
> +#define CREATE_TRACE_POINTS
> +
> +#endif /* CREATE_TRACE_POINTS */
> diff --git a/include/trace/irq.h b/include/trace/irq.h
> index 04ab4c6..75e3468 100644
> --- a/include/trace/irq.h
> +++ b/include/trace/irq.h
> @@ -51,4 +51,7 @@ TRACE_FORMAT(softirq_exit,
> TP_FMT("softirq=%d action=%s", (int)(h - vec), softirq_to_name[h-vec])
> );
>
> -#endif
> +#endif /* _TRACE_IRQ_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/include/trace/kmem.h b/include/trace/kmem.h
> index d7d1218..c22c42f 100644
> --- a/include/trace/kmem.h
> +++ b/include/trace/kmem.h
> @@ -188,5 +188,7 @@ TRACE_EVENT(kmem_cache_free,
>
> TP_printk("call_site=%lx ptr=%p", __entry->call_site, __entry->ptr)
> );
> +#endif /* _TRACE_KMEM_H */
>
> -#endif
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/include/trace/lockdep.h b/include/trace/lockdep.h
> index 8ee7900..4d301e7 100644
> --- a/include/trace/lockdep.h
> +++ b/include/trace/lockdep.h
> @@ -55,3 +55,6 @@ TRACE_EVENT(lock_acquired,
> #endif
>
> #endif /* _TRACE_LOCKDEP_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/include/trace/sched.h b/include/trace/sched.h
> index 5b1cf4a..ffa1cab 100644
> --- a/include/trace/sched.h
> +++ b/include/trace/sched.h
> @@ -334,3 +334,6 @@ TRACE_EVENT(sched_signal_send,
> );
>
> #endif /* _TRACE_SCHED_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/include/trace/skb.h b/include/trace/skb.h
> index e6fd281..1e8fabb 100644
> --- a/include/trace/skb.h
> +++ b/include/trace/skb.h
> @@ -35,3 +35,6 @@ TRACE_EVENT(kfree_skb,
> );
>
> #endif /* _TRACE_SKB_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index abf9cf3..2fe9d2c 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -56,10 +56,6 @@
> #include <asm/mmu_context.h>
> #include "cred-internals.h"
>
> -DEFINE_TRACE(sched_process_free);
> -DEFINE_TRACE(sched_process_exit);
> -DEFINE_TRACE(sched_process_wait);
> -
> static void exit_mm(struct task_struct * tsk);
>
> static void __unhash_process(struct task_struct *p)
> diff --git a/kernel/fork.c b/kernel/fork.c
> index b9e2edd..4bebf26 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -83,8 +83,6 @@ DEFINE_PER_CPU(unsigned long, process_counts) = 0;
>
> __cacheline_aligned DEFINE_RWLOCK(tasklist_lock); /* outer */
>
> -DEFINE_TRACE(sched_process_fork);
> -
> int nr_processes(void)
> {
> int cpu;
> diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
> index d82142b..983d8be 100644
> --- a/kernel/irq/handle.c
> +++ b/kernel/irq/handle.c
> @@ -17,9 +17,11 @@
> #include <linux/kernel_stat.h>
> #include <linux/rculist.h>
> #include <linux/hash.h>
> -#include <trace/irq.h>
> #include <linux/bootmem.h>
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/irq.h>
> +
> #include "internals.h"
>
> /*
> @@ -348,9 +350,6 @@ static void warn_no_thread(unsigned int irq, struct irqaction *action)
> "but no thread function available.", irq, action->name);
> }
>
> -DEFINE_TRACE(irq_handler_entry);
> -DEFINE_TRACE(irq_handler_exit);
> -
> /**
> * handle_IRQ_event - irq action chain handler
> * @irq: the interrupt number
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index 4ebaf85..e1c7692 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -21,9 +21,6 @@ static DEFINE_SPINLOCK(kthread_create_lock);
> static LIST_HEAD(kthread_create_list);
> struct task_struct *kthreadd_task;
>
> -DEFINE_TRACE(sched_kthread_stop);
> -DEFINE_TRACE(sched_kthread_stop_ret);
> -
> struct kthread_create_info
> {
> /* Information passed to kthread() from kthreadd. */
> diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> index c4582a6..257f21a 100644
> --- a/kernel/lockdep.c
> +++ b/kernel/lockdep.c
> @@ -42,12 +42,14 @@
> #include <linux/hash.h>
> #include <linux/ftrace.h>
> #include <linux/stringify.h>
> -#include <trace/lockdep.h>
>
> #include <asm/sections.h>
>
> #include "lockdep_internals.h"
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/lockdep.h>
> +
> #ifdef CONFIG_PROVE_LOCKING
> int prove_locking = 1;
> module_param(prove_locking, int, 0644);
> @@ -2929,8 +2931,6 @@ void lock_set_class(struct lockdep_map *lock, const char *name,
> }
> EXPORT_SYMBOL_GPL(lock_set_class);
>
> -DEFINE_TRACE(lock_acquire);
> -
> /*
> * We are not always called with irqs disabled - do that here,
> * and also avoid lockdep recursion:
> @@ -2957,8 +2957,6 @@ void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
> }
> EXPORT_SYMBOL_GPL(lock_acquire);
>
> -DEFINE_TRACE(lock_release);
> -
> void lock_release(struct lockdep_map *lock, int nested,
> unsigned long ip)
> {
> @@ -3061,8 +3059,6 @@ found_it:
> put_lock_stats(stats);
> }
>
> -DEFINE_TRACE(lock_acquired);
> -
> static void
> __lock_acquired(struct lockdep_map *lock, unsigned long ip)
> {
> @@ -3118,8 +3114,6 @@ found_it:
> lock->ip = ip;
> }
>
> -DEFINE_TRACE(lock_contended);
> -
> void lock_contended(struct lockdep_map *lock, unsigned long ip)
> {
> unsigned long flags;
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 5724508..e6d4518 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -72,13 +72,15 @@
> #include <linux/debugfs.h>
> #include <linux/ctype.h>
> #include <linux/ftrace.h>
> -#include <trace/sched.h>
>
> #include <asm/tlb.h>
> #include <asm/irq_regs.h>
>
> #include "sched_cpupri.h"
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/sched.h>
> +
> /*
> * Convert user-nice values [ -20 ... 0 ... 19 ]
> * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
> @@ -118,12 +120,6 @@
> */
> #define RUNTIME_INF ((u64)~0ULL)
>
> -DEFINE_TRACE(sched_wait_task);
> -DEFINE_TRACE(sched_wakeup);
> -DEFINE_TRACE(sched_wakeup_new);
> -DEFINE_TRACE(sched_switch);
> -DEFINE_TRACE(sched_migrate_task);
> -
> #ifdef CONFIG_SMP
>
> static void double_rq_lock(struct rq *rq1, struct rq *rq2);
> diff --git a/kernel/signal.c b/kernel/signal.c
> index d803473..1d5703f 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -41,8 +41,6 @@
>
> static struct kmem_cache *sigqueue_cachep;
>
> -DEFINE_TRACE(sched_signal_send);
> -
> static void __user *sig_handler(struct task_struct *t, int sig)
> {
> return t->sighand->action[sig - 1].sa.sa_handler;
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 2fecefa..a2d9b45 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -186,9 +186,6 @@ EXPORT_SYMBOL(local_bh_enable_ip);
> */
> #define MAX_SOFTIRQ_RESTART 10
>
> -DEFINE_TRACE(softirq_entry);
> -DEFINE_TRACE(softirq_exit);
> -
> asmlinkage void __do_softirq(void)
> {
> struct softirq_action *h;
> diff --git a/mm/util.c b/mm/util.c
> index 2599e83..0e74a22 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -4,9 +4,11 @@
> #include <linux/module.h>
> #include <linux/err.h>
> #include <linux/sched.h>
> -#include <linux/tracepoint.h>
> #include <asm/uaccess.h>
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/kmem.h>
> +
> /**
> * kstrdup - allocate space for and copy an existing string
> * @s: the string to duplicate
> @@ -239,13 +241,6 @@ int __attribute__((weak)) get_user_pages_fast(unsigned long start,
> EXPORT_SYMBOL_GPL(get_user_pages_fast);
>
> /* Tracepoints definitions. */
> -DEFINE_TRACE(kmalloc);
> -DEFINE_TRACE(kmem_cache_alloc);
> -DEFINE_TRACE(kmalloc_node);
> -DEFINE_TRACE(kmem_cache_alloc_node);
> -DEFINE_TRACE(kfree);
> -DEFINE_TRACE(kmem_cache_free);
> -
> EXPORT_TRACEPOINT_SYMBOL(kmalloc);
> EXPORT_TRACEPOINT_SYMBOL(kmem_cache_alloc);
> EXPORT_TRACEPOINT_SYMBOL(kmalloc_node);
> diff --git a/net/core/net-traces.c b/net/core/net-traces.c
> index c8fb456..8017720 100644
> --- a/net/core/net-traces.c
> +++ b/net/core/net-traces.c
> @@ -19,11 +19,11 @@
> #include <linux/workqueue.h>
> #include <linux/netlink.h>
> #include <linux/net_dropmon.h>
> -#include <trace/skb.h>
>
> #include <asm/unaligned.h>
> #include <asm/bitops.h>
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/skb.h>
>
> -DEFINE_TRACE(kfree_skb);
> EXPORT_TRACEPOINT_SYMBOL_GPL(kfree_skb);

2009-04-15 16:08:34

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

Mathieu Desnoyers wrote:
> * Jeremy Fitzhardinge ([email protected]) wrote:
>> -#define __DO_TRACE(tp, proto, args) \
>> - do { \
>> +#define DEFINE_DO_TRACE(name, proto, args) \
>> + void __do_trace_##name(struct tracepoint *tp, TP_PROTO(proto)) \
>>
>
> I fear that won't work with "void" prototype. If we need this kind of
> flexibility, we will need to create a special case for empty prototype.
>

Yes, that has been a bit awkward. I couldn't find a way to create a
no-param tracepoint, and so ended up passing a dummy arg. Stupid C syntax.

On the other hand, I can get something that actually compiles this way...

J

2009-04-16 02:40:28

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

* Jeremy Fitzhardinge ([email protected]) wrote:
> Mathieu Desnoyers wrote:
>> * Jeremy Fitzhardinge ([email protected]) wrote:
>>> -#define __DO_TRACE(tp, proto, args) \
>>> - do { \
>>> +#define DEFINE_DO_TRACE(name, proto, args) \
>>> + void __do_trace_##name(struct tracepoint *tp, TP_PROTO(proto)) \
>>>
>>
>> I fear that won't work with "void" prototype. If we need this kind of
>> flexibility, we will need to create a special case for empty prototype.
>>
>
> Yes, that has been a bit awkward. I couldn't find a way to create a
> no-param tracepoint, and so ended up passing a dummy arg. Stupid C
> syntax.
>
> On the other hand, I can get something that actually compiles this way...
>
> J

Is your only problem the fact that tracepoints include rcupdate.h ? This
can easily be solved by moving rcu_read_(un)lock_sched_notrace to a
rcu-update-<insert meaningful name here> and include this header in
rcupdate.h and tracepoint.h.

We could keep the indirection layer you proposed for synchronize_sched()
though, even if it adds an unnecessary function call. It's a slow path
anyway.

If by doing these modifications we succeed in keeping the "void"
parameters working _and_ make your stuff to compile, I think we would
have done something great. :-)

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-04-16 02:57:15

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

Mathieu Desnoyers wrote:
> Is your only problem the fact that tracepoints include rcupdate.h ?

No. That was the first roadblock, which caused massive cyclic
dependencies between includes and consequent failure to define
everything required. I solved that by pushing __DO_TRACE out of line.
Everything since then is a separate issue.

> This
> can easily be solved by moving rcu_read_(un)lock_sched_notrace to a
> rcu-update-<insert meaningful name here> and include this header in
> rcupdate.h and tracepoint.h.
>
I suppose, but I think pushing __do_trace_##name out of line is cleaner
anyway. And I think it's very important that tracepoint.h have a
*absolutely minimal* #include set, so that it can be safely included in
as many contexts as possible. asm/paravirt.h is complex enough as it
is, and I really don't want tracepoint bringing in any extra headers at
all. linux/types.h is about the only acceptable one.

> If by doing these modifications we succeed in keeping the "void"
> parameters working _and_ make your stuff to compile, I think we would
> have done something great. :-)
>

The void issue is irritating, but relatively minor compared to the
rest. If everything else gets solved except for the need to pass a
dummy param to no-arg tracepoints, then I think it'll be a generally
useful facility.

J

2009-04-16 23:44:24

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

* Jeremy Fitzhardinge ([email protected]) wrote:
> Mathieu Desnoyers wrote:
>> Is your only problem the fact that tracepoints include rcupdate.h ?
>
> No. That was the first roadblock, which caused massive cyclic
> dependencies between includes and consequent failure to define
> everything required. I solved that by pushing __DO_TRACE out of line.
> Everything since then is a separate issue.
>
>> This
>> can easily be solved by moving rcu_read_(un)lock_sched_notrace to a
>> rcu-update-<insert meaningful name here> and include this header in
>> rcupdate.h and tracepoint.h.
>>
> I suppose, but I think pushing __do_trace_##name out of line is cleaner
> anyway. And I think it's very important that tracepoint.h have a
> *absolutely minimal* #include set, so that it can be safely included in
> as many contexts as possible. asm/paravirt.h is complex enough as it
> is, and I really don't want tracepoint bringing in any extra headers at
> all. linux/types.h is about the only acceptable one.
>
>> If by doing these modifications we succeed in keeping the "void"
>> parameters working _and_ make your stuff to compile, I think we would
>> have done something great. :-)
>>
>
> The void issue is irritating, but relatively minor compared to the rest.
> If everything else gets solved except for the need to pass a dummy param
> to no-arg tracepoints, then I think it'll be a generally useful facility.
>

The other point I dislike about the out-of-line approach is that the
tracer will suffer from a pointless supplementary function call at each
event. Given how slow function calls are, at least on x86, I'd prefer
leaving the handler call chain inline, unless there is a very strong
reason not to do so.

I'll come up with a patch that leaves the tracepoint inline, but fixes
the header dependency.

Mathieu

> J
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-04-17 00:03:37

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

Mathieu Desnoyers wrote:
> The other point I dislike about the out-of-line approach is that the
> tracer will suffer from a pointless supplementary function call at each
> event. Given how slow function calls are, at least on x86, I'd prefer
> leaving the handler call chain inline, unless there is a very strong
> reason not to do so.
>

Are they? They're generally considered to be "free", because the call
and return are predicted 100% accurately.

> I'll come up with a patch that leaves the tracepoint inline, but fixes
> the header dependency.
>
OK.

J

2009-04-17 00:13:42

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

* Jeremy Fitzhardinge ([email protected]) wrote:
> Mathieu Desnoyers wrote:
>> The other point I dislike about the out-of-line approach is that the
>> tracer will suffer from a pointless supplementary function call at each
>> event. Given how slow function calls are, at least on x86, I'd prefer
>> leaving the handler call chain inline, unless there is a very strong
>> reason not to do so.
>>
>
> Are they? They're generally considered to be "free", because the call
> and return are predicted 100% accurately.
>

Adding a simple function call within the tracer fast path, in LTTng, has
a very measurable performance impact on the tbench workload. This is why
I don't use any function call-based trace clocks in LTTng, but rather my
own inline trace clock.

Mathieu

>> I'll come up with a patch that leaves the tracepoint inline, but fixes
>> the header dependency.
>>
> OK.
>
> J
>




--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-04-17 00:18:52

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

Mathieu Desnoyers wrote:
>> Are they? They're generally considered to be "free", because the call
>> and return are predicted 100% accurately.
>>
>>
>
> Adding a simple function call within the tracer fast path, in LTTng, has
> a very measurable performance impact on the tbench workload. This is why
> I don't use any function call-based trace clocks in LTTng, but rather my
> own inline trace clock.

I'm a bit concerned about all the code that tracing puts inline though.
It seems it would put quite a lot of icache overhead on the codepath
when the tracepoint is disabled, not least because its duplicated in
every instance of the tracepoint. And if the compiler decides to put
the unlikely() branch code out of line, then that's the same as making
it a function call (except that if it is a function call, all the
tracepoints will share the same code, and get a higher likelihood of
getting icache hits).


J

2009-04-17 00:28:44

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

* Jeremy Fitzhardinge ([email protected]) wrote:
> Mathieu Desnoyers wrote:
>>> Are they? They're generally considered to be "free", because the
>>> call and return are predicted 100% accurately.
>>>
>>>
>>
>> Adding a simple function call within the tracer fast path, in LTTng, has
>> a very measurable performance impact on the tbench workload. This is why
>> I don't use any function call-based trace clocks in LTTng, but rather my
>> own inline trace clock.
>
> I'm a bit concerned about all the code that tracing puts inline though.
> It seems it would put quite a lot of icache overhead on the codepath
> when the tracepoint is disabled, not least because its duplicated in
> every instance of the tracepoint. And if the compiler decides to put
> the unlikely() branch code out of line, then that's the same as making
> it a function call (except that if it is a function call, all the
> tracepoints will share the same code, and get a higher likelihood of
> getting icache hits).
>

"all this code" is actually :

rcu_read_lock_sched_notrace(); \
it_func = rcu_dereference((tp)->funcs); \
if (it_func) { \
do { \
((void(*)(proto))(*it_func))(args); \
} while (*(++it_func)); \
} \
rcu_read_unlock_sched_notrace(); \

Which does nothing more than disabling preemption and a for loop to
call all the tracepoint handlers. I don't see the big win in laying out
the stack to call this code out-of-line; we would just remove the
preempt disable and the loop, which are minimal compared to most
call stacks.

So basically, tracepoints are already just doing a function call, with a
few more bytes for preempt disable and multiple handler support.

About the compiler deciding to put the unlikely branch out-of-line, I've
never seen any function calls generated just for the sake of saving
those few bytes, that would be crazy of the part of the compiler.
However, it can (and should) freely put the stack setup in the coldest
cache-lines possible, which are reachable by a near jump.

Mathieu

>
> J
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-04-17 00:47:35

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

Mathieu Desnoyers wrote:
> "all this code" is actually :
>
> rcu_read_lock_sched_notrace(); \
> it_func = rcu_dereference((tp)->funcs); \
> if (it_func) { \
> do { \
> ((void(*)(proto))(*it_func))(args); \
> } while (*(++it_func)); \
> } \
> rcu_read_unlock_sched_notrace(); \
>
> Which does nothing more than disabling preemption and a for loop to
> call all the tracepoint handlers. I don't see the big win in laying out
> the stack to call this code out-of-line; we would just remove the
> preempt disable and the loop, which are minimal compared to most
> call stacks.
>

Well, look at it from my perspective: Ingo has been repeatedly beating
me up for the overhead pvops adds to a native kernel, where it really is
just a (direct) function call. I want to instrument each pvop site with
a tracepoint so I can actually work out which calls are being called how
frequently to look for new optimisation opportunities.

I would guess the tracepoint code sequence is going to increase the
impact of each pvop call site by a fair bit, and that's not counting the
effects the extra register pressure will have. That's a pile of code to
add.

And frankly, that's fine by me, because I would expect this degree of
introspection to have some performance hit. But it does make the need
for per-subsystem tracing Kconfig entries fairly important, because I
don't think this would be acceptable to ship in a non-debug-everything
kernel build, even though other tracepoints might be.

> So basically, tracepoints are already just doing a function call, with a
> few more bytes for preempt disable and multiple handler support.
>
> About the compiler deciding to put the unlikely branch out-of-line, I've
> never seen any function calls generated just for the sake of saving
> those few bytes, that would be crazy of the part of the compiler.
> However, it can (and should) freely put the stack setup in the coldest
> cache-lines possible, which are reachable by a near jump.
>

No, it wouldn't generate a call. But if its going to put the code out
of line into cold cache-lines, then it may as well generate a call.

Anyway, the important point from my perspective is that tracepoint.h
have no #include dependencies beyond linux/types.h (compiler.h, etc).

J

2009-04-17 03:05:42

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH] tracepoints : let subsystem nop-out the tracepoints at build time

* Jeremy Fitzhardinge ([email protected]) wrote:
> Mathieu Desnoyers wrote:
>> "all this code" is actually :
>>
>> rcu_read_lock_sched_notrace(); \
>> it_func = rcu_dereference((tp)->funcs); \
>> if (it_func) { \
>> do { \
>> ((void(*)(proto))(*it_func))(args); \
>> } while (*(++it_func)); \
>> } \
>> rcu_read_unlock_sched_notrace(); \
>>
>> Which does nothing more than disabling preemption and a for loop to
>> call all the tracepoint handlers. I don't see the big win in laying out
>> the stack to call this code out-of-line; we would just remove the
>> preempt disable and the loop, which are minimal compared to most
>> call stacks.
>>
>
> Well, look at it from my perspective: Ingo has been repeatedly beating
> me up for the overhead pvops adds to a native kernel, where it really is
> just a (direct) function call. I want to instrument each pvop site with
> a tracepoint so I can actually work out which calls are being called how
> frequently to look for new optimisation opportunities.
>
> I would guess the tracepoint code sequence is going to increase the
> impact of each pvop call site by a fair bit, and that's not counting the
> effects the extra register pressure will have. That's a pile of code to
> add.
>
> And frankly, that's fine by me, because I would expect this degree of
> introspection to have some performance hit. But it does make the need
> for per-subsystem tracing Kconfig entries fairly important, because I
> don't think this would be acceptable to ship in a non-debug-everything
> kernel build, even though other tracepoints might be.
>

Agreed. Tracepoints might change the code surrounding the pvops in a
similar fashion as the pvops themselves would change the code.
Therefore, it makes sense to have a Kconfig option to enable the pvops
tracepoints.

In terms of tracepoints (with the DECLARE_TRACE/DEFINE_TRACE semantic),
we could have something like :

in include/trace/pvops.h :

#include <linux/tracepoint.h>

#ifdef CONFIG_PVOPS_TRACEPOINTS

#define DECLARE_PVOPS_TRACE DECLARE_TRACE
#define DEFINE_PVOPS_TRACE DEFINE_TRACE
#define EXPORT_PVOPS_TRACEPOINT_SYMBOL_GPL EXPORT_TRACEPOINT_SYMBOL_GPL
#define EXPORT_PVOPS_TRACEPOINT_SYMBOL EXPORT_TRACEPOINT_SYMBOL

#else /* !CONFIG_PVOPS_TRACEPOINTS */

#define DECLARE_PVOPS_TRACE DECLARE_TRACE_NOP
#define DEFINE_PVOPS_TRACE DEFINE_TRACE_NOP
#define EXPORT_PVOPS_TRACEPOINT_SYMBOL_GPL EXPORT_TRACEPOINT_SYMBOL_GPL_NOP
#define EXPORT_PVOPS_TRACEPOINT_SYMBOL EXPORT_TRACEPOINT_SYMBOL_NOP

#endif /* CONFIG_PVOPS_TRACEPOINTS */

And then do the declarations/definitions using the new

DECLARE_PVOPS_TRACE / DEFINE_PVOPS_TRACE.

For that you'll need the patch I am attaching below. I'll let Steven
figure out how to tweak TRACE_EVENT() to support this new tracepoint
feature.

>> So basically, tracepoints are already just doing a function call, with a
>> few more bytes for preempt disable and multiple handler support.
>>
>> About the compiler deciding to put the unlikely branch out-of-line, I've
>> never seen any function calls generated just for the sake of saving
>> those few bytes, that would be crazy of the part of the compiler.
>> However, it can (and should) freely put the stack setup in the coldest
>> cache-lines possible, which are reachable by a near jump.
>>
>
> No, it wouldn't generate a call. But if its going to put the code out
> of line into cold cache-lines, then it may as well generate a call.
>

Jumping out-of-line was somewhat faster than calling a function if I
recall well my performance tests. But that's all been done long ago.

And note that whenever the tracer becomes active, the out-of-line code
of busy tracepoints becomes cache-hot, which means that there is no more
cache line fetch to perform, which leaves the stack setup and other
overhead of function call/return vs 2*jump very measurable.

> Anyway, the important point from my perspective is that tracepoint.h
> have no #include dependencies beyond linux/types.h (compiler.h, etc).
>

Is preempt.h a problem ?

Here is the patch.

Mathieu


tracepoints : let subsystem nop-out the tracepoints at build time

Signed-off-by: Mathieu Desnoyers <[email protected]>
CC: Jeremy Fitzhardinge <[email protected]>
CC: "Paul E. McKenney" <[email protected]>
CC: Steven Rostedt <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Andrew Morton <[email protected]>
CC: Christoph Hellwig <[email protected]>
---
include/linux/tracepoint.h | 38 ++++++++++++++++++++++----------------
1 file changed, 22 insertions(+), 16 deletions(-)

Index: linux.trees.git/include/linux/tracepoint.h
===================================================================
--- linux.trees.git.orig/include/linux/tracepoint.h 2009-04-16 22:40:26.000000000 -0400
+++ linux.trees.git/include/linux/tracepoint.h 2009-04-16 22:40:33.000000000 -0400
@@ -37,6 +37,24 @@ struct tracepoint {
#define TP_PROTO(args...) args
#define TP_ARGS(args...) args

+#define DECLARE_TRACE_NOP(name, proto, args) \
+ static inline void _do_trace_##name(struct tracepoint *tp, proto) \
+ { } \
+ static inline void trace_##name(proto) \
+ { } \
+ static inline int register_trace_##name(void (*probe)(proto)) \
+ { \
+ return -ENOSYS; \
+ } \
+ static inline int unregister_trace_##name(void (*probe)(proto)) \
+ { \
+ return -ENOSYS; \
+ }
+
+#define DEFINE_TRACE_NOP(name)
+#define EXPORT_TRACEPOINT_SYMBOL_GPL_NOP(name)
+#define EXPORT_TRACEPOINT_SYMBOL_NOP(name)
+
#ifdef CONFIG_TRACEPOINTS

/*
@@ -95,23 +113,11 @@ extern void tracepoint_update_probe_rang
struct tracepoint *end);

#else /* !CONFIG_TRACEPOINTS */
-#define DECLARE_TRACE(name, proto, args) \
- static inline void _do_trace_##name(struct tracepoint *tp, proto) \
- { } \
- static inline void trace_##name(proto) \
- { } \
- static inline int register_trace_##name(void (*probe)(proto)) \
- { \
- return -ENOSYS; \
- } \
- static inline int unregister_trace_##name(void (*probe)(proto)) \
- { \
- return -ENOSYS; \
- }

-#define DEFINE_TRACE(name)
-#define EXPORT_TRACEPOINT_SYMBOL_GPL(name)
-#define EXPORT_TRACEPOINT_SYMBOL(name)
+#define DECLARE_TRACE DECLARE_TRACE_NOP
+#define DEFINE_TRACE DEFINE_TRACE_NOP
+#define EXPORT_TRACEPOINT_SYMBOL_GPL EXPORT_TRACEPOINT_SYMBOL_GPL_NOP
+#define EXPORT_TRACEPOINT_SYMBOL EXPORT_TRACEPOINT_SYMBOL_NOP

static inline void tracepoint_update_probe_range(struct tracepoint *begin,
struct tracepoint *end)


--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-04-20 07:12:53

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

Mathieu Desnoyers <[email protected]> writes:

> Given how slow function calls are, at least on x86,

That was with frame pointers right? Frame pointers tend to make
function calls slow.

-Andi
--
[email protected] -- Speaking for myself only.

2009-04-21 15:56:29

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

* Andi Kleen ([email protected]) wrote:
> Mathieu Desnoyers <[email protected]> writes:
>
> > Given how slow function calls are, at least on x86,
>
> That was with frame pointers right? Frame pointers tend to make
> function calls slow.
>

Looking at my .config, CONFIG_FRAME_POINTER is disabled here. So we
should probably expect an even worse performance impact if we enable
them.

Mathieu


> -Andi
> --
> [email protected] -- Speaking for myself only.
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-04-21 17:18:46

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

Mathieu Desnoyers wrote:
> * Andi Kleen ([email protected]) wrote:
>
>> Mathieu Desnoyers <[email protected]> writes:
>>
>>
>>> Given how slow function calls are, at least on x86,
>>>
>> That was with frame pointers right? Frame pointers tend to make
>> function calls slow.
>>
>>
>
> Looking at my .config, CONFIG_FRAME_POINTER is disabled here. So we
> should probably expect an even worse performance impact if we enable
> them.

I tried disabling frame pointers, but it looked to me like tracing
selects them. Did I misread, or perhaps it was some other config option
doing it...

J

2009-04-21 17:22:08

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines



On Tue, 21 Apr 2009, Jeremy Fitzhardinge wrote:

> Mathieu Desnoyers wrote:
> > * Andi Kleen ([email protected]) wrote:
> >
> > > Mathieu Desnoyers <[email protected]> writes:
> > >
> > >
> > > > Given how slow function calls are, at least on x86,
> > > >
> > > That was with frame pointers right? Frame pointers tend to make
> > > function calls slow.
> > >
> > >
> >
> > Looking at my .config, CONFIG_FRAME_POINTER is disabled here. So we
> > should probably expect an even worse performance impact if we enable
> > them.
>
> I tried disabling frame pointers, but it looked to me like tracing selects
> them. Did I misread, or perhaps it was some other config option doing it...

It is needed for the function tracer (gcc -pg wont work without it). It is
the only tracer that selects it.

-- Steve

2009-04-21 17:43:46

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

Steven Rostedt wrote:
>
>> I tried disabling frame pointers, but it looked to me like tracing selects
>> them. Did I misread, or perhaps it was some other config option doing it...
>>
>
> It is needed for the function tracer (gcc -pg wont work without it). It is
> the only tracer that selects it.
>

Ah, right. I'm never sure what all the trace options actually control,
so I tend to enable them a bit indescriminately.

J

2009-04-21 20:28:47

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

> It is needed for the function tracer (gcc -pg wont work without it). It is
> the only tracer that selects it.

FWIW i still have gcc patches to fix that. They were first stalled
on copyright assignment and then on gcc's merge window being closed,
but now with 4.5 open for game I hope to resubmit them soon
again.

With that you can use -pg without frame pointer, but you have
to supply a special mcount function that expects the different
stack layout.

-Andi

--
[email protected] -- Speaking for myself only.

2009-04-21 21:17:38

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines



On Tue, 21 Apr 2009, Andi Kleen wrote:

> > It is needed for the function tracer (gcc -pg wont work without it). It is
> > the only tracer that selects it.
>
> FWIW i still have gcc patches to fix that. They were first stalled
> on copyright assignment and then on gcc's merge window being closed,
> but now with 4.5 open for game I hope to resubmit them soon
> again.
>
> With that you can use -pg without frame pointer, but you have
> to supply a special mcount function that expects the different
> stack layout.

I think it was Ingo that let out the idea, and I'm starting to like it.

Perhaps we should fork off gcc and ship Linux with its own compiler. This
way we can optimize it for the kernel and not worry about any userland
optimizations.

I would like to do something like:

if (unlikely(err)) {
__section__(".error_sect") {
/* put error code here */
}
}

And have gcc in the error section (if it is big enough perhaps) do:

jmp .L123
.L124 [...]

and in the section ".error_sect" we would have:

.L123
/* error code here */
jmp .L124

We could do the same for trace points. That is, any part of code that
really would happen once in a while (error handling for one) we can move
off to its own section and keep hot paths hot.

-- Steve

2009-04-21 21:29:36

by Frank Ch. Eigler

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

Hi -

On Tue, Apr 21, 2009 at 05:17:17PM -0400, Steven Rostedt wrote:

> [...] Perhaps we should fork off gcc and ship Linux with its own
> compiler. This way we can optimize it for the kernel and not worry
> about any userland optimizations.

In this regard, kernel land does not seem that unlike user land.

> if (unlikely(err)) {
> __section__(".error_sect") {
> /* put error code here */
> }
> }
>
> And have gcc in the error section (if it is big enough perhaps) do:
> jmp .L123
> .L124 [...]
> [...]
> jmp .L124

> We could do the same for trace points. That is, any part of code that
> really would happen once in a while (error handling for one) we can move
> off to its own section and keep hot paths hot.

This is called -freorder-blocks or -freorder-blocks-and-partition
(depending on how far you would like gcc to move unlikely blocks).

- FChE

2009-04-21 21:33:57

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines


[ removed [email protected] due to mail errors ]

On Tue, 21 Apr 2009, Frank Ch. Eigler wrote:

> Hi -
>
> On Tue, Apr 21, 2009 at 05:17:17PM -0400, Steven Rostedt wrote:
>
> > [...] Perhaps we should fork off gcc and ship Linux with its own
> > compiler. This way we can optimize it for the kernel and not worry
> > about any userland optimizations.
>
> In this regard, kernel land does not seem that unlike user land.
>
> > if (unlikely(err)) {
> > __section__(".error_sect") {
> > /* put error code here */
> > }
> > }
> >
> > And have gcc in the error section (if it is big enough perhaps) do:
> > jmp .L123
> > .L124 [...]
> > [...]
> > jmp .L124
>
> > We could do the same for trace points. That is, any part of code that
> > really would happen once in a while (error handling for one) we can move
> > off to its own section and keep hot paths hot.
>
> This is called -freorder-blocks or -freorder-blocks-and-partition
> (depending on how far you would like gcc to move unlikely blocks).

That does not let us pick and choose what and where to put the code.

But still, a fork of gcc would let us optimize it for the kernel, and not
for generic programs.

/me has been sitting too close to the furnace and must have been taking
up some of those fumes, to be considering a fork of gcc a good idea ;-)

-- Steve

2009-04-22 06:03:48

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

> I think it was Ingo that let out the idea, and I'm starting to like it.
>
> Perhaps we should fork off gcc and ship Linux with its own compiler. This
> way we can optimize it for the kernel and not worry about any userland
> optimizations.
>
> I would like to do something like:
>
> if (unlikely(err)) {
> __section__(".error_sect") {


gcc already supports that, you don't need to fork anything. It's called
hot/cold partitioning. Basically it splits functions into hot and cold
and unlikely parts and all the cold/unlikely parts go into a separate
sections.

I think it's normally not enabled by default on x86 though, probably because
it doesn't help too much.

By default (unless you specify -fno-reorder-blocks) it does the same
without sections, just moving unlikely code out of line.

-Andi
--
[email protected] -- Speaking for myself only.

2009-04-22 06:24:32

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines


On Wed, 22 Apr 2009, Andi Kleen wrote:

> > I think it was Ingo that let out the idea, and I'm starting to like it.
> >
> > Perhaps we should fork off gcc and ship Linux with its own compiler. This
> > way we can optimize it for the kernel and not worry about any userland
> > optimizations.
> >
> > I would like to do something like:
> >
> > if (unlikely(err)) {
> > __section__(".error_sect") {
>
>
> gcc already supports that, you don't need to fork anything. It's called
> hot/cold partitioning. Basically it splits functions into hot and cold
> and unlikely parts and all the cold/unlikely parts go into a separate
> sections.
>
> I think it's normally not enabled by default on x86 though, probably because
> it doesn't help too much.
>
> By default (unless you specify -fno-reorder-blocks) it does the same
> without sections, just moving unlikely code out of line.

The unlikely code does not always get moved out that far. It still sits
inside a function, and looking at the tracepoint code it did not move it
far enough.

If you have a bunch of functions that each with an unlikely statement,
those unlikely sections will still be interleaved within the function
code.

In the case of tracepoints, it would be nice to move all the code that
sets up the function call out of the hot paths. If we could move it to its
own section, that would be much better.

Having all "unlikely"s go into a separate section would not help much,
since according to the branch profiler there are a lot of "unlikely"s in
the kernel that are not too unlikely.

If gcc can indeed move "unlikely" code completely out of the fast path,
and put it into its own sections, then I think we should go through the
kernel and start removing all "likely" and "unlikely"s that are not 99%
accurate. Then we can enable the separate section cold paths and perhaps
see a performance benefit.

-- Steve

2009-04-22 07:22:48

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

On Wed, Apr 22, 2009 at 02:24:17AM -0400, Steven Rostedt wrote:
>
> On Wed, 22 Apr 2009, Andi Kleen wrote:
>
> > > I think it was Ingo that let out the idea, and I'm starting to like it.
> > >
> > > Perhaps we should fork off gcc and ship Linux with its own compiler. This
> > > way we can optimize it for the kernel and not worry about any userland
> > > optimizations.
> > >
> > > I would like to do something like:
> > >
> > > if (unlikely(err)) {
> > > __section__(".error_sect") {
> >
> >
> > gcc already supports that, you don't need to fork anything. It's called
> > hot/cold partitioning. Basically it splits functions into hot and cold
> > and unlikely parts and all the cold/unlikely parts go into a separate
> > sections.
> >
> > I think it's normally not enabled by default on x86 though, probably because
> > it doesn't help too much.
> >
> > By default (unless you specify -fno-reorder-blocks) it does the same
> > without sections, just moving unlikely code out of line.
>
> The unlikely code does not always get moved out that far. It still sits
> inside a function, and looking at the tracepoint code it did not move it
> far enough.

That's because you didn't enable the hot/cold partioning as I wrote.
These are separate options. By default it doesn't use partitions on x86,
but it can.

> If gcc can indeed move "unlikely" code completely out of the fast path,
> and put it into its own sections, then I think we should go through the
> kernel and start removing all "likely" and "unlikely"s that are not 99%
> accurate. Then we can enable the separate section cold paths and perhaps
> see a performance benefit.

iirc there wasn't much for using separate partitions with the usual
user space benchmarks (SpecCPU etc.) on x86. It helped a bit on POWER
apparently though.

-Andi

--
[email protected] -- Speaking for myself only.

2009-04-22 08:19:36

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH 2/8] tracing: create automated trace defines

* Steven Rostedt ([email protected]) wrote:
>
> [ removed [email protected] due to mail errors ]
>
> On Tue, 21 Apr 2009, Frank Ch. Eigler wrote:
>
> > Hi -
> >
> > On Tue, Apr 21, 2009 at 05:17:17PM -0400, Steven Rostedt wrote:
> >
> > > [...] Perhaps we should fork off gcc and ship Linux with its own
> > > compiler. This way we can optimize it for the kernel and not worry
> > > about any userland optimizations.
> >
> > In this regard, kernel land does not seem that unlike user land.
> >
> > > if (unlikely(err)) {
> > > __section__(".error_sect") {
> > > /* put error code here */
> > > }
> > > }
> > >
> > > And have gcc in the error section (if it is big enough perhaps) do:
> > > jmp .L123
> > > .L124 [...]
> > > [...]
> > > jmp .L124
> >
> > > We could do the same for trace points. That is, any part of code that
> > > really would happen once in a while (error handling for one) we can move
> > > off to its own section and keep hot paths hot.
> >
> > This is called -freorder-blocks or -freorder-blocks-and-partition
> > (depending on how far you would like gcc to move unlikely blocks).
>
> That does not let us pick and choose what and where to put the code.
>
> But still, a fork of gcc would let us optimize it for the kernel, and not
> for generic programs.
>
> /me has been sitting too close to the furnace and must have been taking
> up some of those fumes, to be considering a fork of gcc a good idea ;-)
>

I guess we should have been sitting near the same furnace then. I'm
unsure how different from the current gcc this can go, but it could be a
very interesting exercise. Just removing unneeded front ends could
probably help adding features much faster than if we have to support
Fortran, Java, etc.

Mathieu

> -- Steve
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68