2013-04-03 07:50:43

by Mel Gorman

[permalink] [raw]
Subject: systemtap broken by removal of register_timer_hook

Commit ba6fdda4 (profiling: Remove unused timer hook) removed
[un]register_timer_hook due to a lack of upstream users and a belief
that there were no out-of-tree users. However, systemtap uses it and
with that patch applied, some stap scripts fail with

WARNING: "unregister_timer_hook" [/tmp/stapJoLMxD/stap_de6e19eaf7bd94629aba9d6f56cdfca5_8156.ko] undefined!
WARNING: "register_timer_hook" [/tmp/stapJoLMxD/stap_de6e19eaf7bd94629aba9d6f56cdfca5_8156.ko] undefined!
Error inserting module '/tmp/stapJoLMxD/stap_de6e19eaf7bd94629aba9d6f56cdfca5_8156.ko': Unknown symbol in module
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed. Try again with another '--vp 00001' option.
Unexpected exit of STAP script at ./watch-dstate.pl line 305.

Can the patch be reverted or at least put a warning in place about it
being obsoleted until the systemtap folks come up with a replacement?

Systemtap folk, has his problem already been addressed?

--
Mel Gorman
SUSE Labs


2013-04-03 11:01:27

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: systemtap broken by removal of register_timer_hook

2013/4/3 Mel Gorman <[email protected]>:
> Commit ba6fdda4 (profiling: Remove unused timer hook) removed
> [un]register_timer_hook due to a lack of upstream users and a belief
> that there were no out-of-tree users. However, systemtap uses it and
> with that patch applied, some stap scripts fail with
>
> WARNING: "unregister_timer_hook" [/tmp/stapJoLMxD/stap_de6e19eaf7bd94629aba9d6f56cdfca5_8156.ko] undefined!
> WARNING: "register_timer_hook" [/tmp/stapJoLMxD/stap_de6e19eaf7bd94629aba9d6f56cdfca5_8156.ko] undefined!
> Error inserting module '/tmp/stapJoLMxD/stap_de6e19eaf7bd94629aba9d6f56cdfca5_8156.ko': Unknown symbol in module
> WARNING: /usr/bin/staprun exited with status: 1
> Pass 5: run failed. Try again with another '--vp 00001' option.
> Unexpected exit of STAP script at ./watch-dstate.pl line 305.

Ah I missed that.

>
> Can the patch be reverted or at least put a warning in place about it
> being obsoleted until the systemtap folks come up with a replacement?

Sometimes I don't mind keeping around code in the kernel for out of
tree users, depending on the case. But in this specific matter we have
more standard ways to do this kind of hook: kprobes, static
tracepoints. A tracepoint on the timer tick would be useful BTW, and
you could reuse it.

How does that sound to system tap guys? It would be nice if we can
avoid the feature-removal.txt step. I mean, I heard some doleful
whispers once when my pointer selected that file. Instead of opening
the file I just closed the directory instantly. I never told anybody
about that before, this is the first time. I also heard that somebody
added an entry to that file to schedule the removal of that file? This
is devilry.

2013-04-03 12:30:46

by Frank Ch. Eigler

[permalink] [raw]
Subject: Re: systemtap broken by removal of register_timer_hook

Frederic Weisbecker <[email protected]> writes:

> [...]
> Sometimes I don't mind keeping around code in the kernel for out of
> tree users, depending on the case. But in this specific matter we have
> more standard ways to do this kind of hook: kprobes, static
> tracepoints. A tracepoint on the timer tick would be useful BTW, and
> you could reuse it.

Thanks for noticing! IMO kprobes would be too heavyweight for this.
A new static tracepoint would be fine, especially if it passes the
same pt_regs* pointer as the former hook did.

- FChE

2013-04-03 12:49:57

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: systemtap broken by removal of register_timer_hook

2013/4/3 Frank Ch. Eigler <[email protected]>:
> Frederic Weisbecker <[email protected]> writes:
>
>> [...]
>> Sometimes I don't mind keeping around code in the kernel for out of
>> tree users, depending on the case. But in this specific matter we have
>> more standard ways to do this kind of hook: kprobes, static
>> tracepoints. A tracepoint on the timer tick would be useful BTW, and
>> you could reuse it.
>
> Thanks for noticing! IMO kprobes would be too heavyweight for this.
> A new static tracepoint would be fine, especially if it passes the
> same pt_regs* pointer as the former hook did.

Sounds good, would you like to propose a version? We are also
interested in a timer tick event tracepoint for dynticks debugging.

Thanks!

2013-04-03 14:45:17

by Frank Ch. Eigler

[permalink] [raw]
Subject: Re: systemtap broken by removal of register_timer_hook

Hi -

On Wed, Apr 03, 2013 at 02:49:53PM +0200, Frederic Weisbecker wrote:

> Sounds good, would you like to propose a version? We are also
> interested in a timer tick event tracepoint for dynticks debugging.

How about this?

Author: Frank Ch. Eigler <[email protected]>
Date: Wed Apr 3 10:35:21 2013 -0400

profiling: add profile_tick tracepoint

Commit ba6fdda4 removed the timer_hook mechanism for modules to listen
to profiling timer ticks (without having to set up more complicated
perf mechanisms). To reduce the impact on out-of-tree users, a
TRACE_EVENT-flavoured tracepoint is added in its place. Tested with
perf and systemtap.

Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mel Gorman <[email protected]>
Signed-off-by: Frank Ch. Eigler <[email protected]>

diff --git a/include/trace/events/profile.h b/include/trace/events/profile.h
new file mode 100644
index 0000000..b48b6fe
--- /dev/null
+++ b/include/trace/events/profile.h
@@ -0,0 +1,35 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM profile
+
+#if !defined(_TRACE_PROFILE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_PROFILE_H
+
+#include <linux/tracepoint.h>
+
+
+struct pt_regs;
+
+/**
+ * profile_tick - called when the profiling timer ticks
+ * @regs: pointer to struct pt_regs*
+ */
+
+TRACE_EVENT(profile_tick,
+ TP_PROTO(int type, struct pt_regs *regs),
+ TP_ARGS(type, regs),
+ TP_STRUCT__entry(
+ __field( int, type )
+ __field( struct pt_regs*, regs )
+ ),
+ TP_fast_assign(
+ __entry->type = type;
+ __entry->regs = regs;
+ ),
+ TP_printk("type=%d regs=%p", __entry->type, __entry->regs)
+);
+
+
+#endif /* _TRACE_PROFILE_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/kernel/profile.c b/kernel/profile.c
index dc3384e..d61f921 100644
--- a/kernel/profile.c
+++ b/kernel/profile.c
@@ -29,6 +29,9 @@
#include <asm/irq_regs.h>
#include <asm/ptrace.h>

+#define CREATE_TRACE_POINTS
+#include <trace/events/profile.h>
+
struct profile_hit {
u32 pc, hits;
};
@@ -414,6 +417,8 @@ void profile_tick(int type)
{
struct pt_regs *regs = get_irq_regs();

+ trace_profile_tick(type, regs);
+
if (!user_mode(regs) && prof_cpu_mask != NULL &&
cpumask_test_cpu(smp_processor_id(), prof_cpu_mask))
profile_hit(type, (void *)profile_pc(regs));

2013-04-03 18:39:13

by Josh Stone

[permalink] [raw]
Subject: Re: systemtap broken by removal of register_timer_hook

On 04/03/2013 07:44 AM, Frank Ch. Eigler wrote:
> Hi -
>
> On Wed, Apr 03, 2013 at 02:49:53PM +0200, Frederic Weisbecker wrote:
>
>> Sounds good, would you like to propose a version? We are also
>> interested in a timer tick event tracepoint for dynticks debugging.
>
> How about this?
>
> Author: Frank Ch. Eigler <[email protected]>
> Date: Wed Apr 3 10:35:21 2013 -0400
>
> profiling: add profile_tick tracepoint
>
> Commit ba6fdda4 removed the timer_hook mechanism for modules to listen
> to profiling timer ticks (without having to set up more complicated
> perf mechanisms). To reduce the impact on out-of-tree users, a
> TRACE_EVENT-flavoured tracepoint is added in its place. Tested with
> perf and systemtap.

One nice benefit over register_timer_hook() is that a tracepoint allows
multiple consumers, like the old register_profile_notifier() did.

>
> Cc: Frederic Weisbecker <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Signed-off-by: Frank Ch. Eigler <[email protected]>
>
> diff --git a/include/trace/events/profile.h b/include/trace/events/profile.h
> new file mode 100644
> index 0000000..b48b6fe
> --- /dev/null
> +++ b/include/trace/events/profile.h
> @@ -0,0 +1,35 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM profile
> +
> +#if !defined(_TRACE_PROFILE_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_PROFILE_H
> +
> +#include <linux/tracepoint.h>
> +
> +
> +struct pt_regs;
> +
> +/**
> + * profile_tick - called when the profiling timer ticks
> + * @regs: pointer to struct pt_regs*
> + */
> +
> +TRACE_EVENT(profile_tick,
> + TP_PROTO(int type, struct pt_regs *regs),
> + TP_ARGS(type, regs),
> + TP_STRUCT__entry(
> + __field( int, type )
> + __field( struct pt_regs*, regs )
> + ),
> + TP_fast_assign(
> + __entry->type = type;
> + __entry->regs = regs;
> + ),
> + TP_printk("type=%d regs=%p", __entry->type, __entry->regs)
> +);

I agree that full regs are good for the tracepoint in general, but I
doubt that this pointer is useful for the printk. Maybe instead print
instruction_pointer(__entry->regs)?

> +
> +
> +#endif /* _TRACE_PROFILE_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/kernel/profile.c b/kernel/profile.c
> index dc3384e..d61f921 100644
> --- a/kernel/profile.c
> +++ b/kernel/profile.c
> @@ -29,6 +29,9 @@
> #include <asm/irq_regs.h>
> #include <asm/ptrace.h>
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/profile.h>
> +
> struct profile_hit {
> u32 pc, hits;
> };
> @@ -414,6 +417,8 @@ void profile_tick(int type)
> {
> struct pt_regs *regs = get_irq_regs();
>
> + trace_profile_tick(type, regs);
> +
> if (!user_mode(regs) && prof_cpu_mask != NULL &&
> cpumask_test_cpu(smp_processor_id(), prof_cpu_mask))
> profile_hit(type, (void *)profile_pc(regs));
>

2013-04-04 12:46:09

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: systemtap broken by removal of register_timer_hook

2013/4/3 Frank Ch. Eigler <[email protected]>:
> Hi -
>
> On Wed, Apr 03, 2013 at 02:49:53PM +0200, Frederic Weisbecker wrote:
>
>> Sounds good, would you like to propose a version? We are also
>> interested in a timer tick event tracepoint for dynticks debugging.
>
> How about this?
>
> Author: Frank Ch. Eigler <[email protected]>
> Date: Wed Apr 3 10:35:21 2013 -0400
>
> profiling: add profile_tick tracepoint
>
> Commit ba6fdda4 removed the timer_hook mechanism for modules to listen
> to profiling timer ticks (without having to set up more complicated
> perf mechanisms). To reduce the impact on out-of-tree users, a
> TRACE_EVENT-flavoured tracepoint is added in its place. Tested with
> perf and systemtap.
>
> Cc: Frederic Weisbecker <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Signed-off-by: Frank Ch. Eigler <[email protected]>
>
> diff --git a/include/trace/events/profile.h b/include/trace/events/profile.h
> new file mode 100644
> index 0000000..b48b6fe
> --- /dev/null
> +++ b/include/trace/events/profile.h
> @@ -0,0 +1,35 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM profile
> +
> +#if !defined(_TRACE_PROFILE_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_PROFILE_H
> +
> +#include <linux/tracepoint.h>
> +
> +
> +struct pt_regs;
> +
> +/**
> + * profile_tick - called when the profiling timer ticks
> + * @regs: pointer to struct pt_regs*
> + */
> +
> +TRACE_EVENT(profile_tick,
> + TP_PROTO(int type, struct pt_regs *regs),
> + TP_ARGS(type, regs),
> + TP_STRUCT__entry(
> + __field( int, type )
> + __field( struct pt_regs*, regs )
> + ),
> + TP_fast_assign(
> + __entry->type = type;
> + __entry->regs = regs;
> + ),
> + TP_printk("type=%d regs=%p", __entry->type, __entry->regs)
> +);
> +
> +
> +#endif /* _TRACE_PROFILE_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/kernel/profile.c b/kernel/profile.c
> index dc3384e..d61f921 100644
> --- a/kernel/profile.c
> +++ b/kernel/profile.c
> @@ -29,6 +29,9 @@
> #include <asm/irq_regs.h>
> #include <asm/ptrace.h>
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/profile.h>
> +
> struct profile_hit {
> u32 pc, hits;
> };
> @@ -414,6 +417,8 @@ void profile_tick(int type)
> {
> struct pt_regs *regs = get_irq_regs();
>
> + trace_profile_tick(type, regs);
> +

It would be better not to tie this to CONFIG_PROFILING.
A tracepoint in update_process_times() instead would be great but it's
sometimes called several times in a tick from some archs.
Probably we need something like:

static inline tick_trace(struct pt_regs *regs)
{
trace_timer_tick(regs);
profile_tick(CPU_PROFILING);
}

Then replace all calls to profile_tick() with the above.

2013-04-10 11:25:04

by Ingo Molnar

[permalink] [raw]
Subject: Re: systemtap broken by removal of register_timer_hook


* Frank Ch. Eigler <[email protected]> wrote:

> Hi -
>
> On Wed, Apr 03, 2013 at 02:49:53PM +0200, Frederic Weisbecker wrote:
>
> > Sounds good, would you like to propose a version? We are also
> > interested in a timer tick event tracepoint for dynticks debugging.
>
> How about this?
>
> Author: Frank Ch. Eigler <[email protected]>
> Date: Wed Apr 3 10:35:21 2013 -0400
>
> profiling: add profile_tick tracepoint
>
> Commit ba6fdda4 removed the timer_hook mechanism for modules to listen
> to profiling timer ticks (without having to set up more complicated
> perf mechanisms). To reduce the impact on out-of-tree users, a
> TRACE_EVENT-flavoured tracepoint is added in its place. Tested with
> perf and systemtap.

I'd suggest mentioning SystemTap here as the driving motivation. SystemTap
triggered a generic kernel improvement here, no need to hide its identity!

Thanks,

Ingo

2013-04-19 14:47:44

by Frank Ch. Eigler

[permalink] [raw]
Subject: Re: systemtap broken by removal of register_timer_hook

Hi, Frederic -


> > How about this?
> >
> > Author: Frank Ch. Eigler <[email protected]>
> > Date: Wed Apr 3 10:35:21 2013 -0400
> >
> > profiling: add profile_tick tracepoint
> > [...]

> It would be better not to tie this to CONFIG_PROFILING.
> A tracepoint in update_process_times() instead would be great but it's
> sometimes called several times in a tick from some archs.
> Probably we need something like:
>
> static inline tick_trace(struct pt_regs *regs)
> {
> trace_timer_tick(regs);
> profile_tick(CPU_PROFILING);
> }

I looked into this, but found no natural place to define such an
inline function from which to call into a tracepoint, without having
to #include the <event/FOO.h> file many times. Nor does it seem
appropriate to do the identical #define CREATE_TRACE_POINTS part from
all the different arch/.../*.c files that may call into that inline.
If you'd like to stick to this idea, please advise further where you
think the tracepoint definition & declarations should go.

In the alternative, here is v2 of the patch, just changing the
tracepoint-printing argument as suggested by jistone.

- FChE

---------------

Author: Frank Ch. Eigler <[email protected]>
Date: Wed Apr 3 10:35:21 2013 -0400

profiling: add profile_tick tracepoint

Commit ba6fdda4 removed the timer_hook mechanism for modules to listen
to profiling timer ticks (without having to set up more complicated
perf mechanisms). To reduce the impact on out-of-tree users such as
systemtap, a TRACE_EVENT-flavoured tracepoint is added in its place.
Tested with perf and systemtap.

Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mel Gorman <[email protected]>
Signed-off-by: Frank Ch. Eigler <[email protected]>

diff --git a/include/trace/events/profile.h b/include/trace/events/profile.h
new file mode 100644
index 0000000..445aee7
--- /dev/null
+++ b/include/trace/events/profile.h
@@ -0,0 +1,37 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM profile
+
+#if !defined(_TRACE_PROFILE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_PROFILE_H
+
+#include <linux/tracepoint.h>
+
+
+struct pt_regs;
+
+/**
+ * profile_tick - called when the profiling timer ticks
+ * @type: profiling tick type, generally @CPU_PROFILING
+ * @regs: pointer to struct pt_regs*
+ */
+
+TRACE_EVENT(profile_tick,
+ TP_PROTO(int type, struct pt_regs *regs),
+ TP_ARGS(type, regs),
+ TP_STRUCT__entry(
+ __field( int, type )
+ __field( struct pt_regs*, regs )
+ ),
+ TP_fast_assign(
+ __entry->type = type;
+ __entry->regs = regs;
+ ),
+ TP_printk("type=%d ip=%p", __entry->type,
+ instruction_pointer(__entry->regs))
+);
+
+
+#endif /* _TRACE_PROFILE_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/kernel/profile.c b/kernel/profile.c
index dc3384e..d61f921 100644
--- a/kernel/profile.c
+++ b/kernel/profile.c
@@ -29,6 +29,9 @@
#include <asm/irq_regs.h>
#include <asm/ptrace.h>

+#define CREATE_TRACE_POINTS
+#include <trace/events/profile.h>
+
struct profile_hit {
u32 pc, hits;
};
@@ -414,6 +417,8 @@ void profile_tick(int type)
{
struct pt_regs *regs = get_irq_regs();

+ trace_profile_tick(type, regs);
+
if (!user_mode(regs) && prof_cpu_mask != NULL &&
cpumask_test_cpu(smp_processor_id(), prof_cpu_mask))
profile_hit(type, (void *)profile_pc(regs));

2013-04-30 00:19:07

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: systemtap broken by removal of register_timer_hook

On Fri, Apr 19, 2013 at 10:46:55AM -0400, Frank Ch. Eigler wrote:
> Hi, Frederic -
>
>
> > > How about this?
> > >
> > > Author: Frank Ch. Eigler <[email protected]>
> > > Date: Wed Apr 3 10:35:21 2013 -0400
> > >
> > > profiling: add profile_tick tracepoint
> > > [...]
>
> > It would be better not to tie this to CONFIG_PROFILING.
> > A tracepoint in update_process_times() instead would be great but it's
> > sometimes called several times in a tick from some archs.
> > Probably we need something like:
> >
> > static inline tick_trace(struct pt_regs *regs)
> > {
> > trace_timer_tick(regs);
> > profile_tick(CPU_PROFILING);
> > }
>
> I looked into this, but found no natural place to define such an
> inline function from which to call into a tracepoint, without having
> to #include the <event/FOO.h> file many times.

You could include it from linux/profile.h or linux/tick.h and
define profile_tick to "{ trace_tick(); __profile_tick()}" .

But in a second thought, using a tracepoint in a general purpose header
resulted in bad circular headers dependency in the past, we had bitter
experiences with trace_softirq_raise for example.

> Nor does it seem
> appropriate to do the identical #define CREATE_TRACE_POINTSw part from
> all the different arch/.../*.c files that may call into that inline.
>
> If you'd like to stick to this idea, please advise further where you
> think the tracepoint definition & declarations should go.
>
> In the alternative, here is v2 of the patch, just changing the
> tracepoint-printing argument as suggested by jistone.

But your tracepoint still depends on CONFIG_PROFILING and I would
really like to avoid that.

How about creating trace_tick() in include/trace/events/timer.h
and call it from tick_periodic() and tick_sched_handle().

This only leaves the archs that don't support generic clockevents
behind. This should be no big deal, they can integrate this tracepoint
later if they need to.

Thanks.

2013-04-30 17:27:37

by Frank Ch. Eigler

[permalink] [raw]
Subject: Re: systemtap broken by removal of register_timer_hook

Hi -

> [...] How about creating trace_tick() in
> include/trace/events/timer.h and call it from tick_periodic() and
> tick_sched_handle(). [...]

Like this?


>From facee64445c0dcc717e99c474c5c7dcdd31b9a74 Mon Sep 17 00:00:00 2001
From: "Frank Ch. Eigler" <[email protected]>
Date: Wed, 3 Apr 2013 10:35:21 -0400
Subject: [PATCH] profiling: add tick tracepoint

Commit ba6fdda4 removed the timer_hook mechanism for modules to listen
to profiling timer ticks (without having to set up more complicated
perf mechanisms). To reduce the impact on out-of-tree users such as
systemtap, a TRACE_EVENT-flavoured tracepoint is added in its place,
invoked right beside profile_tick() in kernel/time/tick-*.c.
Tested with perf and systemtap.

Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mel Gorman <[email protected]>
Signed-off-by: Frank Ch. Eigler <[email protected]>
---
include/trace/events/timer.h | 20 ++++++++++++++++++++
kernel/time/tick-common.c | 2 ++
kernel/time/tick-sched.c | 2 ++
3 files changed, 24 insertions(+)

diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h
index 425bcfe..ec4c2d0 100644
--- a/include/trace/events/timer.h
+++ b/include/trace/events/timer.h
@@ -323,6 +323,26 @@ TRACE_EVENT(itimer_expire,
(int) __entry->pid, (unsigned long long)__entry->now)
);

+
+struct pt_regs;
+
+/**
+ * tick - called when the profiling timer ticks
+ * @regs: pointer to struct pt_regs*
+ */
+TRACE_EVENT(tick,
+ TP_PROTO(struct pt_regs *regs),
+ TP_ARGS(regs),
+ TP_STRUCT__entry(
+ __field( struct pt_regs*, regs )
+ ),
+ TP_fast_assign(
+ __entry->regs = regs;
+ ),
+ TP_printk("ip=%p", (void *) instruction_pointer(__entry->regs))
+);
+
+
#endif /* _TRACE_TIMER_H */

/* This part must be outside protection */
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index b1600a6..5f4227f 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -18,6 +18,7 @@
#include <linux/percpu.h>
#include <linux/profile.h>
#include <linux/sched.h>
+#include <trace/events/timer.h>

#include <asm/irq_regs.h>

@@ -74,6 +75,7 @@ static void tick_periodic(int cpu)

update_process_times(user_mode(get_irq_regs()));
profile_tick(CPU_PROFILING);
+ trace_tick(get_irq_regs());
}

/*
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index a19a399..447be56 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -21,6 +21,7 @@
#include <linux/sched.h>
#include <linux/module.h>
#include <linux/irq_work.h>
+#include <trace/events/timer.h>

#include <asm/irq_regs.h>

@@ -140,6 +141,7 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
#endif
update_process_times(user_mode(regs));
profile_tick(CPU_PROFILING);
+ trace_tick(get_irq_regs());
}

/*
--
1.8.2

2013-05-06 23:12:48

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: systemtap broken by removal of register_timer_hook

On Tue, Apr 30, 2013 at 01:27:20PM -0400, Frank Ch. Eigler wrote:
> Hi -
>
> > [...] How about creating trace_tick() in
> > include/trace/events/timer.h and call it from tick_periodic() and
> > tick_sched_handle(). [...]
>
> Like this?
>
>
> From facee64445c0dcc717e99c474c5c7dcdd31b9a74 Mon Sep 17 00:00:00 2001
> From: "Frank Ch. Eigler" <[email protected]>
> Date: Wed, 3 Apr 2013 10:35:21 -0400
> Subject: [PATCH] profiling: add tick tracepoint

I'd rather entitle "timer: Add tick tracepoint", as it doesn't depend
on the profiling stuff.

>
> Commit ba6fdda4 removed the timer_hook mechanism for modules to listen
> to profiling timer ticks (without having to set up more complicated
> perf mechanisms). To reduce the impact on out-of-tree users such as
> systemtap, a TRACE_EVENT-flavoured tracepoint is added in its place,
> invoked right beside profile_tick() in kernel/time/tick-*.c.
> Tested with perf and systemtap.
>
> Cc: Frederic Weisbecker <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Signed-off-by: Frank Ch. Eigler <[email protected]>
> ---
> include/trace/events/timer.h | 20 ++++++++++++++++++++
> kernel/time/tick-common.c | 2 ++
> kernel/time/tick-sched.c | 2 ++
> 3 files changed, 24 insertions(+)
>
> diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h
> index 425bcfe..ec4c2d0 100644
> --- a/include/trace/events/timer.h
> +++ b/include/trace/events/timer.h
> @@ -323,6 +323,26 @@ TRACE_EVENT(itimer_expire,
> (int) __entry->pid, (unsigned long long)__entry->now)
> );
>
> +
> +struct pt_regs;
> +
> +/**
> + * tick - called when the profiling timer ticks
> + * @regs: pointer to struct pt_regs*
> + */
> +TRACE_EVENT(tick,
> + TP_PROTO(struct pt_regs *regs),
> + TP_ARGS(regs),
> + TP_STRUCT__entry(
> + __field( struct pt_regs*, regs )

I doubt the pointer to the regs is interesting in userland.
How about "void *ip" instead?

> + ),
> + TP_fast_assign(
> + __entry->regs = regs;
> + ),
> + TP_printk("ip=%p", (void *) instruction_pointer(__entry->regs))
> +);
> +
> +
> #endif /* _TRACE_TIMER_H */
>
> /* This part must be outside protection */
> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
> index b1600a6..5f4227f 100644
> --- a/kernel/time/tick-common.c
> +++ b/kernel/time/tick-common.c
> @@ -18,6 +18,7 @@
> #include <linux/percpu.h>
> #include <linux/profile.h>
> #include <linux/sched.h>
> +#include <trace/events/timer.h>
>
> #include <asm/irq_regs.h>
>
> @@ -74,6 +75,7 @@ static void tick_periodic(int cpu)
>
> update_process_times(user_mode(get_irq_regs()));
> profile_tick(CPU_PROFILING);
> + trace_tick(get_irq_regs());

I suggest storing get_irq_regs() in a local variable so the function
doesn't fetch it twice from whatever per cpu area.

> }
>
> /*
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index a19a399..447be56 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -21,6 +21,7 @@
> #include <linux/sched.h>
> #include <linux/module.h>
> #include <linux/irq_work.h>
> +#include <trace/events/timer.h>
>
> #include <asm/irq_regs.h>
>
> @@ -140,6 +141,7 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
> #endif
> update_process_times(user_mode(regs));
> profile_tick(CPU_PROFILING);
> + trace_tick(get_irq_regs());

And here you can reuse the regs parameter.

> }
>
> /*
> --
> 1.8.2
>