2009-09-22 02:43:59

by Anton Blanchard

[permalink] [raw]
Subject: [PATCH 0/2] tracing/workqueue fixes


Hi,

The workqueue tracepoint patchset looks to have stalled for a bit. While
we discuss how much functionality we want in kernel, I think we have a
solid case for tracing the latency caused by workqueue events.

For an example we can look at the rt_worker_func work. As memory grew its
duration would also grow, and I measured 14ms spikes on one box. Once found
we placed a cap on it:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c9503e0fe052020e0294cd07d0ecd982eb7c9177

The following patches add entry and exit tracepoints so we can find these
things easily.

Anton


2009-09-22 02:43:48

by Anton Blanchard

[permalink] [raw]
Subject: [PATCH 1/2] tracing/workqueue: Use %pf in workqueue trace events


Using %pf instead of %pF supresses printing of the function offset
which will always be 0.

Signed-off-by: Anton Blanchard <[email protected]>
---

Index: linux.trees.git/include/trace/events/workqueue.h
===================================================================
--- linux.trees.git.orig/include/trace/events/workqueue.h 2009-09-01 15:11:04.000000000 +1000
+++ linux.trees.git/include/trace/events/workqueue.h 2009-09-14 09:45:41.000000000 +1000
@@ -26,7 +26,7 @@ TRACE_EVENT(workqueue_insertion,
__entry->func = work->func;
),

- TP_printk("thread=%s:%d func=%pF", __entry->thread_comm,
+ TP_printk("thread=%s:%d func=%pf", __entry->thread_comm,
__entry->thread_pid, __entry->func)
);

@@ -48,7 +48,7 @@ TRACE_EVENT(workqueue_execution,
__entry->func = work->func;
),

- TP_printk("thread=%s:%d func=%pF", __entry->thread_comm,
+ TP_printk("thread=%s:%d func=%pf", __entry->thread_comm,
__entry->thread_pid, __entry->func)
);

2009-09-22 02:43:57

by Anton Blanchard

[permalink] [raw]
Subject: [PATCH 2/2] tracing/workqueue: Rename workqueue_execute to worklet_entry and add worklet_exit


Keep a common naming convention for tracing the latency of events such as
softirq_entry/softirq_exit.

Based on a patch from KOSAKI Motohiro.

Signed-off-by: Anton Blanchard <[email protected]>
---

Index: linux.trees.git/kernel/workqueue.c
===================================================================
--- linux.trees.git.orig/kernel/workqueue.c 2009-09-14 09:43:00.000000000 +1000
+++ linux.trees.git/kernel/workqueue.c 2009-09-14 09:45:45.000000000 +1000
@@ -279,7 +279,6 @@ static void run_workqueue(struct cpu_wor
*/
struct lockdep_map lockdep_map = work->lockdep_map;
#endif
- trace_workqueue_execution(cwq->thread, work);
cwq->current_work = work;
list_del_init(cwq->worklist.next);
spin_unlock_irq(&cwq->lock);
@@ -288,7 +287,9 @@ static void run_workqueue(struct cpu_wor
work_clear_pending(work);
lock_map_acquire(&cwq->wq->lockdep_map);
lock_map_acquire(&lockdep_map);
+ trace_worklet_entry(cwq->thread, work);
f(work);
+ trace_worklet_exit(cwq->thread, work);
lock_map_release(&lockdep_map);
lock_map_release(&cwq->wq->lockdep_map);

Index: linux.trees.git/include/trace/events/workqueue.h
===================================================================
--- linux.trees.git.orig/include/trace/events/workqueue.h 2009-09-14 09:45:41.000000000 +1000
+++ linux.trees.git/include/trace/events/workqueue.h 2009-09-14 09:45:45.000000000 +1000
@@ -30,7 +30,7 @@ TRACE_EVENT(workqueue_insertion,
__entry->thread_pid, __entry->func)
);

-TRACE_EVENT(workqueue_execution,
+TRACE_EVENT(worklet_entry,

TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),

@@ -52,6 +52,27 @@ TRACE_EVENT(workqueue_execution,
__entry->thread_pid, __entry->func)
);

+/* Declare work as void *, because we can't use work->... in after f(work) */
+TRACE_EVENT(worklet_exit,
+
+ TP_PROTO(struct task_struct *wq_thread, void *work),
+
+ TP_ARGS(wq_thread, work),
+
+ TP_STRUCT__entry(
+ __array(char, thread_comm, TASK_COMM_LEN)
+ __field(pid_t, thread_pid)
+ ),
+
+ TP_fast_assign(
+ memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
+ __entry->thread_pid = wq_thread->pid;
+ ),
+
+ TP_printk("thread=%s:%d", __entry->thread_comm,
+ __entry->thread_pid)
+);
+
/* Trace the creation of one workqueue thread on a cpu */
TRACE_EVENT(workqueue_creation,

Index: linux.trees.git/kernel/trace/trace_workqueue.c
===================================================================
--- linux.trees.git.orig/kernel/trace/trace_workqueue.c 2009-09-01 15:11:04.000000000 +1000
+++ linux.trees.git/kernel/trace/trace_workqueue.c 2009-09-14 09:45:45.000000000 +1000
@@ -69,7 +69,7 @@ found:

/* Execution of a work */
static void
-probe_workqueue_execution(struct task_struct *wq_thread,
+probe_worklet_entry(struct task_struct *wq_thread,
struct work_struct *work)
{
int cpu = cpumask_first(&wq_thread->cpus_allowed);
@@ -262,13 +262,13 @@ int __init trace_workqueue_early_init(vo
if (ret)
goto out;

- ret = register_trace_workqueue_execution(probe_workqueue_execution);
+ ret = register_trace_worklet_entry(probe_worklet_entry);
if (ret)
goto no_insertion;

ret = register_trace_workqueue_creation(probe_workqueue_creation);
if (ret)
- goto no_execution;
+ goto no_entry;

ret = register_trace_workqueue_destruction(probe_workqueue_destruction);
if (ret)
@@ -283,8 +283,8 @@ int __init trace_workqueue_early_init(vo

no_creation:
unregister_trace_workqueue_creation(probe_workqueue_creation);
-no_execution:
- unregister_trace_workqueue_execution(probe_workqueue_execution);
+no_entry:
+ unregister_trace_worklet_entry(probe_worklet_entry);
no_insertion:
unregister_trace_workqueue_insertion(probe_workqueue_insertion);
out:

2009-09-22 20:34:19

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH 2/2] tracing/workqueue: Rename workqueue_execute to worklet_entry and add worklet_exit

On Tue, Sep 22, 2009 at 12:42:32PM +1000, Anton Blanchard wrote:
>
> Keep a common naming convention for tracing the latency of events such as
> softirq_entry/softirq_exit.
>
> Based on a patch from KOSAKI Motohiro.
>
> Signed-off-by: Anton Blanchard <[email protected]>


Ah. Kosaki Motohiro sent a very similar patch few monthes ago:

http://git.kernel.org/?p=linux/kernel/git/frederic/random-tracing.git;a=commit;h=5c11f399da166ff3bd8cf823add5fff7d036b67e

I haven't proposed it because of the debate about workqueue profiling at this
time.

But now I think this should make its way, as only the trace events are necessary
for the kernel part of such profiling.

Some comments below:

> ---
>
> Index: linux.trees.git/kernel/workqueue.c
> ===================================================================
> --- linux.trees.git.orig/kernel/workqueue.c 2009-09-14 09:43:00.000000000 +1000
> +++ linux.trees.git/kernel/workqueue.c 2009-09-14 09:45:45.000000000 +1000
> @@ -279,7 +279,6 @@ static void run_workqueue(struct cpu_wor
> */
> struct lockdep_map lockdep_map = work->lockdep_map;
> #endif
> - trace_workqueue_execution(cwq->thread, work);
> cwq->current_work = work;
> list_del_init(cwq->worklist.next);
> spin_unlock_irq(&cwq->lock);
> @@ -288,7 +287,9 @@ static void run_workqueue(struct cpu_wor
> work_clear_pending(work);
> lock_map_acquire(&cwq->wq->lockdep_map);
> lock_map_acquire(&lockdep_map);
> + trace_worklet_entry(cwq->thread, work);
> f(work);
> + trace_worklet_exit(cwq->thread, work);
> lock_map_release(&lockdep_map);
> lock_map_release(&cwq->wq->lockdep_map);
>
> Index: linux.trees.git/include/trace/events/workqueue.h
> ===================================================================
> --- linux.trees.git.orig/include/trace/events/workqueue.h 2009-09-14 09:45:41.000000000 +1000
> +++ linux.trees.git/include/trace/events/workqueue.h 2009-09-14 09:45:45.000000000 +1000
> @@ -30,7 +30,7 @@ TRACE_EVENT(workqueue_insertion,
> __entry->thread_pid, __entry->func)
> );
>
> -TRACE_EVENT(workqueue_execution,
> +TRACE_EVENT(worklet_entry,


In Kosaki's patch, it's named workqueue_handler_entry, but worklet_entry
looks sufficient and more concise.


>
> TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),
>
> @@ -52,6 +52,27 @@ TRACE_EVENT(workqueue_execution,
> __entry->thread_pid, __entry->func)
> );
>
> +/* Declare work as void *, because we can't use work->... in after f(work) */
> +TRACE_EVENT(worklet_exit,
> +
> + TP_PROTO(struct task_struct *wq_thread, void *work),
> +
> + TP_ARGS(wq_thread, work),
> +
> + TP_STRUCT__entry(
> + __array(char, thread_comm, TASK_COMM_LEN)
> + __field(pid_t, thread_pid)
> + ),
> +
> + TP_fast_assign(
> + memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
> + __entry->thread_pid = wq_thread->pid;
> + ),
> +
> + TP_printk("thread=%s:%d", __entry->thread_comm,
> + __entry->thread_pid)



In Kosaki's patch, we had the work struct address displayed too.
Your version is supposed to be sufficient because we know a workqueue
serializes its works. Then we know that an exit event will always follow
and match the previous entry event from the same workqueue thread.
The workqueue pid then provides a sufficient key for that.

That said, we should worry about possible lost events from
perf in some circumstances. And userspace profiling needs something
to ensure the accuracy about this entry/exit pair.

We could have:
entry work1
exit work 1 <--- lost event
entry work2 <--- lost event
exit work2

And then the pair would be misinterpreted.

(Although we could have even other misinterpretation
with other kind of scenarios, even if we have this work
address. But that's still more safety).

So I'd prefer to keep Kosaki's idea about these keys.
But I prefer your event naming.

May be I can unearth Kosaki's patch, change it with your naming
and add your Signed-off-by?

Kosaki, no problem about it?

Thanks,
Frederic.

2009-09-22 20:35:18

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH 1/2] tracing/workqueue: Use %pf in workqueue trace events

On Tue, Sep 22, 2009 at 12:40:33PM +1000, Anton Blanchard wrote:
>
> Using %pf instead of %pF supresses printing of the function offset
> which will always be 0.
>
> Signed-off-by: Anton Blanchard <[email protected]>


I'll queue this one, thanks!


> ---
>
> Index: linux.trees.git/include/trace/events/workqueue.h
> ===================================================================
> --- linux.trees.git.orig/include/trace/events/workqueue.h 2009-09-01 15:11:04.000000000 +1000
> +++ linux.trees.git/include/trace/events/workqueue.h 2009-09-14 09:45:41.000000000 +1000
> @@ -26,7 +26,7 @@ TRACE_EVENT(workqueue_insertion,
> __entry->func = work->func;
> ),
>
> - TP_printk("thread=%s:%d func=%pF", __entry->thread_comm,
> + TP_printk("thread=%s:%d func=%pf", __entry->thread_comm,
> __entry->thread_pid, __entry->func)
> );
>
> @@ -48,7 +48,7 @@ TRACE_EVENT(workqueue_execution,
> __entry->func = work->func;
> ),
>
> - TP_printk("thread=%s:%d func=%pF", __entry->thread_comm,
> + TP_printk("thread=%s:%d func=%pf", __entry->thread_comm,
> __entry->thread_pid, __entry->func)
> );
>

2009-09-23 09:04:57

by Anton Blanchard

[permalink] [raw]
Subject: [tip:tracing/urgent] tracing/workqueue: Use %pf in workqueue trace events

Commit-ID: 9961079348d43dddb1f21118c17f054f63bbaa05
Gitweb: http://git.kernel.org/tip/9961079348d43dddb1f21118c17f054f63bbaa05
Author: Anton Blanchard <[email protected]>
AuthorDate: Tue, 22 Sep 2009 12:40:33 +1000
Committer: Frederic Weisbecker <[email protected]>
CommitDate: Tue, 22 Sep 2009 23:14:58 +0200

tracing/workqueue: Use %pf in workqueue trace events

Using %pf instead of %pF supresses printing of the function offset
which will always be 0 in the case of worklet functions.

Signed-off-by: Anton Blanchard <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Li Zefan <[email protected]>
Cc: Zhaolei <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Steven Rostedt <[email protected]>
LKML-Reference: <20090922024033.GB31801@kryten>
Signed-off-by: Frederic Weisbecker <[email protected]>


---
include/trace/events/workqueue.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/workqueue.h b/include/trace/events/workqueue.h
index fcfd9a1..e4612db 100644
--- a/include/trace/events/workqueue.h
+++ b/include/trace/events/workqueue.h
@@ -26,7 +26,7 @@ TRACE_EVENT(workqueue_insertion,
__entry->func = work->func;
),

- TP_printk("thread=%s:%d func=%pF", __entry->thread_comm,
+ TP_printk("thread=%s:%d func=%pf", __entry->thread_comm,
__entry->thread_pid, __entry->func)
);

@@ -48,7 +48,7 @@ TRACE_EVENT(workqueue_execution,
__entry->func = work->func;
),

- TP_printk("thread=%s:%d func=%pF", __entry->thread_comm,
+ TP_printk("thread=%s:%d func=%pf", __entry->thread_comm,
__entry->thread_pid, __entry->func)
);

2009-09-24 00:22:41

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: [PATCH 2/2] tracing/workqueue: Rename workqueue_execute to worklet_entry and add worklet_exit

Hi

> In Kosaki's patch, we had the work struct address displayed too.
> Your version is supposed to be sufficient because we know a workqueue
> serializes its works. Then we know that an exit event will always follow
> and match the previous entry event from the same workqueue thread.
> The workqueue pid then provides a sufficient key for that.
>
> That said, we should worry about possible lost events from
> perf in some circumstances. And userspace profiling needs something
> to ensure the accuracy about this entry/exit pair.
>
> We could have:
> entry work1
> exit work 1 <--- lost event
> entry work2 <--- lost event
> exit work2
>
> And then the pair would be misinterpreted.
>
> (Although we could have even other misinterpretation
> with other kind of scenarios, even if we have this work
> address. But that's still more safety).
>
> So I'd prefer to keep Kosaki's idea about these keys.
> But I prefer your event naming.
>
> May be I can unearth Kosaki's patch, change it with your naming
> and add your Signed-off-by?
>
> Kosaki, no problem about it?

Sure.
Thank you paid attention my patch :-)