2010-08-21 20:12:52

by Arjan van de Ven

[permalink] [raw]
Subject: [PATCH] workqueue: Add basic tracepoints to track workqueue execution


From: Arjan van de Ven <[email protected]>
Date: Sat, 21 Aug 2010 13:07:26 -0700
Subject: [PATCH] workqueue: Add basic tracepoints to track workqueue execution

With the introduction of the new unified work queue thread pools,
we lost one feature: It's no longer possible to know which worker
is causing the CPU to wake out of idle. The result is that PowerTOP
now reports a lot of "kworker/a:b" instead of more readable results.

This patch adds a pair of tracepoints to the new workqueue code,
similar in style to the timer/hrtimer tracepoints.

With this pair of tracepoints, the next PowerTOP can correctly
report which work item caused the wakeup (and how long it took):

Interrupt (43) i915 time 3.51ms wakeups 141
Work ieee80211_iface_work time 0.81ms wakeups 29
Work do_dbs_timer time 0.55ms wakeups 24
Process Xorg time 21.36ms wakeups 4
Timer sched_rt_period_timer time 0.01ms wakeups 1

Signed-off-by: Arjan van de Ven <[email protected]>
---
include/trace/events/workqueue.h | 63 ++++++++++++++++++++++++++++++++++++++
kernel/workqueue.c | 9 +++++
2 files changed, 72 insertions(+), 0 deletions(-)
create mode 100644 include/trace/events/workqueue.h

diff --git a/include/trace/events/workqueue.h b/include/trace/events/workqueue.h
new file mode 100644
index 0000000..4fd2bd1
--- /dev/null
+++ b/include/trace/events/workqueue.h
@@ -0,0 +1,63 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM workqueue
+
+#if !defined(_TRACE_WORKQUEUE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_WORKQUEUE_H
+
+#include <linux/tracepoint.h>
+#include <linux/workqueue.h>
+
+/**
+ * workqueue_execute_start - called immediately before the workqueue callback
+ * @work: pointer to struct work_struct
+ *
+ * Allows to track workqueue execution.
+ */
+TRACE_EVENT(workqueue_execute_start,
+
+ TP_PROTO(struct work_struct *work),
+
+ TP_ARGS(work),
+
+ TP_STRUCT__entry(
+ __field( void *, work )
+ __field( void *, function)
+ ),
+
+ TP_fast_assign(
+ __entry->work = work;
+ __entry->function = work->func;
+ ),
+
+ TP_printk("work struct %p: function %pf", __entry->work, __entry->function)
+);
+
+/**
+ * workqueue_execute_end - called immediately before the workqueue callback
+ * @work: pointer to struct work_struct
+ *
+ * Allows to track workqueue execution.
+ */
+TRACE_EVENT(workqueue_execute_end,
+
+ TP_PROTO(struct work_struct *work),
+
+ TP_ARGS(work),
+
+ TP_STRUCT__entry(
+ __field( void *, work )
+ ),
+
+ TP_fast_assign(
+ __entry->work = work;
+ ),
+
+ TP_printk("work struct %p", __entry->work)
+);
+
+
+#endif /* _TRACE_WORKQUEUE_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
+
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 2994a0e..8bd600c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -35,6 +35,9 @@
#include <linux/lockdep.h>
#include <linux/idr.h>

+#define CREATE_TRACE_POINTS
+#include <trace/events/workqueue.h>
+
#include "workqueue_sched.h"

enum {
@@ -1790,7 +1793,13 @@ static void process_one_work(struct worker *worker, struct work_struct *work)
work_clear_pending(work);
lock_map_acquire(&cwq->wq->lockdep_map);
lock_map_acquire(&lockdep_map);
+ trace_workqueue_execute_start(work);
f(work);
+ /*
+ * While we must be careful to not use "work" after this, the trace
+ * point will only record its address.
+ */
+ trace_workqueue_execute_end(work);
lock_map_release(&lockdep_map);
lock_map_release(&cwq->wq->lockdep_map);

--
1.7.1.1



--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org


2010-08-22 03:22:00

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] workqueue: Add basic tracepoints to track workqueue execution

On Sat, Aug 21, 2010 at 01:14:02PM -0700, Arjan van de Ven wrote:
>
> From: Arjan van de Ven <[email protected]>
> Date: Sat, 21 Aug 2010 13:07:26 -0700
> Subject: [PATCH] workqueue: Add basic tracepoints to track workqueue execution
>
> With the introduction of the new unified work queue thread pools,
> we lost one feature: It's no longer possible to know which worker
> is causing the CPU to wake out of idle. The result is that PowerTOP
> now reports a lot of "kworker/a:b" instead of more readable results.
>
> This patch adds a pair of tracepoints to the new workqueue code,
> similar in style to the timer/hrtimer tracepoints.
>
> With this pair of tracepoints, the next PowerTOP can correctly
> report which work item caused the wakeup (and how long it took):
>
> Interrupt (43) i915 time 3.51ms wakeups 141
> Work ieee80211_iface_work time 0.81ms wakeups 29
> Work do_dbs_timer time 0.55ms wakeups 24
> Process Xorg time 21.36ms wakeups 4
> Timer sched_rt_period_timer time 0.01ms wakeups 1
>
> Signed-off-by: Arjan van de Ven <[email protected]>



Acked-by: Frederic Weisbecker <[email protected]>


If that broke a userspace tool, should that perhaps go to .36 ?
I know it's a big patch, but in fact it's more a semi-revert of
the workqueue tracepoints removal.

It's my bad in fact: I suggested Tejun not to bother with workqueue
tracepoints yet and to drop the previous ones for now. The previous ones were
placed to observe problems specific to the purely serialized design they
had. I thought cmwq would need other tracepoints.

In fact I thought no tools were using these tracepoints. Bad assumption...

2010-08-22 04:01:07

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] workqueue: Add basic tracepoints to track workqueue execution

On Sat, Aug 21, 2010 at 01:14:02PM -0700, Arjan van de Ven wrote:
>
> From: Arjan van de Ven <[email protected]>
> Date: Sat, 21 Aug 2010 13:07:26 -0700
> Subject: [PATCH] workqueue: Add basic tracepoints to track workqueue execution
>
> With the introduction of the new unified work queue thread pools,
> we lost one feature: It's no longer possible to know which worker
> is causing the CPU to wake out of idle. The result is that PowerTOP
> now reports a lot of "kworker/a:b" instead of more readable results.
>
> This patch adds a pair of tracepoints to the new workqueue code,
> similar in style to the timer/hrtimer tracepoints.
>
> With this pair of tracepoints, the next PowerTOP can correctly
> report which work item caused the wakeup (and how long it took):
>
> Interrupt (43) i915 time 3.51ms wakeups 141
> Work ieee80211_iface_work time 0.81ms wakeups 29
> Work do_dbs_timer time 0.55ms wakeups 24
> Process Xorg time 21.36ms wakeups 4
> Timer sched_rt_period_timer time 0.01ms wakeups 1
>
> Signed-off-by: Arjan van de Ven <[email protected]>



Ok, Linus applied it. That answers my doubts about the
route for this patch :)

2010-08-23 09:51:32

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] workqueue: Add basic tracepoints to track workqueue execution

Hello,

On 08/22/2010 06:01 AM, Frederic Weisbecker wrote:
> On Sat, Aug 21, 2010 at 01:14:02PM -0700, Arjan van de Ven wrote:
>>
>> From: Arjan van de Ven <[email protected]>
>> Date: Sat, 21 Aug 2010 13:07:26 -0700
>> Subject: [PATCH] workqueue: Add basic tracepoints to track workqueue execution
>>
>> With the introduction of the new unified work queue thread pools,
>> we lost one feature: It's no longer possible to know which worker
>> is causing the CPU to wake out of idle. The result is that PowerTOP
>> now reports a lot of "kworker/a:b" instead of more readable results.
>>
>> This patch adds a pair of tracepoints to the new workqueue code,
>> similar in style to the timer/hrtimer tracepoints.
>>
>> With this pair of tracepoints, the next PowerTOP can correctly
>> report which work item caused the wakeup (and how long it took):
>>
>> Interrupt (43) i915 time 3.51ms wakeups 141
>> Work ieee80211_iface_work time 0.81ms wakeups 29
>> Work do_dbs_timer time 0.55ms wakeups 24
>> Process Xorg time 21.36ms wakeups 4
>> Timer sched_rt_period_timer time 0.01ms wakeups 1
>>
>> Signed-off-by: Arjan van de Ven <[email protected]>
>
> Ok, Linus applied it. That answers my doubts about the
> route for this patch :)

I was planning to work on wq tracing in this devel cycle but yeah
Arjan's patch seems like a good starting place.

We definitely need more tracing points tho. Also, David was wondering
whether it would be possible to have simple file based interface to
monitor the current status of workers. That's doable w/ tracing API,
right?

Thanks.

--
tejun

2010-08-26 14:36:34

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH] workqueue: Add basic tracepoints to track workqueue execution

On Sat, Aug 21, 2010 at 01:14:02PM -0700, Arjan van de Ven wrote:
> With the introduction of the new unified work queue thread pools,
> we lost one feature: It's no longer possible to know which worker
> is causing the CPU to wake out of idle. The result is that PowerTOP
> now reports a lot of "kworker/a:b" instead of more readable results.
>
> This patch adds a pair of tracepoints to the new workqueue code,
> similar in style to the timer/hrtimer tracepoints.
>

Hi,

it might also be nice to add a workqueue chapter to the traceopint docbook.
Since you've already included kernel-doc style comments, the below is all that
we need.

current book is at: http://www.kernel.org/doc/htmldocs/tracepoint/

thanks,

-Jason


Add a workqueue chapter to the tracepoint docbook

Signed-off-by: Jason Baron <[email protected]>

---
Documentation/DocBook/tracepoint.tmpl | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/Documentation/DocBook/tracepoint.tmpl b/Documentation/DocBook/tracepoint.tmpl
index e8473ea..b57a9ed 100644
--- a/Documentation/DocBook/tracepoint.tmpl
+++ b/Documentation/DocBook/tracepoint.tmpl
@@ -104,4 +104,9 @@
<title>Block IO</title>
!Iinclude/trace/events/block.h
</chapter>
+
+ <chapter id="workqueue">
+ <title>Workqueue</title>
+!Iinclude/trace/events/workqueue.h
+ </chapter>
</book>
--
1.6.0.6

2010-08-27 10:35:50

by Tejun Heo

[permalink] [raw]
Subject: [PATCH] workqueue: Add a workqueue chapter to the tracepoint docbook

From: Jason Baron <[email protected]>

Add a workqueue chapter to the traceopint docbook.

Current book is at: http://www.kernel.org/doc/htmldocs/tracepoint/

Signed-off-by: Jason Baron <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
---
Applied w/ patch description update. Thanks.

Documentation/DocBook/tracepoint.tmpl | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/Documentation/DocBook/tracepoint.tmpl b/Documentation/DocBook/tracepoint.tmpl
index e8473ea..b57a9ed 100644
--- a/Documentation/DocBook/tracepoint.tmpl
+++ b/Documentation/DocBook/tracepoint.tmpl
@@ -104,4 +104,9 @@
<title>Block IO</title>
!Iinclude/trace/events/block.h
</chapter>
+
+ <chapter id="workqueue">
+ <title>Workqueue</title>
+!Iinclude/trace/events/workqueue.h
+ </chapter>
</book>
--
1.7.1

2010-08-27 13:14:16

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] workqueue: Add basic tracepoints to track workqueue execution

On Mon, Aug 23, 2010 at 11:46:18AM +0200, Tejun Heo wrote:
> Hello,
>
> On 08/22/2010 06:01 AM, Frederic Weisbecker wrote:
> > On Sat, Aug 21, 2010 at 01:14:02PM -0700, Arjan van de Ven wrote:
> >>
> >> From: Arjan van de Ven <[email protected]>
> >> Date: Sat, 21 Aug 2010 13:07:26 -0700
> >> Subject: [PATCH] workqueue: Add basic tracepoints to track workqueue execution
> >>
> >> With the introduction of the new unified work queue thread pools,
> >> we lost one feature: It's no longer possible to know which worker
> >> is causing the CPU to wake out of idle. The result is that PowerTOP
> >> now reports a lot of "kworker/a:b" instead of more readable results.
> >>
> >> This patch adds a pair of tracepoints to the new workqueue code,
> >> similar in style to the timer/hrtimer tracepoints.
> >>
> >> With this pair of tracepoints, the next PowerTOP can correctly
> >> report which work item caused the wakeup (and how long it took):
> >>
> >> Interrupt (43) i915 time 3.51ms wakeups 141
> >> Work ieee80211_iface_work time 0.81ms wakeups 29
> >> Work do_dbs_timer time 0.55ms wakeups 24
> >> Process Xorg time 21.36ms wakeups 4
> >> Timer sched_rt_period_timer time 0.01ms wakeups 1
> >>
> >> Signed-off-by: Arjan van de Ven <[email protected]>
> >
> > Ok, Linus applied it. That answers my doubts about the
> > route for this patch :)
>
> I was planning to work on wq tracing in this devel cycle but yeah
> Arjan's patch seems like a good starting place.
>
> We definitely need more tracing points tho. Also, David was wondering
> whether it would be possible to have simple file based interface to
> monitor the current status of workers. That's doable w/ tracing API,
> right?



Yeah but the ftrace API that handles statistics is a very basic thing.
It works well for measuring entities that won't be freed, like
for branch and function profiling for example.
But once you deal with entities that can disappear (like workqueue threads),
you need to handle all the locking by yourself.

You can have a look at kernel/trace/trace_workqueue.c. It's broken
as we have removed some workqueue tracepoints, so I planned to
remove it. It's a basic thing but perhaps you can restart from there.

I think there was also a race between reader and writer. A stat node
probably need to be read under rcu (and freed after a grace period).