2013-04-22 21:08:33

by Frederic Weisbecker

[permalink] [raw]
Subject: [GIT PULL] nohz: Help for testing

Ingo,

Please pull the testing enhancements for full dynticks from:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
timers/nohz-help-testing

HEAD: cb41a29076e9f95547da46578d5c8804f7b8845d

This is on top of my previous pull request (the timers/nohz branch) and
it is deemed to ease testing and help to diagnose failures:

* First patch enables CONFIG_RCU_NOCB_ALL when we have full dynticks to
ease automatic testing of the feature. This should also further
help simplifying the code. If users want more granularity in the future
to set RCU nocbs and timekeeper masks, we can still step back and
select CONFIG_RCU_NOCB_ALL only when we have CONFIG_NO_HZ_FULL_ALL.
Note this will only have an effect on tip/master because that config
is in core/rcu and not yet upstream.

* Second patch adds some tracepoints to dump the dynticks behaviour as
I got many questions from user who don't know how to diagnose why the
tick didn't stop on their settings.

Thanks.

---

Frederic Weisbecker (2):
nohz: Select wide RCU nocb for full dynticks
nohz: Add basic tracing

include/trace/events/timer.h | 21 +++++++++++++++++++++
kernel/time/Kconfig | 1 +
kernel/time/tick-sched.c | 19 +++++++++++++++----
3 files changed, 37 insertions(+), 4 deletions(-)

--
1.7.5.4


2013-04-22 21:08:37

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 1/2] nohz: Select wide RCU nocb for full dynticks

It makes testing and implementation much easier as we
know in advance that all CPUs are RCU nocbs.

Also this prepares to remove the dynamic check for
nohz_full= boot mask to be a subset of rcu_nocbs=

Eventually this should also help removing the requirement
for the boot CPU to be outside the full dynticks range.

Suggested-by: Christoph Lameter <[email protected]>
Suggested-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Li Zhong <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Paul Gortmaker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
---
kernel/time/Kconfig | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index f6a792a..e1ac129 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -110,6 +110,7 @@ config NO_HZ_FULL
select NO_HZ_COMMON
select RCU_USER_QS
select RCU_NOCB_CPU
+ select RCU_NOCB_CPU_ALL
select CONTEXT_TRACKING_FORCE
select IRQ_WORK
help
--
1.7.5.4

2013-04-22 21:08:51

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 2/2] nohz: Add basic tracing

It's not obvious to find out why the full dynticks subsystem
doesn't always stop the tick: whether this is due to kthreads,
posix timers, perf events, etc...

These new tracepoints are here to help the user diagnose
the failures and test this feature.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Li Zhong <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Paul Gortmaker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
---
include/trace/events/timer.h | 21 +++++++++++++++++++++
kernel/time/tick-sched.c | 19 +++++++++++++++----
2 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h
index 425bcfe..f5eb53e 100644
--- a/include/trace/events/timer.h
+++ b/include/trace/events/timer.h
@@ -323,6 +323,27 @@ TRACE_EVENT(itimer_expire,
(int) __entry->pid, (unsigned long long)__entry->now)
);

+#ifdef CONFIG_NO_HZ_FULL
+TRACE_EVENT(tick_stop,
+
+ TP_PROTO(int success, char *error_msg),
+
+ TP_ARGS(success, error_msg),
+
+ TP_STRUCT__entry(
+ __field( int , success )
+ __string( msg, error_msg )
+ ),
+
+ TP_fast_assign(
+ __entry->success = success;
+ __assign_str(msg, error_msg);
+ ),
+
+ TP_printk("success=%s msg=%s", __entry->success ? "yes" : "no", __get_str(msg))
+);
+#endif
+
#endif /* _TRACE_TIMER_H */

/* This part must be outside protection */
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 12a900d..85e05ab 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -28,6 +28,8 @@

#include "tick-internal.h"

+#include <trace/events/timer.h>
+
/*
* Per cpu nohz control structure
*/
@@ -153,14 +155,20 @@ static bool can_stop_full_tick(void)
{
WARN_ON_ONCE(!irqs_disabled());

- if (!sched_can_stop_tick())
+ if (!sched_can_stop_tick()) {
+ trace_tick_stop(0, "more than 1 task in runqueue\n");
return false;
+ }

- if (!posix_cpu_timers_can_stop_tick(current))
+ if (!posix_cpu_timers_can_stop_tick(current)) {
+ trace_tick_stop(0, "posix timers running\n");
return false;
+ }

- if (!perf_event_can_stop_tick())
+ if (!perf_event_can_stop_tick()) {
+ trace_tick_stop(0, "perf events running\n");
return false;
+ }

/* sched_clock_tick() needs us? */
#ifdef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
@@ -168,8 +176,10 @@ static bool can_stop_full_tick(void)
* TODO: kick full dynticks CPUs when
* sched_clock_stable is set.
*/
- if (!sched_clock_stable)
+ if (!sched_clock_stable) {
+ trace_tick_stop(0, "unstable sched clock\n");
return false;
+ }
#endif

return true;
@@ -631,6 +641,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,

ts->last_tick = hrtimer_get_expires(&ts->sched_timer);
ts->tick_stopped = 1;
+ trace_tick_stop(1, " ");
}

/*
--
1.7.5.4

Subject: Re: [PATCH 2/2] nohz: Add basic tracing


On Mon, 22 Apr 2013, Frederic Weisbecker wrote:

> It's not obvious to find out why the full dynticks subsystem
> doesn't always stop the tick: whether this is due to kthreads,
> posix timers, perf events, etc...
>
> These new tracepoints are here to help the user diagnose
> the failures and test this feature.

Very good. This will help a lot.

2013-04-23 20:00:29

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 2/2] nohz: Add basic tracing

On Tue, 2013-04-23 at 18:12 +0000, Christoph Lameter wrote:
> On Mon, 22 Apr 2013, Frederic Weisbecker wrote:
>
> > It's not obvious to find out why the full dynticks subsystem
> > doesn't always stop the tick: whether this is due to kthreads,
> > posix timers, perf events, etc...
> >
> > These new tracepoints are here to help the user diagnose
> > the failures and test this feature.
>
> Very good. This will help a lot.

You can also do:

cd /sys/kernel/debug/tracing
echo 1 > max_graph_depth
echo function_graph > current_tracer

And then run your code, and look to see what happens on the cpu in
question:

cat per_cpu/cpuX/trace

The "max_graph_depth" of one will make the function graph tracer just
trace the first function that enters the kernel. You'll be able to see
if the kernel did anything to your userspace application that wasn't
planned.

"max_graph_depth" was added in 3.9-rc1

-- Steve

2013-04-23 21:02:49

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 2/2] nohz: Add basic tracing

On Tue, Apr 23, 2013 at 04:00:23PM -0400, Steven Rostedt wrote:
> On Tue, 2013-04-23 at 18:12 +0000, Christoph Lameter wrote:
> > On Mon, 22 Apr 2013, Frederic Weisbecker wrote:
> >
> > > It's not obvious to find out why the full dynticks subsystem
> > > doesn't always stop the tick: whether this is due to kthreads,
> > > posix timers, perf events, etc...
> > >
> > > These new tracepoints are here to help the user diagnose
> > > the failures and test this feature.
> >
> > Very good. This will help a lot.
>
> You can also do:
>
> cd /sys/kernel/debug/tracing
> echo 1 > max_graph_depth
> echo function_graph > current_tracer
>
> And then run your code, and look to see what happens on the cpu in
> question:
>
> cat per_cpu/cpuX/trace
>
> The "max_graph_depth" of one will make the function graph tracer just
> trace the first function that enters the kernel. You'll be able to see
> if the kernel did anything to your userspace application that wasn't
> planned.
>
> "max_graph_depth" was added in 3.9-rc1

Very cool! I have added this to the documentation.

Thanx, Paul