2011-05-16 23:35:50

by Mandeep Singh Baines

[permalink] [raw]
Subject: [PATCH 1/4] watchdog: fix rounding issues in get_sample_period()

In get_sample_period(), softlockup_thresh is integer divided by 5 before
the multiplication by NSEC_PER_SEC. This results in softlockup_thresh
being rounded down to the nearest integer multiple of 5.

For example, a softlockup_thresh of 4 rounds down to 0.

Signed-off-by: Mandeep Singh Baines <[email protected]>
Cc: Marcin Slusarz <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
kernel/watchdog.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 14733d4..a06972d 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -110,7 +110,7 @@ static unsigned long get_sample_period(void)
* increment before the hardlockup detector generates
* a warning
*/
- return softlockup_thresh / 5 * NSEC_PER_SEC;
+ return softlockup_thresh * (NSEC_PER_SEC / 5);
}

/* Commands for resetting the watchdog */
--
1.7.3.1


2011-05-16 23:36:00

by Mandeep Singh Baines

[permalink] [raw]
Subject: [PATCH 2/4] watchdog: only disable/enable watchdog if neccessary

Don't take any action on an unsuccessful write to /proc.

Signed-off-by: Mandeep Singh Baines <[email protected]>
Cc: Marcin Slusarz <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
kernel/watchdog.c | 20 ++++++++++++--------
1 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index a06972d..cf0e09f 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -507,15 +507,19 @@ static void watchdog_disable_all_cpus(void)
int proc_dowatchdog_enabled(struct ctl_table *table, int write,
void __user *buffer, size_t *length, loff_t *ppos)
{
- proc_dointvec(table, write, buffer, length, ppos);
+ int ret;

- if (write) {
- if (watchdog_enabled)
- watchdog_enable_all_cpus();
- else
- watchdog_disable_all_cpus();
- }
- return 0;
+ ret = proc_dointvec(table, write, buffer, length, ppos);
+ if (ret || !write)
+ goto out;
+
+ if (watchdog_enabled)
+ watchdog_enable_all_cpus();
+ else
+ watchdog_disable_all_cpus();
+
+out:
+ return ret;
}

int proc_dowatchdog_thresh(struct ctl_table *table, int write,
--
1.7.3.1

2011-05-16 23:36:21

by Mandeep Singh Baines

[permalink] [raw]
Subject: [PATCH 3/4] watchdog: disable watchdog when thresh is zero

This restores the previous behavior of softlock_thresh.

Currently, setting watchdog_thresh to zero causes the watchdog
kthreads to consume a lot of CPU.

Signed-off-by: Mandeep Singh Baines <[email protected]>
Cc: Marcin Slusarz <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
kernel/watchdog.c | 15 ++++++++++++++-
1 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index cf0e09f..50c2f62 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -526,7 +526,20 @@ int proc_dowatchdog_thresh(struct ctl_table *table, int write,
void __user *buffer,
size_t *lenp, loff_t *ppos)
{
- return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+ int ret;
+
+ ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+ if (ret || !write)
+ goto out;
+
+ if (softlockup_thresh)
+ watchdog_enable_all_cpus();
+ else
+ watchdog_disable_all_cpus();
+
+out:
+ return ret;
+
}
#endif /* CONFIG_SYSCTL */

--
1.7.3.1

2011-05-16 23:36:06

by Mandeep Singh Baines

[permalink] [raw]
Subject: [PATCH 4/4] watchdog: configure nmi watchdog period based on watchdog_thresh

Before the conversion of the NMI watchdog to perf event, the watchdog
timeout was 5 seconds. Now it is 60 seconds. For my particular application,
netbooks, 5 seconds was a better timeout. With a short timeout, we
catch faults earlier and are able to send back a panic. With a 60 second
timeout, the user is unlikely to wait and will instead hit the power
button, causing us to lose the panic info.

This change configures the NMI period based on the watchdog_thresh.

Signed-off-by: Mandeep Singh Baines <[email protected]>
Cc: Marcin Slusarz <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
arch/x86/kernel/apic/hw_nmi.c | 4 ++--
include/linux/nmi.h | 2 +-
kernel/watchdog.c | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 5260fe9..d5e57db0 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -19,9 +19,9 @@
#include <linux/delay.h>

#ifdef CONFIG_HARDLOCKUP_DETECTOR
-u64 hw_nmi_get_sample_period(void)
+u64 hw_nmi_get_sample_period(int watchdog_thresh)
{
- return (u64)(cpu_khz) * 1000 * 60;
+ return (u64)(cpu_khz) * 1000 * watchdog_thresh;
}
#endif

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index c536f85..c20c55a 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -45,7 +45,7 @@ static inline bool trigger_all_cpu_backtrace(void)

#ifdef CONFIG_LOCKUP_DETECTOR
int hw_nmi_is_cpu_stuck(struct pt_regs *);
-u64 hw_nmi_get_sample_period(void);
+u64 hw_nmi_get_sample_period(int watchdog_thresh);
extern int watchdog_enabled;
struct ctl_table;
extern int proc_dowatchdog_enabled(struct ctl_table *, int ,
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 50c2f62..169e1e2 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -359,7 +359,7 @@ static int watchdog_nmi_enable(int cpu)

/* Try to register using hardware perf events */
wd_attr = &wd_hw_attr;
- wd_attr->sample_period = hw_nmi_get_sample_period();
+ wd_attr->sample_period = hw_nmi_get_sample_period(softlockup_thresh);
event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback);
if (!IS_ERR(event)) {
printk(KERN_INFO "NMI watchdog enabled, takes one hw-pmu counter.\n");
--
1.7.3.1

2011-05-17 07:10:31

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 3/4] watchdog: disable watchdog when thresh is zero


* Mandeep Singh Baines <[email protected]> wrote:

> This restores the previous behavior of softlock_thresh.
>
> Currently, setting watchdog_thresh to zero causes the watchdog
> kthreads to consume a lot of CPU.
>
> Signed-off-by: Mandeep Singh Baines <[email protected]>
> Cc: Marcin Slusarz <[email protected]>
> Cc: Don Zickus <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Frederic Weisbecker <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> ---
> kernel/watchdog.c | 15 ++++++++++++++-
> 1 files changed, 14 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index cf0e09f..50c2f62 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -526,7 +526,20 @@ int proc_dowatchdog_thresh(struct ctl_table *table, int write,
> void __user *buffer,
> size_t *lenp, loff_t *ppos)
> {
> - return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> + int ret;
> +
> + ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> + if (ret || !write)
> + goto out;
> +
> + if (softlockup_thresh)
> + watchdog_enable_all_cpus();
> + else
> + watchdog_disable_all_cpus();

This now adds two similar looking blocks of these 4 lines, one in
proc_dowatchdog_enabled(), one in proc_dowatchdog_thresh().

They are not the same though. So what happens if the watchdog is disabled but
the threshold is updated to nonzero - do we enable the watchdog?

Thanks,

Ingo

2011-05-17 07:16:55

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 4/4] watchdog: configure nmi watchdog period based on watchdog_thresh


* Mandeep Singh Baines <[email protected]> wrote:

> Before the conversion of the NMI watchdog to perf event, the watchdog
> timeout was 5 seconds. Now it is 60 seconds. For my particular application,
> netbooks, 5 seconds was a better timeout. With a short timeout, we
> catch faults earlier and are able to send back a panic. With a 60 second
> timeout, the user is unlikely to wait and will instead hit the power
> button, causing us to lose the panic info.

That's an interesting observation. Have you been able to measure/observe this
effect somehow, or do you presume that users find 60 seconds too long?

This would be a concern for upstream as well i guess.

> This change configures the NMI period based on the watchdog_thresh.

Hm, our tolerance for the two thresholds is not just human but technical: hard
lockup warnings should indeed be triggered after just a few seconds, soft
lockups can have false positives under extreme conditions.

So we generally want a higher threshold for soft lockups than for hard lockups.

So how about we couple the thresholds with a factor: we make the soft threshold
twice the amount of time the hard threshold is? Then we could change the
upstream default as well i think: lets change the NMI timeout to 10 seconds
(and thus have the soft threshold at 20 seconds). Is 20 seconds short enough
for most users to not hit reset?

We might want to change another aspect of the NMI watchdog: right now it tries
to abort the offending task - which is really nasty if there was a spuriously
long irqs-off section somewhere in the kernel. How about we just print a
warning instead?

Thanks,

Ingo

2011-05-17 14:04:26

by Don Zickus

[permalink] [raw]
Subject: Re: [PATCH 4/4] watchdog: configure nmi watchdog period based on watchdog_thresh

On Tue, May 17, 2011 at 09:16:42AM +0200, Ingo Molnar wrote:
> Hm, our tolerance for the two thresholds is not just human but technical: hard
> lockup warnings should indeed be triggered after just a few seconds, soft
> lockups can have false positives under extreme conditions.
>
> So we generally want a higher threshold for soft lockups than for hard lockups.
>
> So how about we couple the thresholds with a factor: we make the soft threshold
> twice the amount of time the hard threshold is? Then we could change the
> upstream default as well i think: lets change the NMI timeout to 10 seconds
> (and thus have the soft threshold at 20 seconds). Is 20 seconds short enough
> for most users to not hit reset?

Making softlockup twice as long as hardlockup seems to make sense.
Setting the hardlockup to 10 seconds can be ok, but then you get into
power savings issues. For example, I have the timers setup to trigger 5
times a period (I know it probably should be 2 times), so at 10 seconds
that means the timers are firing every 2 seconds. That shows up on
powertop :-(. Though I was flirting with the idea of trying to slow down
or stop the timer when the cpu goes into deeper c-states. But that is a
different problem.

>
> We might want to change another aspect of the NMI watchdog: right now it tries
> to abort the offending task - which is really nasty if there was a spuriously
> long irqs-off section somewhere in the kernel. How about we just print a
> warning instead?

I dont understand this. IIRC NMI watchdog will either printk or panic on
a hardlockup. What do you mean by 'aborting' the task?

Cheers,
Don

>
> Thanks,
>
> Ingo

2011-05-17 18:47:54

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 4/4] watchdog: configure nmi watchdog period based on watchdog_thresh


* Don Zickus <[email protected]> wrote:

> On Tue, May 17, 2011 at 09:16:42AM +0200, Ingo Molnar wrote:
> > Hm, our tolerance for the two thresholds is not just human but technical: hard
> > lockup warnings should indeed be triggered after just a few seconds, soft
> > lockups can have false positives under extreme conditions.
> >
> > So we generally want a higher threshold for soft lockups than for hard lockups.
> >
> > So how about we couple the thresholds with a factor: we make the soft threshold
> > twice the amount of time the hard threshold is? Then we could change the
> > upstream default as well i think: lets change the NMI timeout to 10 seconds
> > (and thus have the soft threshold at 20 seconds). Is 20 seconds short enough
> > for most users to not hit reset?
>
> Making softlockup twice as long as hardlockup seems to make sense.
> Setting the hardlockup to 10 seconds can be ok, but then you get into
> power savings issues. For example, I have the timers setup to trigger 5
> times a period (I know it probably should be 2 times), so at 10 seconds
> that means the timers are firing every 2 seconds. That shows up on
> powertop :-(. Though I was flirting with the idea of trying to slow down
> or stop the timer when the cpu goes into deeper c-states. But that is a
> different problem.
>
> >
> > We might want to change another aspect of the NMI watchdog: right now it tries
> > to abort the offending task - which is really nasty if there was a spuriously
> > long irqs-off section somewhere in the kernel. How about we just print a
> > warning instead?
>
> I dont understand this. IIRC NMI watchdog will either printk or panic on
> a hardlockup. What do you mean by 'aborting' the task?

Oh, simple dementia on my side. We used to attempt a do_exit() in some long
gone version of that code :-)

Thanks,

Ingo

2011-05-18 03:46:47

by Mandeep Singh Baines

[permalink] [raw]
Subject: [PATCH 3/4 v2] watchdog: disable watchdog when thresh is zero

Ingo Molnar ([email protected]) wrote:
> This now adds two similar looking blocks of these 4 lines, one in
> proc_dowatchdog_enabled(), one in proc_dowatchdog_thresh().
>
> They are not the same though. So what happens if the watchdog is disabled but
> the threshold is updated to nonzero - do we enable the watchdog?
>

Good point. Fixed in this patch (v2). Also modified the patch to
disable the watchdog if thresh == 0 or enabled == 0.

I used a #define for proc_dowatchdog instead of an inline to avoid
duplicating the large parameter list.

---

This restores the previous behavior of softlock_thresh.

Currently, setting watchdog_thresh to zero causes the watchdog
kthreads to consume a lot of CPU.

In addition, the logic of proc_dowatchdog_thresh and proc_dowatchdog_enabled
has been factored into __proc_dowatchdog.

Signed-off-by: Mandeep Singh Baines <[email protected]>
LKML-Reference: <[email protected]>
Cc: Marcin Slusarz <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
include/linux/nmi.h | 7 +++++--
include/linux/sched.h | 1 -
kernel/sysctl.c | 4 +++-
kernel/watchdog.c | 25 +++++++++----------------
4 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index c536f85..10cbca7 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -47,9 +47,12 @@ static inline bool trigger_all_cpu_backtrace(void)
int hw_nmi_is_cpu_stuck(struct pt_regs *);
u64 hw_nmi_get_sample_period(void);
extern int watchdog_enabled;
+extern int watchdog_thresh;
struct ctl_table;
-extern int proc_dowatchdog_enabled(struct ctl_table *, int ,
- void __user *, size_t *, loff_t *);
+extern int __proc_dowatchdog(struct ctl_table *, int ,
+ void __user *, size_t *, loff_t *);
+#define proc_dowatchdog_enabled __proc_dowatchdog
+#define proc_dowatchdog_thresh __proc_dowatchdog
#endif

#endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 12211e1..d8b2d0b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -315,7 +315,6 @@ extern int proc_dowatchdog_thresh(struct ctl_table *table, int write,
void __user *buffer,
size_t *lenp, loff_t *ppos);
extern unsigned int softlockup_panic;
-extern int softlockup_thresh;
void lockup_detector_init(void);
#else
static inline void touch_softlockup_watchdog(void)
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index c0bb324..acb12f4 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -731,10 +731,12 @@ static struct ctl_table kern_table[] = {
.maxlen = sizeof (int),
.mode = 0644,
.proc_handler = proc_dowatchdog_enabled,
+ .extra1 = &zero,
+ .extra2 = &one,
},
{
.procname = "watchdog_thresh",
- .data = &softlockup_thresh,
+ .data = &watchdog_thresh,
.maxlen = sizeof(int),
.mode = 0644,
.proc_handler = proc_dowatchdog_thresh,
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index cf0e09f..ea3dfc2 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -28,7 +28,7 @@
#include <linux/perf_event.h>

int watchdog_enabled = 1;
-int __read_mostly softlockup_thresh = 60;
+int __read_mostly watchdog_thresh = 60;

static DEFINE_PER_CPU(unsigned long, watchdog_touch_ts);
static DEFINE_PER_CPU(struct task_struct *, softlockup_watchdog);
@@ -105,12 +105,12 @@ static unsigned long get_timestamp(int this_cpu)
static unsigned long get_sample_period(void)
{
/*
- * convert softlockup_thresh from seconds to ns
+ * convert watchdog_thresh from seconds to ns
* the divide by 5 is to give hrtimer 5 chances to
* increment before the hardlockup detector generates
* a warning
*/
- return softlockup_thresh * (NSEC_PER_SEC / 5);
+ return watchdog_thresh * (NSEC_PER_SEC / 5);
}

/* Commands for resetting the watchdog */
@@ -182,7 +182,7 @@ static int is_softlockup(unsigned long touch_ts)
unsigned long now = get_timestamp(smp_processor_id());

/* Warn about unreasonable delays: */
- if (time_after(now, touch_ts + softlockup_thresh))
+ if (time_after(now, touch_ts + watchdog_thresh))
return now - touch_ts;

return 0;
@@ -501,19 +501,19 @@ static void watchdog_disable_all_cpus(void)
/* sysctl functions */
#ifdef CONFIG_SYSCTL
/*
- * proc handler for /proc/sys/kernel/nmi_watchdog
+ * proc handler for /proc/sys/kernel/nmi_watchdog,watchdog_thresh
*/

-int proc_dowatchdog_enabled(struct ctl_table *table, int write,
- void __user *buffer, size_t *length, loff_t *ppos)
+int __proc_dowatchdog(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
{
int ret;

- ret = proc_dointvec(table, write, buffer, length, ppos);
+ ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
if (ret || !write)
goto out;

- if (watchdog_enabled)
+ if (watchdog_enabled && watchdog_thresh)
watchdog_enable_all_cpus();
else
watchdog_disable_all_cpus();
@@ -521,13 +521,6 @@ int proc_dowatchdog_enabled(struct ctl_table *table, int write,
out:
return ret;
}
-
-int proc_dowatchdog_thresh(struct ctl_table *table, int write,
- void __user *buffer,
- size_t *lenp, loff_t *ppos)
-{
- return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
-}
#endif /* CONFIG_SYSCTL */


--
1.7.3.1

2011-05-18 03:46:55

by Mandeep Singh Baines

[permalink] [raw]
Subject: [PATCH 4/4 v2] watchdog: configure nmi watchdog period based on watchdog_thresh

Ingo Molnar ([email protected]) wrote:
>
> * Mandeep Singh Baines <[email protected]> wrote:
>
> > Before the conversion of the NMI watchdog to perf event, the watchdog
> > timeout was 5 seconds. Now it is 60 seconds. For my particular application,
> > netbooks, 5 seconds was a better timeout. With a short timeout, we
> > catch faults earlier and are able to send back a panic. With a 60 second
> > timeout, the user is unlikely to wait and will instead hit the power
> > button, causing us to lose the panic info.
>
> That's an interesting observation. Have you been able to measure/observe this
> effect somehow, or do you presume that users find 60 seconds too long?
>

Mostly intuition. There is a threshold beyond which the user will hit
the power button. Not sure if its 20 seconds or 20 minutes. My feeling
was that the 1 minute was too long.

For a user experience perspective, a quick reboot also seems like a better
experience than a one minute hang. Our systems boot in 8 seconds and restore
the previous session so a reboot is almost not noticable.

> This would be a concern for upstream as well i guess.
>
> > This change configures the NMI period based on the watchdog_thresh.
>
> Hm, our tolerance for the two thresholds is not just human but technical: hard
> lockup warnings should indeed be triggered after just a few seconds, soft
> lockups can have false positives under extreme conditions.
>
> So we generally want a higher threshold for soft lockups than for hard lockups.
>
> So how about we couple the thresholds with a factor: we make the soft threshold
> twice the amount of time the hard threshold is? Then we could change the
> upstream default as well i think: lets change the NMI timeout to 10 seconds
> (and thus have the soft threshold at 20 seconds). Is 20 seconds short enough
> for most users to not hit reset?
>

Agree. Implemented in this version of the patch (v2).

---

Before the conversion of the NMI watchdog to perf event, the watchdog
timeout was 5 seconds. Now it is 60 seconds. For my particular application,
netbooks, 5 seconds was a better timeout. With a short timeout, we
catch faults earlier and are able to send back a panic. With a 60 second
timeout, the user is unlikely to wait and will instead hit the power
button, causing us to lose the panic info.

This change configures the NMI period to watchdog_thresh and sets
the softlockup_thresh to watchdog_thresh * 2. In addition, watchdog_thresh
was reduced to 10 seconds as suggested by Ingo Molnar.

Signed-off-by: Mandeep Singh Baines <[email protected]>
LKML-Reference: <[email protected]>
Cc: Marcin Slusarz <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
arch/x86/kernel/apic/hw_nmi.c | 4 ++--
include/linux/nmi.h | 2 +-
kernel/watchdog.c | 19 +++++++++++++++----
3 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 5260fe9..d5e57db0 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -19,9 +19,9 @@
#include <linux/delay.h>

#ifdef CONFIG_HARDLOCKUP_DETECTOR
-u64 hw_nmi_get_sample_period(void)
+u64 hw_nmi_get_sample_period(int watchdog_thresh)
{
- return (u64)(cpu_khz) * 1000 * 60;
+ return (u64)(cpu_khz) * 1000 * watchdog_thresh;
}
#endif

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 10cbca7..a26fb4a 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -45,7 +45,7 @@ static inline bool trigger_all_cpu_backtrace(void)

#ifdef CONFIG_LOCKUP_DETECTOR
int hw_nmi_is_cpu_stuck(struct pt_regs *);
-u64 hw_nmi_get_sample_period(void);
+u64 hw_nmi_get_sample_period(int watchdog_thresh);
extern int watchdog_enabled;
extern int watchdog_thresh;
struct ctl_table;
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index ea3dfc2..2788fa9 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -28,7 +28,7 @@
#include <linux/perf_event.h>

int watchdog_enabled = 1;
-int __read_mostly watchdog_thresh = 60;
+int __read_mostly watchdog_thresh = 10;

static DEFINE_PER_CPU(unsigned long, watchdog_touch_ts);
static DEFINE_PER_CPU(struct task_struct *, softlockup_watchdog);
@@ -91,6 +91,17 @@ static int __init nosoftlockup_setup(char *str)
__setup("nosoftlockup", nosoftlockup_setup);
/* */

+/*
+ * Hard-lockup warnings should be triggered after just a few seconds. Soft-
+ * lockups can have false positives under extreme conditions. So we generally
+ * want a higher threshold for soft lockups than for hard lockups. So we couple
+ * the thresholds with a factor: we make the soft threshold twice the amount of
+ * time the hard threshold is.
+ */
+static int get_softlockup_thresh()
+{
+ return watchdog_thresh * 2;
+}

/*
* Returns seconds, approximately. We don't need nanosecond
@@ -110,7 +121,7 @@ static unsigned long get_sample_period(void)
* increment before the hardlockup detector generates
* a warning
*/
- return watchdog_thresh * (NSEC_PER_SEC / 5);
+ return get_softlockup_thresh() * (NSEC_PER_SEC / 5);
}

/* Commands for resetting the watchdog */
@@ -182,7 +193,7 @@ static int is_softlockup(unsigned long touch_ts)
unsigned long now = get_timestamp(smp_processor_id());

/* Warn about unreasonable delays: */
- if (time_after(now, touch_ts + watchdog_thresh))
+ if (time_after(now, touch_ts + get_softlockup_thresh()))
return now - touch_ts;

return 0;
@@ -359,7 +370,7 @@ static int watchdog_nmi_enable(int cpu)

/* Try to register using hardware perf events */
wd_attr = &wd_hw_attr;
- wd_attr->sample_period = hw_nmi_get_sample_period();
+ wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback);
if (!IS_ERR(event)) {
printk(KERN_INFO "NMI watchdog enabled, takes one hw-pmu counter.\n");
--
1.7.3.1

2011-05-18 08:36:07

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 3/4 v2] watchdog: disable watchdog when thresh is zero


* Mandeep Singh Baines <[email protected]> wrote:

> +extern int watchdog_thresh;
> struct ctl_table;
> -extern int proc_dowatchdog_enabled(struct ctl_table *, int ,
> - void __user *, size_t *, loff_t *);
> +extern int __proc_dowatchdog(struct ctl_table *, int ,
> + void __user *, size_t *, loff_t *);
> +#define proc_dowatchdog_enabled __proc_dowatchdog
> +#define proc_dowatchdog_thresh __proc_dowatchdog

i like the other aspects of your patch but this one is a no-no, we do not use
1970's tech to obfuscate nice C code! :-)

If the argument list is annoying then introduce a helper structure. But having
it longer is no big issue either. Try to shorten the function names if
possible.

Sidenote, the sysctl code has been misdesigned a bit: it should be possible to
add sysctls in .c files and not centralize it all into kernel/sysctl.c
forcibly: we could should have a central static array by using a .sysctl_data
section or such. Anyone wanna fix/improve that?

Ingo

2011-05-18 08:39:50

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 4/4 v2] watchdog: configure nmi watchdog period based on watchdog_thresh


* Mandeep Singh Baines <[email protected]> wrote:

> Ingo Molnar ([email protected]) wrote:
> >
> > * Mandeep Singh Baines <[email protected]> wrote:
> >
> > > Before the conversion of the NMI watchdog to perf event, the watchdog
> > > timeout was 5 seconds. Now it is 60 seconds. For my particular application,
> > > netbooks, 5 seconds was a better timeout. With a short timeout, we
> > > catch faults earlier and are able to send back a panic. With a 60 second
> > > timeout, the user is unlikely to wait and will instead hit the power
> > > button, causing us to lose the panic info.
> >
> > That's an interesting observation. Have you been able to measure/observe this
> > effect somehow, or do you presume that users find 60 seconds too long?
> >
>
> Mostly intuition. There is a threshold beyond which the user will hit
> the power button. Not sure if its 20 seconds or 20 minutes. My feeling
> was that the 1 minute was too long.
>
> For a user experience perspective, a quick reboot also seems like a better
> experience than a one minute hang. Our systems boot in 8 seconds and restore
> the previous session so a reboot is almost not noticable.

Indeed you definitely want it configurable and have the delay down to 5 or 10
seconds, to correlate it with your boot delay.

Personally i consider any hang over 1 second annoying so you might want to work
on that 8 seconds boot time some more, it's too long ;-)

And any kernel code running with more than 1 second irqs off is a bug, plain
and simple.

Thanks,

Ingo

2011-05-18 13:52:20

by Don Zickus

[permalink] [raw]
Subject: Re: [PATCH 4/4 v2] watchdog: configure nmi watchdog period based on watchdog_thresh

On Wed, May 18, 2011 at 10:39:36AM +0200, Ingo Molnar wrote:
>
> * Mandeep Singh Baines <[email protected]> wrote:
>
> > Ingo Molnar ([email protected]) wrote:
> > >
> > > * Mandeep Singh Baines <[email protected]> wrote:
> > >
> > > > Before the conversion of the NMI watchdog to perf event, the watchdog
> > > > timeout was 5 seconds. Now it is 60 seconds. For my particular application,
> > > > netbooks, 5 seconds was a better timeout. With a short timeout, we
> > > > catch faults earlier and are able to send back a panic. With a 60 second
> > > > timeout, the user is unlikely to wait and will instead hit the power
> > > > button, causing us to lose the panic info.
> > >
> > > That's an interesting observation. Have you been able to measure/observe this
> > > effect somehow, or do you presume that users find 60 seconds too long?
> > >
> >
> > Mostly intuition. There is a threshold beyond which the user will hit
> > the power button. Not sure if its 20 seconds or 20 minutes. My feeling
> > was that the 1 minute was too long.
> >
> > For a user experience perspective, a quick reboot also seems like a better
> > experience than a one minute hang. Our systems boot in 8 seconds and restore
> > the previous session so a reboot is almost not noticable.
>
> Indeed you definitely want it configurable and have the delay down to 5 or 10
> seconds, to correlate it with your boot delay.
>
> Personally i consider any hang over 1 second annoying so you might want to work
> on that 8 seconds boot time some more, it's too long ;-)
>
> And any kernel code running with more than 1 second irqs off is a bug, plain
> and simple.

Not necessarily, but in those cases the code can use touch_nmi_watchdog
(for example slow hardware, long print outs, etc). :-)

Cheers,
Don

2011-05-18 13:53:15

by Don Zickus

[permalink] [raw]
Subject: Re: [PATCH 1/4] watchdog: fix rounding issues in get_sample_period()

On Mon, May 16, 2011 at 04:34:58PM -0700, Mandeep Singh Baines wrote:
> In get_sample_period(), softlockup_thresh is integer divided by 5 before
> the multiplication by NSEC_PER_SEC. This results in softlockup_thresh
> being rounded down to the nearest integer multiple of 5.
>
> For example, a softlockup_thresh of 4 rounds down to 0.

Mandeep,

Thanks for the patches. Unfortunately, I am taking some time off so I'll
shephard these patches when I get back. For the most part they look fine,
aside from Ingo's comments.

Cheers,
Don

>
> Signed-off-by: Mandeep Singh Baines <[email protected]>
> Cc: Marcin Slusarz <[email protected]>
> Cc: Don Zickus <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Frederic Weisbecker <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> ---
> kernel/watchdog.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 14733d4..a06972d 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -110,7 +110,7 @@ static unsigned long get_sample_period(void)
> * increment before the hardlockup detector generates
> * a warning
> */
> - return softlockup_thresh / 5 * NSEC_PER_SEC;
> + return softlockup_thresh * (NSEC_PER_SEC / 5);
> }
>
> /* Commands for resetting the watchdog */
> --
> 1.7.3.1
>

2011-05-19 16:21:38

by Mandeep Singh Baines

[permalink] [raw]
Subject: [PATCH 3/4 v3] watchdog: disable watchdog when thresh is zero

Ingo Molnar ([email protected]) wrote:
>
> * Mandeep Singh Baines <[email protected]> wrote:
>
> > +extern int watchdog_thresh;
> > struct ctl_table;
> > -extern int proc_dowatchdog_enabled(struct ctl_table *, int ,
> > - void __user *, size_t *, loff_t *);
> > +extern int __proc_dowatchdog(struct ctl_table *, int ,
> > + void __user *, size_t *, loff_t *);
> > +#define proc_dowatchdog_enabled __proc_dowatchdog
> > +#define proc_dowatchdog_thresh __proc_dowatchdog
>
> i like the other aspects of your patch but this one is a no-no, we do not use
> 1970's tech to obfuscate nice C code! :-)
>

Yeah, you're right. Fixed in this patch (v3).

---

This restores the previous behavior of softlock_thresh.

Currently, setting watchdog_thresh to zero causes the watchdog
kthreads to consume a lot of CPU.

In addition, the logic of proc_dowatchdog_thresh and proc_dowatchdog_enabled
has been factored into proc_dowatchdog.

Signed-off-by: Mandeep Singh Baines <[email protected]>
LKML-Reference: <[email protected]>
Cc: Marcin Slusarz <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
include/linux/nmi.h | 5 +++--
include/linux/sched.h | 1 -
kernel/sysctl.c | 12 ++++++++----
kernel/watchdog.c | 25 +++++++++----------------
4 files changed, 20 insertions(+), 23 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index c536f85..5317b8b 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -47,9 +47,10 @@ static inline bool trigger_all_cpu_backtrace(void)
int hw_nmi_is_cpu_stuck(struct pt_regs *);
u64 hw_nmi_get_sample_period(void);
extern int watchdog_enabled;
+extern int watchdog_thresh;
struct ctl_table;
-extern int proc_dowatchdog_enabled(struct ctl_table *, int ,
- void __user *, size_t *, loff_t *);
+extern int proc_dowatchdog(struct ctl_table *, int ,
+ void __user *, size_t *, loff_t *);
#endif

#endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 12211e1..d8b2d0b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -315,7 +315,6 @@ extern int proc_dowatchdog_thresh(struct ctl_table *table, int write,
void __user *buffer,
size_t *lenp, loff_t *ppos);
extern unsigned int softlockup_panic;
-extern int softlockup_thresh;
void lockup_detector_init(void);
#else
static inline void touch_softlockup_watchdog(void)
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index c0bb324..3dd0c46 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -730,14 +730,16 @@ static struct ctl_table kern_table[] = {
.data = &watchdog_enabled,
.maxlen = sizeof (int),
.mode = 0644,
- .proc_handler = proc_dowatchdog_enabled,
+ .proc_handler = proc_dowatchdog,
+ .extra1 = &zero,
+ .extra2 = &one,
},
{
.procname = "watchdog_thresh",
- .data = &softlockup_thresh,
+ .data = &watchdog_thresh,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dowatchdog_thresh,
+ .proc_handler = proc_dowatchdog,
.extra1 = &neg_one,
.extra2 = &sixty,
},
@@ -755,7 +757,9 @@ static struct ctl_table kern_table[] = {
.data = &watchdog_enabled,
.maxlen = sizeof (int),
.mode = 0644,
- .proc_handler = proc_dowatchdog_enabled,
+ .proc_handler = proc_dowatchdog,
+ .extra1 = &zero,
+ .extra2 = &one,
},
#endif
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index cf0e09f..6030191 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -28,7 +28,7 @@
#include <linux/perf_event.h>

int watchdog_enabled = 1;
-int __read_mostly softlockup_thresh = 60;
+int __read_mostly watchdog_thresh = 60;

static DEFINE_PER_CPU(unsigned long, watchdog_touch_ts);
static DEFINE_PER_CPU(struct task_struct *, softlockup_watchdog);
@@ -105,12 +105,12 @@ static unsigned long get_timestamp(int this_cpu)
static unsigned long get_sample_period(void)
{
/*
- * convert softlockup_thresh from seconds to ns
+ * convert watchdog_thresh from seconds to ns
* the divide by 5 is to give hrtimer 5 chances to
* increment before the hardlockup detector generates
* a warning
*/
- return softlockup_thresh * (NSEC_PER_SEC / 5);
+ return watchdog_thresh * (NSEC_PER_SEC / 5);
}

/* Commands for resetting the watchdog */
@@ -182,7 +182,7 @@ static int is_softlockup(unsigned long touch_ts)
unsigned long now = get_timestamp(smp_processor_id());

/* Warn about unreasonable delays: */
- if (time_after(now, touch_ts + softlockup_thresh))
+ if (time_after(now, touch_ts + watchdog_thresh))
return now - touch_ts;

return 0;
@@ -501,19 +501,19 @@ static void watchdog_disable_all_cpus(void)
/* sysctl functions */
#ifdef CONFIG_SYSCTL
/*
- * proc handler for /proc/sys/kernel/nmi_watchdog
+ * proc handler for /proc/sys/kernel/nmi_watchdog,watchdog_thresh
*/

-int proc_dowatchdog_enabled(struct ctl_table *table, int write,
- void __user *buffer, size_t *length, loff_t *ppos)
+int proc_dowatchdog(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
{
int ret;

- ret = proc_dointvec(table, write, buffer, length, ppos);
+ ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
if (ret || !write)
goto out;

- if (watchdog_enabled)
+ if (watchdog_enabled && watchdog_thresh)
watchdog_enable_all_cpus();
else
watchdog_disable_all_cpus();
@@ -521,13 +521,6 @@ int proc_dowatchdog_enabled(struct ctl_table *table, int write,
out:
return ret;
}
-
-int proc_dowatchdog_thresh(struct ctl_table *table, int write,
- void __user *buffer,
- size_t *lenp, loff_t *ppos)
-{
- return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
-}
#endif /* CONFIG_SYSCTL */


--
1.7.3.1

2011-05-19 16:26:49

by Mandeep Singh Baines

[permalink] [raw]
Subject: Re: [PATCH 1/4] watchdog: fix rounding issues in get_sample_period()

Don Zickus ([email protected]) wrote:
> On Mon, May 16, 2011 at 04:34:58PM -0700, Mandeep Singh Baines wrote:
> > In get_sample_period(), softlockup_thresh is integer divided by 5 before
> > the multiplication by NSEC_PER_SEC. This results in softlockup_thresh
> > being rounded down to the nearest integer multiple of 5.
> >
> > For example, a softlockup_thresh of 4 rounds down to 0.
>
> Mandeep,
>
> Thanks for the patches. Unfortunately, I am taking some time off so I'll
> shephard these patches when I get back. For the most part they look fine,
> aside from Ingo's comments.
>

No problem. Enjoy your time off:)

Ingo, since Don is out, would you mind acking the patches (assuming you're
happy). It simplifies pushing these patches into our local ChromiumOS tree
if I have an upstream ack.

> Cheers,
> Don
>
> >
> > Signed-off-by: Mandeep Singh Baines <[email protected]>
> > Cc: Marcin Slusarz <[email protected]>
> > Cc: Don Zickus <[email protected]>
> > Cc: Peter Zijlstra <[email protected]>
> > Cc: Frederic Weisbecker <[email protected]>
> > Cc: Ingo Molnar <[email protected]>
> > ---
> > kernel/watchdog.c | 2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> > index 14733d4..a06972d 100644
> > --- a/kernel/watchdog.c
> > +++ b/kernel/watchdog.c
> > @@ -110,7 +110,7 @@ static unsigned long get_sample_period(void)
> > * increment before the hardlockup detector generates
> > * a warning
> > */
> > - return softlockup_thresh / 5 * NSEC_PER_SEC;
> > + return softlockup_thresh * (NSEC_PER_SEC / 5);
> > }
> >
> > /* Commands for resetting the watchdog */
> > --
> > 1.7.3.1
> >

2011-05-20 11:51:29

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 1/4] watchdog: fix rounding issues in get_sample_period()


* Mandeep Singh Baines <[email protected]> wrote:

> Ingo, since Don is out, would you mind acking the patches (assuming you're
> happy). It simplifies pushing these patches into our local ChromiumOS tree if
> I have an upstream ack.

I suspect it would simplify things even more for you if they were all upstream,
right?

To get that process underway please send an updated series with (delta) patches
against -tip.

Thanks,

Ingo

2011-05-23 05:15:01

by Mandeep Singh Baines

[permalink] [raw]
Subject: Re: [PATCH 1/4] watchdog: fix rounding issues in get_sample_period()

Ingo Molnar ([email protected]) wrote:
>
> * Mandeep Singh Baines <[email protected]> wrote:
>
> > Ingo, since Don is out, would you mind acking the patches (assuming you're
> > happy). It simplifies pushing these patches into our local ChromiumOS tree if
> > I have an upstream ack.
>
> I suspect it would simplify things even more for you if they were all upstream,
> right?
>

Most definitely:)

> To get that process underway please send an updated series with (delta) patches
> against -tip.
>

Rebased series against tip/master and resent:

* Message-Id: <[email protected]>

> Thanks,
>
> Ingo

2011-05-23 11:13:25

by Mandeep Singh Baines

[permalink] [raw]
Subject: [tip:perf/urgent] watchdog: Disable watchdog when thresh is zero

Commit-ID: 586692a5a5fc5740c8a46abc0f2365495c2d7c5f
Gitweb: http://git.kernel.org/tip/586692a5a5fc5740c8a46abc0f2365495c2d7c5f
Author: Mandeep Singh Baines <[email protected]>
AuthorDate: Sun, 22 May 2011 22:10:22 -0700
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 23 May 2011 11:58:59 +0200

watchdog: Disable watchdog when thresh is zero

This restores the previous behavior of softlock_thresh.

Currently, setting watchdog_thresh to zero causes the watchdog
kthreads to consume a lot of CPU.

In addition, the logic of proc_dowatchdog_thresh and
proc_dowatchdog_enabled has been factored into proc_dowatchdog.

Signed-off-by: Mandeep Singh Baines <[email protected]>
Cc: Marcin Slusarz <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
LKML-Reference: <[email protected]>
---
include/linux/nmi.h | 5 +++--
include/linux/sched.h | 1 -
kernel/sysctl.c | 12 ++++++++----
kernel/watchdog.c | 25 +++++++++----------------
4 files changed, 20 insertions(+), 23 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index c536f85..5317b8b 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -47,9 +47,10 @@ static inline bool trigger_all_cpu_backtrace(void)
int hw_nmi_is_cpu_stuck(struct pt_regs *);
u64 hw_nmi_get_sample_period(void);
extern int watchdog_enabled;
+extern int watchdog_thresh;
struct ctl_table;
-extern int proc_dowatchdog_enabled(struct ctl_table *, int ,
- void __user *, size_t *, loff_t *);
+extern int proc_dowatchdog(struct ctl_table *, int ,
+ void __user *, size_t *, loff_t *);
#endif

#endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 12211e1..d8b2d0b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -315,7 +315,6 @@ extern int proc_dowatchdog_thresh(struct ctl_table *table, int write,
void __user *buffer,
size_t *lenp, loff_t *ppos);
extern unsigned int softlockup_panic;
-extern int softlockup_thresh;
void lockup_detector_init(void);
#else
static inline void touch_softlockup_watchdog(void)
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index c0bb324..3dd0c46 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -730,14 +730,16 @@ static struct ctl_table kern_table[] = {
.data = &watchdog_enabled,
.maxlen = sizeof (int),
.mode = 0644,
- .proc_handler = proc_dowatchdog_enabled,
+ .proc_handler = proc_dowatchdog,
+ .extra1 = &zero,
+ .extra2 = &one,
},
{
.procname = "watchdog_thresh",
- .data = &softlockup_thresh,
+ .data = &watchdog_thresh,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dowatchdog_thresh,
+ .proc_handler = proc_dowatchdog,
.extra1 = &neg_one,
.extra2 = &sixty,
},
@@ -755,7 +757,9 @@ static struct ctl_table kern_table[] = {
.data = &watchdog_enabled,
.maxlen = sizeof (int),
.mode = 0644,
- .proc_handler = proc_dowatchdog_enabled,
+ .proc_handler = proc_dowatchdog,
+ .extra1 = &zero,
+ .extra2 = &one,
},
#endif
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index cf0e09f..6030191 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -28,7 +28,7 @@
#include <linux/perf_event.h>

int watchdog_enabled = 1;
-int __read_mostly softlockup_thresh = 60;
+int __read_mostly watchdog_thresh = 60;

static DEFINE_PER_CPU(unsigned long, watchdog_touch_ts);
static DEFINE_PER_CPU(struct task_struct *, softlockup_watchdog);
@@ -105,12 +105,12 @@ static unsigned long get_timestamp(int this_cpu)
static unsigned long get_sample_period(void)
{
/*
- * convert softlockup_thresh from seconds to ns
+ * convert watchdog_thresh from seconds to ns
* the divide by 5 is to give hrtimer 5 chances to
* increment before the hardlockup detector generates
* a warning
*/
- return softlockup_thresh * (NSEC_PER_SEC / 5);
+ return watchdog_thresh * (NSEC_PER_SEC / 5);
}

/* Commands for resetting the watchdog */
@@ -182,7 +182,7 @@ static int is_softlockup(unsigned long touch_ts)
unsigned long now = get_timestamp(smp_processor_id());

/* Warn about unreasonable delays: */
- if (time_after(now, touch_ts + softlockup_thresh))
+ if (time_after(now, touch_ts + watchdog_thresh))
return now - touch_ts;

return 0;
@@ -501,19 +501,19 @@ static void watchdog_disable_all_cpus(void)
/* sysctl functions */
#ifdef CONFIG_SYSCTL
/*
- * proc handler for /proc/sys/kernel/nmi_watchdog
+ * proc handler for /proc/sys/kernel/nmi_watchdog,watchdog_thresh
*/

-int proc_dowatchdog_enabled(struct ctl_table *table, int write,
- void __user *buffer, size_t *length, loff_t *ppos)
+int proc_dowatchdog(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
{
int ret;

- ret = proc_dointvec(table, write, buffer, length, ppos);
+ ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
if (ret || !write)
goto out;

- if (watchdog_enabled)
+ if (watchdog_enabled && watchdog_thresh)
watchdog_enable_all_cpus();
else
watchdog_disable_all_cpus();
@@ -521,13 +521,6 @@ int proc_dowatchdog_enabled(struct ctl_table *table, int write,
out:
return ret;
}
-
-int proc_dowatchdog_thresh(struct ctl_table *table, int write,
- void __user *buffer,
- size_t *lenp, loff_t *ppos)
-{
- return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
-}
#endif /* CONFIG_SYSCTL */

2011-05-23 11:13:44

by Mandeep Singh Baines

[permalink] [raw]
Subject: [tip:perf/urgent] watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh

Commit-ID: 4eec42f392043063d0f019640b4ccc2a45570002
Gitweb: http://git.kernel.org/tip/4eec42f392043063d0f019640b4ccc2a45570002
Author: Mandeep Singh Baines <[email protected]>
AuthorDate: Sun, 22 May 2011 22:10:23 -0700
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 23 May 2011 11:58:59 +0200

watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh

Before the conversion of the NMI watchdog to perf event, the
watchdog timeout was 5 seconds. Now it is 60 seconds. For my
particular application, netbooks, 5 seconds was a better
timeout. With a short timeout, we catch faults earlier and are
able to send back a panic. With a 60 second timeout, the user is
unlikely to wait and will instead hit the power button, causing
us to lose the panic info.

This change configures the NMI period to watchdog_thresh and
sets the softlockup_thresh to watchdog_thresh * 2. In addition,
watchdog_thresh was reduced to 10 seconds as suggested by Ingo
Molnar.

Signed-off-by: Mandeep Singh Baines <[email protected]>
Cc: Marcin Slusarz <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
LKML-Reference: <[email protected]>
---
arch/x86/kernel/apic/hw_nmi.c | 4 ++--
include/linux/nmi.h | 2 +-
kernel/watchdog.c | 19 +++++++++++++++----
3 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 5260fe9..d5e57db0 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -19,9 +19,9 @@
#include <linux/delay.h>

#ifdef CONFIG_HARDLOCKUP_DETECTOR
-u64 hw_nmi_get_sample_period(void)
+u64 hw_nmi_get_sample_period(int watchdog_thresh)
{
- return (u64)(cpu_khz) * 1000 * 60;
+ return (u64)(cpu_khz) * 1000 * watchdog_thresh;
}
#endif

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 5317b8b..2d304ef 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -45,7 +45,7 @@ static inline bool trigger_all_cpu_backtrace(void)

#ifdef CONFIG_LOCKUP_DETECTOR
int hw_nmi_is_cpu_stuck(struct pt_regs *);
-u64 hw_nmi_get_sample_period(void);
+u64 hw_nmi_get_sample_period(int watchdog_thresh);
extern int watchdog_enabled;
extern int watchdog_thresh;
struct ctl_table;
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 6030191..6e63097 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -28,7 +28,7 @@
#include <linux/perf_event.h>

int watchdog_enabled = 1;
-int __read_mostly watchdog_thresh = 60;
+int __read_mostly watchdog_thresh = 10;

static DEFINE_PER_CPU(unsigned long, watchdog_touch_ts);
static DEFINE_PER_CPU(struct task_struct *, softlockup_watchdog);
@@ -91,6 +91,17 @@ static int __init nosoftlockup_setup(char *str)
__setup("nosoftlockup", nosoftlockup_setup);
/* */

+/*
+ * Hard-lockup warnings should be triggered after just a few seconds. Soft-
+ * lockups can have false positives under extreme conditions. So we generally
+ * want a higher threshold for soft lockups than for hard lockups. So we couple
+ * the thresholds with a factor: we make the soft threshold twice the amount of
+ * time the hard threshold is.
+ */
+static int get_softlockup_thresh()
+{
+ return watchdog_thresh * 2;
+}

/*
* Returns seconds, approximately. We don't need nanosecond
@@ -110,7 +121,7 @@ static unsigned long get_sample_period(void)
* increment before the hardlockup detector generates
* a warning
*/
- return watchdog_thresh * (NSEC_PER_SEC / 5);
+ return get_softlockup_thresh() * (NSEC_PER_SEC / 5);
}

/* Commands for resetting the watchdog */
@@ -182,7 +193,7 @@ static int is_softlockup(unsigned long touch_ts)
unsigned long now = get_timestamp(smp_processor_id());

/* Warn about unreasonable delays: */
- if (time_after(now, touch_ts + watchdog_thresh))
+ if (time_after(now, touch_ts + get_softlockup_thresh()))
return now - touch_ts;

return 0;
@@ -359,7 +370,7 @@ static int watchdog_nmi_enable(int cpu)

/* Try to register using hardware perf events */
wd_attr = &wd_hw_attr;
- wd_attr->sample_period = hw_nmi_get_sample_period();
+ wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback);
if (!IS_ERR(event)) {
printk(KERN_INFO "NMI watchdog enabled, takes one hw-pmu counter.\n");

2011-05-24 03:58:43

by Ingo Molnar

[permalink] [raw]
Subject: [tip:perf/urgent] watchdog: Fix non-standard prototype of get_softlockup_thresh()

Commit-ID: 6e9101aeec39961308176e0f59e73ac5d37d243a
Gitweb: http://git.kernel.org/tip/6e9101aeec39961308176e0f59e73ac5d37d243a
Author: Ingo Molnar <[email protected]>
AuthorDate: Tue, 24 May 2011 05:43:18 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Tue, 24 May 2011 05:53:39 +0200

watchdog: Fix non-standard prototype of get_softlockup_thresh()

This build warning slipped through:

kernel/watchdog.c:102: warning: function declaration isn't a prototype

As reported by Stephen Rothwell.

Also address an unused variable warning that GCC 4.6.0 reports:
we cannot do anything about failed watchdog ops during CPU hotplug
(it's not serious enough to return an error from the notifier),
so ignore them.

Reported-by: Stephen Rothwell <[email protected]>
Cc: Mandeep Singh Baines <[email protected]>
Cc: Marcin Slusarz <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
LKML-Reference: <[email protected]>
---
kernel/watchdog.c | 11 ++++-------
1 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 6e63097..3d0c56a 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -98,7 +98,7 @@ __setup("nosoftlockup", nosoftlockup_setup);
* the thresholds with a factor: we make the soft threshold twice the amount of
* time the hard threshold is.
*/
-static int get_softlockup_thresh()
+static int get_softlockup_thresh(void)
{
return watchdog_thresh * 2;
}
@@ -415,15 +415,13 @@ static void watchdog_nmi_disable(int cpu) { return; }
#endif /* CONFIG_HARDLOCKUP_DETECTOR */

/* prepare/enable/disable routines */
-static int watchdog_prepare_cpu(int cpu)
+static void watchdog_prepare_cpu(int cpu)
{
struct hrtimer *hrtimer = &per_cpu(watchdog_hrtimer, cpu);

WARN_ON(per_cpu(softlockup_watchdog, cpu));
hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
hrtimer->function = watchdog_timer_fn;
-
- return 0;
}

static int watchdog_enable(int cpu)
@@ -542,17 +540,16 @@ static int __cpuinit
cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
{
int hotcpu = (unsigned long)hcpu;
- int err = 0;

switch (action) {
case CPU_UP_PREPARE:
case CPU_UP_PREPARE_FROZEN:
- err = watchdog_prepare_cpu(hotcpu);
+ watchdog_prepare_cpu(hotcpu);
break;
case CPU_ONLINE:
case CPU_ONLINE_FROZEN:
if (watchdog_enabled)
- err = watchdog_enable(hotcpu);
+ watchdog_enable(hotcpu);
break;
#ifdef CONFIG_HOTPLUG_CPU
case CPU_UP_CANCELED:

2011-06-09 11:48:05

by Don Zickus

[permalink] [raw]
Subject: Re: [PATCH 1/4] watchdog: fix rounding issues in get_sample_period()

On Thu, May 19, 2011 at 09:26:31AM -0700, Mandeep Singh Baines wrote:
> Don Zickus ([email protected]) wrote:
> > On Mon, May 16, 2011 at 04:34:58PM -0700, Mandeep Singh Baines wrote:
> > > In get_sample_period(), softlockup_thresh is integer divided by 5 before
> > > the multiplication by NSEC_PER_SEC. This results in softlockup_thresh
> > > being rounded down to the nearest integer multiple of 5.
> > >
> > > For example, a softlockup_thresh of 4 rounds down to 0.
> >
> > Mandeep,
> >
> > Thanks for the patches. Unfortunately, I am taking some time off so I'll
> > shephard these patches when I get back. For the most part they look fine,
> > aside from Ingo's comments.
> >
>
> No problem. Enjoy your time off:)
>
> Ingo, since Don is out, would you mind acking the patches (assuming you're
> happy). It simplifies pushing these patches into our local ChromiumOS tree
> if I have an upstream ack.

Hi Mandeep,

It looks like Ingo took most of your patches. Is there any outstanding
ones that didn't make it that you see?

Cheers,
Don

2011-06-09 14:52:21

by Mandeep Singh Baines

[permalink] [raw]
Subject: Re: [PATCH 1/4] watchdog: fix rounding issues in get_sample_period()

Don Zickus ([email protected]) wrote:
> On Thu, May 19, 2011 at 09:26:31AM -0700, Mandeep Singh Baines wrote:
> > Don Zickus ([email protected]) wrote:
> > > On Mon, May 16, 2011 at 04:34:58PM -0700, Mandeep Singh Baines wrote:
> > > > In get_sample_period(), softlockup_thresh is integer divided by 5 before
> > > > the multiplication by NSEC_PER_SEC. This results in softlockup_thresh
> > > > being rounded down to the nearest integer multiple of 5.
> > > >
> > > > For example, a softlockup_thresh of 4 rounds down to 0.
> > >
> > > Mandeep,
> > >
> > > Thanks for the patches. Unfortunately, I am taking some time off so I'll
> > > shephard these patches when I get back. For the most part they look fine,
> > > aside from Ingo's comments.
> > >
> >
> > No problem. Enjoy your time off:)
> >
> > Ingo, since Don is out, would you mind acking the patches (assuming you're
> > happy). It simplifies pushing these patches into our local ChromiumOS tree
> > if I have an upstream ack.
>
> Hi Mandeep,
>
> It looks like Ingo took most of your patches. Is there any outstanding
> ones that didn't make it that you see?
>

Hi Don,

Thanks for following up. All the patches are now in.

I am working on a few new NMI issues so I'll be sending out a few patches
in a day or so. I've committed one critical fix to our 2.6.32 tree and am now
working on forward-porting it to tip/master:

http://git.chromium.org/gitweb/?p=chromiumos/third_party/kernel.git;a=commitdiff;h=2a825f5f38f24e655dd655253eaaa5a82460a43e;hp=09ecdd5b43b9df930efd51a3fe317c18907a6c2d

Regards,
Mandeep

> Cheers,
> Don