2006-10-21 16:51:57

by Andi Kleen

Subject: [PATCH] [14/19] i386: Disable nmi watchdog on all ThinkPads


Even newer ThinkPads have bugs in their SMM code that cause hangs with
the NMI watchdog.

Signed-off-by: Andi Kleen <[email protected]>

---
arch/i386/kernel/nmi.c | 10 +++++-----
drivers/firmware/dmi_scan.c | 20 ++++++++++++++++++++
include/linux/dmi.h | 2 ++
3 files changed, 27 insertions(+), 5 deletions(-)

Index: linux/arch/i386/kernel/nmi.c
===================================================================
--- linux.orig/arch/i386/kernel/nmi.c
+++ linux/arch/i386/kernel/nmi.c
@@ -219,11 +219,11 @@ static int __init check_nmi_watchdog(voi
 	int cpu;

 	/* Enable NMI watchdog for newer systems.
-	   Actually it should be safe for most systems before 2004 too except
-	   for some IBM systems that corrupt registers when NMI happens
-	   during SMM. Unfortunately we don't have more exact information
-	   on these and use this coarse check. */
-	if (nmi_watchdog == NMI_DEFAULT && dmi_get_year(DMI_BIOS_DATE) >= 2004)
+	   Probably safe on most older systems too, but let's be careful.
+	   IBM ThinkPads use INT10 inside SMM and that allows early NMI inside SMM
+	   which hangs the system. Disable watchdog for all thinkpads */
+	if (nmi_watchdog == NMI_DEFAULT && dmi_get_year(DMI_BIOS_DATE) >= 2004 &&
+	    !dmi_name_in_vendors("ThinkPad"))
 		nmi_watchdog = NMI_LOCAL_APIC;

 	if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DEFAULT))
Index: linux/drivers/firmware/dmi_scan.c
===================================================================
--- linux.orig/drivers/firmware/dmi_scan.c
+++ linux/drivers/firmware/dmi_scan.c
@@ -326,6 +326,26 @@ char *dmi_get_system_info(int field)
}
EXPORT_SYMBOL(dmi_get_system_info);

+
+/**
+ * dmi_name_in_vendors - Check if string is anywhere in the DMI vendor information.
+ * @str: Case sensitive Name
+ */
+int dmi_name_in_vendors(char *str)
+{
+	static int fields[] = { DMI_BIOS_VENDOR, DMI_BIOS_VERSION, DMI_SYS_VENDOR,
+				DMI_PRODUCT_NAME, DMI_PRODUCT_VERSION, DMI_BOARD_VENDOR,
+				DMI_BOARD_NAME, DMI_BOARD_VERSION, DMI_NONE };
+	int i;
+	for (i = 0; fields[i] != DMI_NONE; i++) {
+		int f = fields[i];
+		if (dmi_ident[f] && strstr(dmi_ident[f], str))
+			return 1;
+	}
+	return 0;
+}
+EXPORT_SYMBOL(dmi_name_in_vendors);
+
/**
* dmi_find_device - find onboard device by type/name
* @type: device type or %DMI_DEV_TYPE_ANY to match all device types
Index: linux/include/linux/dmi.h
===================================================================
--- linux.orig/include/linux/dmi.h
+++ linux/include/linux/dmi.h
@@ -69,6 +69,7 @@ extern struct dmi_device * dmi_find_devi
struct dmi_device *from);
extern void dmi_scan_machine(void);
extern int dmi_get_year(int field);
+extern int dmi_name_in_vendors(char *str);

#else

@@ -77,6 +78,7 @@ static inline char * dmi_get_system_info
static inline struct dmi_device * dmi_find_device(int type, const char *name,
struct dmi_device *from) { return NULL; }
static inline int dmi_get_year(int year) { return 0; }
+static inline int dmi_name_in_vendors(char *s) { return 0; }

#endif
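
For anyone who wants to try the matching logic outside the kernel, here is a rough userspace sketch of what dmi_name_in_vendors() does: walk a sentinel-terminated list of field IDs and do a case-sensitive substring search on each non-NULL string. Everything prefixed demo_/DEMO_ is a hypothetical stand-in invented for this sketch, not the kernel's enum or dmi_ident[] table, and the field list is shortened.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's DMI field IDs (assumption:
 * the real IDs and their values live in include/linux/dmi.h). */
enum demo_field {
	DEMO_NONE,
	DEMO_BIOS_VENDOR,
	DEMO_SYS_VENDOR,
	DEMO_PRODUCT_NAME,
	DEMO_PRODUCT_VERSION,
	DEMO_FIELD_MAX
};

/* Return 1 if str occurs, case-sensitively, in any listed field, else 0.
 * ident plays the role of dmi_ident[]: an entry is NULL when the
 * firmware supplied no string for that field. */
int demo_name_in_fields(const char *ident[], const char *str)
{
	/* DEMO_NONE terminates the list, as DMI_NONE does in the patch. */
	static const int fields[] = { DEMO_BIOS_VENDOR, DEMO_SYS_VENDOR,
				      DEMO_PRODUCT_NAME, DEMO_PRODUCT_VERSION,
				      DEMO_NONE };
	int i;

	for (i = 0; fields[i] != DEMO_NONE; i++) {
		const char *s = ident[fields[i]];

		if (s && strstr(s, str))
			return 1;
	}
	return 0;
}
```

Because the search is a plain strstr(), "ThinkPad" matches a string such as "ThinkPad T60" in any of the listed fields, while "thinkpad" does not.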


2006-10-21 17:24:09

by Dave Jones

Subject: Re: [PATCH] [14/19] i386: Disable nmi watchdog on all ThinkPads

On Sat, Oct 21, 2006 at 06:51:34PM +0200, Andi Kleen wrote:
>
> Even newer ThinkPads have bugs in their SMM code that cause hangs with
> the NMI watchdog.
>
> Signed-off-by: Andi Kleen <[email protected]>
>
> ---
> arch/i386/kernel/nmi.c | 10 +++++-----
> drivers/firmware/dmi_scan.c | 20 ++++++++++++++++++++
> include/linux/dmi.h | 2 ++
> 3 files changed, 27 insertions(+), 5 deletions(-)
>
> Index: linux/arch/i386/kernel/nmi.c
> ===================================================================
> --- linux.orig/arch/i386/kernel/nmi.c
> +++ linux/arch/i386/kernel/nmi.c
> @@ -219,11 +219,11 @@ static int __init check_nmi_watchdog(voi
> int cpu;
>
> /* Enable NMI watchdog for newer systems.
> - Actually it should be safe for most systems before 2004 too except
> - for some IBM systems that corrupt registers when NMI happens
> - during SMM. Unfortunately we don't have more exact information
> - on these and use this coarse check. */
> - if (nmi_watchdog == NMI_DEFAULT && dmi_get_year(DMI_BIOS_DATE) >= 2004)
> + Probably safe on most older systems too, but let's be careful.
> + IBM ThinkPads use INT10 inside SMM and that allows early NMI inside SMM
> + which hangs the system. Disable watchdog for all thinkpads */
> + if (nmi_watchdog == NMI_DEFAULT && dmi_get_year(DMI_BIOS_DATE) >= 2004 &&
> + !dmi_name_in_vendors("ThinkPad"))
> nmi_watchdog = NMI_LOCAL_APIC;

This is going to get some people scratching their heads wondering
why it isn't working if they ever try nmi_watchdog on one of these.
How about adding an explanatory printk?

Dave

--
http://www.codemonkey.org.uk

2006-10-21 18:12:12

by Andi Kleen

Subject: Re: [patches] Re: [PATCH] [14/19] i386: Disable nmi watchdog on all ThinkPads


> > - if (nmi_watchdog == NMI_DEFAULT && dmi_get_year(DMI_BIOS_DATE) >= 2004)
> > + Probably safe on most older systems too, but let's be careful.
> > + IBM ThinkPads use INT10 inside SMM and that allows early NMI inside SMM
> > + which hangs the system. Disable watchdog for all thinkpads */
> > + if (nmi_watchdog == NMI_DEFAULT && dmi_get_year(DMI_BIOS_DATE) >= 2004 &&
> > + !dmi_name_in_vendors("ThinkPad"))
> > nmi_watchdog = NMI_LOCAL_APIC;
>
> This is going to get some people scratching their heads wondering
> why it isn't working if they ever try nmi_watchdog on one of these.
> How about adding an explanatory printk?

When you enable it manually then NMI_DEFAULT won't be set and this code
is never executed.
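
To spell that out, here is a small model of just the default-selection step. It is a sketch under assumptions: the demo_/DEMO_ names are hypothetical stand-ins for the kernel's nmi_watchdog modes, and bios_year and is_thinkpad stand for the results of dmi_get_year(DMI_BIOS_DATE) and dmi_name_in_vendors("ThinkPad").

```c
/* Hypothetical stand-ins for the kernel's nmi_watchdog modes. */
enum demo_nmi_mode {
	DEMO_NMI_NONE,
	DEMO_NMI_IO_APIC,
	DEMO_NMI_LOCAL_APIC,
	DEMO_NMI_DEFAULT	/* nothing was requested on the command line */
};

/* Models only the patched condition in check_nmi_watchdog(): the mode is
 * changed only when it is still the default, so an explicit request
 * passes through untouched, ThinkPad or not. */
int demo_pick_watchdog(int mode, int bios_year, int is_thinkpad)
{
	if (mode == DEMO_NMI_DEFAULT && bios_year >= 2004 && !is_thinkpad)
		mode = DEMO_NMI_LOCAL_APIC;
	return mode;
}
```

With these stand-ins, a ThinkPad booted without an nmi_watchdog option stays at DEMO_NMI_DEFAULT, which the check that follows in the patch context tests together with NMI_NONE. A ThinkPad with the local APIC mode requested explicitly keeps that mode.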

BTW their machines will likely not stay up long enough for them to
see the printk (unless Lenovo fixes that particular bug in the future;
they are aware of it).

-Andi

2006-10-21 18:14:46

by Dave Jones

Subject: Re: [patches] Re: [PATCH] [14/19] i386: Disable nmi watchdog on all ThinkPads

On Sat, Oct 21, 2006 at 08:11:56PM +0200, Andi Kleen wrote:
>
> > > - if (nmi_watchdog == NMI_DEFAULT && dmi_get_year(DMI_BIOS_DATE) >= 2004)
> > > + Probably safe on most older systems too, but let's be careful.
> > > + IBM ThinkPads use INT10 inside SMM and that allows early NMI inside SMM
> > > + which hangs the system. Disable watchdog for all thinkpads */
> > > + if (nmi_watchdog == NMI_DEFAULT && dmi_get_year(DMI_BIOS_DATE) >= 2004 &&
> > > + !dmi_name_in_vendors("ThinkPad"))
> > > nmi_watchdog = NMI_LOCAL_APIC;
> >
> > This is going to get some people scratching their heads wondering
> > why it isn't working if they ever try nmi_watchdog on one of these.
> > How about adding an explanatory printk?
>
> When you enable it manually then NMI_DEFAULT won't be set and this code
> is never executed.
>
> BTW their machines will likely not stay up long enough for them to
> see the printk (unless Lenovo fixes that particular bug in the future;
> they are aware of it).

Ouch, nasty. I'm surprised no-one complained about this earlier.

Dave


--
http://www.codemonkey.org.uk

2006-10-21 18:22:35

by Andi Kleen

Subject: Re: [patches] Re: [PATCH] [14/19] i386: Disable nmi watchdog on all ThinkPads


> Ouch, nasty. I'm surprised no-one complained about this earlier.

i386 only enabled it post 2.6.18. I suppose nobody tested ThinkPads
after that.

Actually I should probably add it to 64bit too, but I'm still waiting
for at least one report (maybe they have already fixed it in
the Core2 capable models).

-Andi