2015-11-02 21:22:45

by Ville Syrjälä

Subject: [PATCH] intel_idle: Don't use on Lenovo Ideapad S10-3t

From: Ville Syrjälä <[email protected]>

Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle.
The two workarounds that seem to help are "intel_idle.max_cstate=0"
or "nohz=off highres=off".

At first glance quirk_tigerpoint_bm_sts() seemed promising, but
even when moved to early_resume it didn't do anything.

I have no idea what's wrong here, so let's just disable intel_idle
for these machines using a DMI match.

Cc: Len Brown <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Ville Syrjälä <[email protected]>
---
If anyone has any better ideas, I can try out some patches.

drivers/idle/intel_idle.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index cd4510a..c4a6888 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -61,6 +61,7 @@
#include <linux/notifier.h>
#include <linux/cpu.h>
#include <linux/module.h>
+#include <linux/dmi.h>
#include <asm/cpu_device_id.h>
#include <asm/mwait.h>
#include <asm/msr.h>
@@ -925,6 +926,25 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
};
MODULE_DEVICE_TABLE(x86cpu, intel_idle_ids);

+static int intel_idle_disable_callback(const struct dmi_system_id *id)
+{
+	pr_debug(PREFIX "problematic system (%s), disabling\n", id->ident);
+	return 1;
+}
+
+static const struct dmi_system_id intel_idle_disable_dmi[] = {
+	{
+		/* Lenovo Ideapad S10-3t, hangs coming out of S3 */
+		.callback = intel_idle_disable_callback,
+		.ident = "Lenovo Ideapad S10-3t",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+			DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo Ideapad S10-3t"),
+		},
+	},
+	{}
+};
+
/*
* intel_idle_probe()
*/
@@ -957,6 +977,9 @@ static int __init intel_idle_probe(void)
!mwait_substates)
return -ENODEV;

+	if (dmi_check_system(intel_idle_disable_dmi))
+		return -ENODEV;
+
pr_debug(PREFIX "MWAIT substates: 0x%x\n", mwait_substates);

icpu = (const struct idle_cpu *)id->driver_data;
--
2.4.10
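
For readers unfamiliar with DMI quirk tables, the following is a minimal
userspace sketch of the matching logic the patch above relies on. It is not
kernel code: every sketch_* name is hypothetical, and the substring comparison
and the "stop once a callback returns non-zero" behaviour are assumptions that
mirror how dmi_check_system() is generally understood to work.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the kernel's DMI types. */
enum sketch_dmi_field { SKETCH_SYS_VENDOR, SKETCH_PRODUCT_VERSION, SKETCH_FIELD_MAX };

struct sketch_dmi_match {
	enum sketch_dmi_field slot;
	const char *substr;	/* NULL marks an unused slot */
};

struct sketch_dmi_id {
	int (*callback)(const struct sketch_dmi_id *id);
	const char *ident;
	struct sketch_dmi_match matches[4];
};

/* The strings the firmware would report on a given machine. */
struct sketch_dmi_info {
	const char *field[SKETCH_FIELD_MAX];
};

/* An entry matches only if every used slot is a substring hit. */
static int sketch_matches(const struct sketch_dmi_id *id,
			  const struct sketch_dmi_info *info)
{
	for (size_t i = 0; i < 4; i++) {
		const struct sketch_dmi_match *m = &id->matches[i];

		if (!m->substr)
			break;
		const char *have = info->field[m->slot];
		if (!have || !strstr(have, m->substr))
			return 0;
	}
	return 1;
}

/* Walk the table up to the empty terminator; return the number of hits. */
static int sketch_check_system(const struct sketch_dmi_id *list,
			       const struct sketch_dmi_info *info)
{
	int count = 0;

	for (const struct sketch_dmi_id *d = list; d->matches[0].substr; d++) {
		if (sketch_matches(d, info)) {
			count++;
			if (d->callback && d->callback(d))
				break;	/* non-zero callback result stops the walk */
		}
	}
	return count;
}

static int sketch_disable_callback(const struct sketch_dmi_id *id)
{
	fprintf(stderr, "problematic system (%s), disabling\n", id->ident);
	return 1;
}

static const struct sketch_dmi_id sketch_disable_table[] = {
	{
		.callback = sketch_disable_callback,
		.ident = "Lenovo Ideapad S10-3t",
		.matches = {
			{ SKETCH_SYS_VENDOR, "LENOVO" },
			{ SKETCH_PRODUCT_VERSION, "Lenovo Ideapad S10-3t" },
		},
	},
	{ .ident = NULL }	/* terminator: no match slots in use */
};

/* Stand-in for the probe path: refuse to load on a listed machine. */
static int sketch_probe(const struct sketch_dmi_info *info)
{
	if (sketch_check_system(sketch_disable_table, info))
		return -ENODEV;
	return 0;
}
```

Calling sketch_probe() with vendor "LENOVO" and product version
"Lenovo Ideapad S10-3t" returns -ENODEV; any other product version returns 0.
Both match slots must hit, which is why the vendor string alone does not
disable the driver on other Lenovo machines.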


2015-11-03 03:04:27

by Brown, Len

Subject: RE: [PATCH] intel_idle: Don't use on Lenovo Ideapad S10-3t

> Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle.
> The two workarounds that seem to help are "intel_idle.max_cstate=0"
> or "nohz=off highres=off".
>
> At first glance quirk_tigerpoint_bm_sts() seemed promising, but
> even when moved to early_resume it didn't do anything.
>
> I have no idea what's wrong here, so let's just disable intel_idle
> for these machines using a DMI match.

Ville,

It is great that several workarounds have been discovered.

But it would be better to get a good idea of the root-cause
before permanently ignoring the problem via a new
black-list in the upstream kernel.

Is it possible for you to file a bug at bugzilla.kernel.org
against Product: power-management; component: intel_idle?

In it, please put the following information.

If this is a regression, the oldest kernel that broke.

When booted with intel_idle, and then without:

dmesg | grep idle
grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
# turbostat --debug sleep 10 2> turbostat.out

# cd /sys/devices/system/clocksource/clocksource0
grep . available_clocksource current_clocksource


Other boot options to test:

maxcpus=1
nohpet

intel_idle.max_cstate=3
and if that fails
intel_idle.max_cstate=2
and if that fails
intel_idle.max_cstate=1

thanks,
-Len


2015-11-04 16:22:36

by Ville Syrjälä

Subject: Re: [PATCH] intel_idle: Don't use on Lenovo Ideapad S10-3t

On Tue, Nov 03, 2015 at 03:04:21AM +0000, Brown, Len wrote:
> > Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle.
> > The two workarounds that seem to help are "intel_idle.max_cstate=0"
> > or "nohz=off highres=off".
> >
> > At first glance quirk_tigerpoint_bm_sts() seemed promising, but
> > even when moved to early_resume it didn't do anything.
> >
> > I have no idea what's wrong here, so let's just disable intel_idle
> > for these machines using a DMI match.
>
> Ville,
>
> It is great that several workarounds have been discovered.
>
> But it would be better to get a good idea of the root-cause
> before permanently ignoring the problem via a new
> black-list in the upstream kernel.
>
> Is it possible for you to file a bug at bugzilla.kernel.org
> against Product: power-management; component: intel_idle?

https://bugzilla.kernel.org/show_bug.cgi?id=107151

>
> In it, please put the following information.
>
> If this is a regression, the oldest kernel that broke.

I'll repeat the details here just in case people are too lazy to look at
the bug:

It seems intel_idle has always been flaky with this hardware. It used to
work a little better in the past, but not perfectly.

I tried to bisect the total breakage starting from v3.11, and found the
following commit to be at fault:
commit a8d46b9e4e487301affe84fa53de40b890898604
Author: Rafael J. Wysocki <[email protected]>
Date: Tue Sep 30 02:29:01 2014 +0200

ACPI / sleep: Rework the handling of ACPI GPE wakeup from
suspend-to-idle

Before that I observed three different behaviours for S3 resume:
a) sometimes it resumed OK
b) sometimes it resumed part way, but kbd/network etc. were still
dead, but then pressing the power button made it finish the resume
somehow
c) same as the previous, except the power button press also made it
all the way to userspace and initiated a normal shutdown
The same kernel could exhibit both a) and b), or both a) and c).

After a8d46b9e4e48 it never resumes, no matter if I press the power
button or not.

> When booted with intel_idle, and then without:
>
> dmesg | grep idle
> grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
> # turbostat --debug sleep 10 2> turbostat.out
>
> # cd /sys/devices/system/clocksource/clocksource0
> grep . available_clocksource current_clocksource
>
>
> Other boot options to test:
>
> maxcpus=1
> nohpet
>
> intel_idle.max_cstate=3
> and if that fails
> intel_idle.max_cstate=2
> and if that fails
> intel_idle.max_cstate=1

None of these help.

--
Ville Syrjälä
Intel OTC

by Henrique de Moraes Holschuh

Subject: Re: [PATCH] intel_idle: Don't use on Lenovo Ideapad S10-3t

On Wed, 04 Nov 2015, Ville Syrjälä wrote:
> On Tue, Nov 03, 2015 at 03:04:21AM +0000, Brown, Len wrote:
> > > Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle.
> > > The two workarounds that seem to help are "intel_idle.max_cstate=0"
> > > or "nohz=off highres=off".
> > >
> > > At first glance quirk_tigerpoint_bm_sts() seemed promising, but
> > > even when moved to early_resume it didn't do anything.
> > >
> > > I have no idea what's wrong here, so let's just disable intel_idle
> > > for these machines using a DMI match.
> >
> > Ville,
> >
> > It is great that several workarounds have been discovered.
> >
> > But it would be better to get a good idea of the root-cause
> > before permanently ignoring the problem via a new
> > black-list in the upstream kernel.
> >
> > Is it possible for you to file a bug at bugzilla.kernel.org
> > against Product: power-management; component: intel_idle?
>
> https://bugzilla.kernel.org/show_bug.cgi?id=107151

This is a shot in the dark, but...

The Atom N450 public errata sheet is nightmare fuel, and Ville's Ideapad is
running that hellish processor on outdated microcode.

http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/atom-processor-n400-specification-update.pdf

Ville, you might want to ensure you're at the latest BIOS for the S10-3t:
http://support.lenovo.com/us/en/products/laptops-and-netbooks/ideapad-s-series-netbooks/ideapad-s10-3t/downloads/DS010786

(don't bother with changelogs to make any sort of update-or-not decisions:
Lenovo changelogs are known to often be incomplete).

I also strongly recommend that you install your distro's microcode
update package: your microcode should be at revision 0x107 or higher, which
has been in the public microcode distribution since Q2/Q3-2010...

Now, as I said, this is all a shot in the dark, and it might not help at all
with the hangs. But it looks like something worth trying...

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2015-11-09 10:28:49

by Ville Syrjälä

Subject: Re: [PATCH] intel_idle: Don't use on Lenovo Ideapad S10-3t

On Sun, Nov 08, 2015 at 05:20:16PM -0200, Henrique de Moraes Holschuh wrote:
> On Wed, 04 Nov 2015, Ville Syrjälä wrote:
> > On Tue, Nov 03, 2015 at 03:04:21AM +0000, Brown, Len wrote:
> > > > Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle.
> > > > The two workarounds that seem to help are "intel_idle.max_cstate=0"
> > > > or "nohz=off highres=off".
> > > >
> > > > At first glance quirk_tigerpoint_bm_sts() seemed promising, but
> > > > even when moved to early_resume it didn't do anything.
> > > >
> > > > I have no idea what's wrong here, so let's just disable intel_idle
> > > > for these machines using a DMI match.
> > >
> > > Ville,
> > >
> > > It is great that several workarounds have been discovered.
> > >
> > > But it would be better to get a good idea of the root-cause
> > > before permanently ignoring the problem via a new
> > > black-list in the upstream kernel.
> > >
> > > Is it possible for you to file a bug at bugzilla.kernel.org
> > > against Product: power-management; component: intel_idle?
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=107151
>
> This is a shot in the dark, but...
>
> The Atom N450 public errata sheet is nightmare fuel, and Ville's Ideapad is
> running that hellish processor on outdated microcode.
>
> http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/atom-processor-n400-specification-update.pdf
>
> Ville, you might want to ensure you're at the latest BIOS for the S10-3t:
> http://support.lenovo.com/us/en/products/laptops-and-netbooks/ideapad-s-series-netbooks/ideapad-s10-3t/downloads/DS010786

Seems to require Windows, so no can do without major pain.

Considering acpi_idle works just fine, I doubt a new BIOS version would
have anything to fix this, except maybe by accident.

> (don't bother with changelogs to make any sort of update-or-not decisions:
> Lenovo changelogs are known to often be incomplete).
>
> I also strongly recommend that you install your distro's microcode
> update package: your microcode should be at revision 0x107 or higher, which
> has been in the public microcode distribution since Q2/Q3-2010...

[ 0.000000] microcode: CPU0 microcode updated early to revision 0x107, date = 2009-08-25
[ 0.990876] microcode: CPU0 sig=0x106ca, pf=0x4, revision=0x107
[ 0.990972] microcode: CPU1 sig=0x106ca, pf=0x4, revision=0x107

No help unfortunately.
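
For anyone who wants to script the same check against their own dmesg output,
here is a small sketch. The sketch_* helper names are hypothetical, and the
parser assumes the "revision=0x..." form seen in the per-CPU lines above; it
deliberately does not handle the "updated early to revision 0x107" wording.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract the hex value following "revision=" from a
 * microcode log line. Returns 1 and stores the value on success, 0 otherwise.
 */
static int sketch_parse_revision(const char *line, unsigned int *rev)
{
	const char *p = strstr(line, "revision=");

	if (!p)
		return 0;
	/* %x accepts an optional 0x prefix, so "0x107" parses as 0x107 */
	return sscanf(p, "revision=%x", rev) == 1;
}

/* Returns 1 if the line reports a revision at or above the given minimum. */
static int sketch_revision_ok(const char *line, unsigned int minimum)
{
	unsigned int rev;

	return sketch_parse_revision(line, &rev) && rev >= minimum;
}
```

Feeding it the CPU0 line above with a minimum of 0x107 returns 1, which
matches the conclusion that the microcode is already at the recommended
revision.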

>
> Now, as I said, this is all a shot in the dark, and it might not help at all
> with the hangs. But it looks like something worth trying...
>
> --
> "One disk to rule them all, One disk to find them. One disk to bring
> them all and in the darkness grind them. In the Land of Redmond
> where the shadows lie." -- The Silicon Valley Tarot
> Henrique Holschuh

--
Ville Syrjälä
Intel OTC

2016-10-27 18:10:25

by Ville Syrjälä

Subject: [PATCH] intel_idle: Don't use on Lenovo Ideapad S10-3t

From: Ville Syrjälä <[email protected]>

Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle.
The two workarounds that seem to help are "intel_idle.max_cstate=0"
or "nohz=off highres=off".

At first glance quirk_tigerpoint_bm_sts() seemed promising, but
even when moved to early_resume it didn't do anything.

I have no idea what's wrong here, so let's just disable intel_idle
for these machines using a DMI match.

I sent this patch originally about one year ago, at which time I was
asked to file a bug, which I did [1], which was mostly a waste of time
since no one actually did anything to fix the problem. In the
meantime even acpi-idle got broken and then fixed again [2], so at
least there's still a working alternative. As intel_idle has never
worked on this machine, and it looks like no one has any real
intention to fix it, I'm going to suggest *again* that we simply
disable intel_idle on this machine and get on with life.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=107151
[2] https://lkml.org/lkml/2016/5/11/238

Cc: [email protected]
Cc: Steven Rostedt <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: "Srivatsa S. Bhat" <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Paul McKenney <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: "Zhang, Rui" <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Ville Syrjälä <[email protected]>
---
drivers/idle/intel_idle.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 4466a2f969d7..db81b27f6250 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -59,6 +59,7 @@
#include <linux/notifier.h>
#include <linux/cpu.h>
#include <linux/moduleparam.h>
+#include <linux/dmi.h>
#include <asm/cpu_device_id.h>
#include <asm/intel-family.h>
#include <asm/mwait.h>
@@ -1089,6 +1090,25 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
{}
};

+static int intel_idle_disable_callback(const struct dmi_system_id *id)
+{
+	pr_debug(PREFIX "problematic system (%s), disabling\n", id->ident);
+	return 1;
+}
+
+static const struct dmi_system_id intel_idle_disable_dmi[] = {
+	{
+		/* Lenovo Ideapad S10-3t, hangs coming out of S3 */
+		.callback = intel_idle_disable_callback,
+		.ident = "Lenovo Ideapad S10-3t",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+			DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo Ideapad S10-3t"),
+		},
+	},
+	{}
+};
+
/*
* intel_idle_probe()
*/
@@ -1121,6 +1141,9 @@ static int __init intel_idle_probe(void)
!mwait_substates)
return -ENODEV;

+	if (dmi_check_system(intel_idle_disable_dmi))
+		return -ENODEV;
+
pr_debug(PREFIX "MWAIT substates: 0x%x\n", mwait_substates);

icpu = (const struct idle_cpu *)id->driver_data;
--
2.7.4

2016-10-27 22:25:34

by Thomas Gleixner

Subject: Re: [PATCH] intel_idle: Don't use on Lenovo Ideapad S10-3t

On Thu, 27 Oct 2016, [email protected] wrote:

> From: Ville Syrjälä <[email protected]>
>
> Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle.
> The two workarounds that seem to help are "intel_idle.max_cstate=0"
> or "nohz=off highres=off".
>
> At first glance quirk_tigerpoint_bm_sts() seemed promising, but
> even when moved to early_resume it didn't do anything.
>
> I have no idea what's wrong here, so let's just disable intel_idle
> for these machines using a DMI match.
>
> I sent this patch originally about one year ago, at which time I was
> asked to file a bug, which I did [1], which was mostly a waste of time
> since no one actually did anything to fix the problem. In the
> meantime even acpi-idle got broken and then fixed again [2], so at
> least there's still a working alternative. As intel_idle has never
> worked on this machine, and it looks like no one has any real
> intention to fix it, I'm going to suggest *again* that we simply
> disable intel_idle on this machine and get on with life.

Please put that patch on hold. We have some new data on this.

Thanks,

tglx