2012-05-28 01:53:31

by Dave Hansen

Subject: Suspend/resume regressions on Lenovo S10-3

I have a Lenovo S10-3 Atom netbook. It's always had some amount of
trouble working with the intel_idle driver, so I usually compile that
out and use the acpi one. However, just after 3.1, suspend/resume broke.
'echo mem > /sys/power/state' would hang before suspending. I bisected
it down to the commits around:

e978aa7d7d57d04eb5f88a7507c4fb98577def77 / v3.1-1-ge978aa7

by Deepthi. But, current mainline (v3.4-07644-g07acfc2) hangs with a
different symptom: it suspends, but hangs on resume from suspend. I
think _that_ delta in the behavior was caused by:

3439a8da16bcad6b0982ece938c9f8299bb53584

ACPI / cpuidle: Remove acpi_idle_suspend (to fix suspend
regression)

It's a bit of a pain to bisect these two different things in parallel.
I was trying to tell git bisect 'good' on working suspend/resume, 'bad'
on the hang during resume, and 'skip' on the hangs _during_ suspend. 83
kernels in, I'm not sure that's working very well. :)
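
A minimal sketch of that good/bad/skip policy, written as the exit-code mapping that `git bisect run` understands (0 for good, 125 for skip, other codes from 1 to 127 for bad). The enum names and the choice to treat a boot hang as "skip" are assumptions made for illustration; the bisect in this report was driven by hand, since a hung machine cannot report its own result.

```c
/* Exit codes that `git bisect run` interprets as good / bad / skip. */
enum { BISECT_GOOD = 0, BISECT_BAD = 1, BISECT_SKIP = 125 };

/* Outcomes seen while hunting the resume hang (hypothetical names). */
enum outcome {
	WORKS,            /* suspend and resume both succeed */
	HANGS_ON_RESUME,  /* the regression being bisected */
	HANGS_ON_SUSPEND, /* a different bug that masks the one of interest */
	HANGS_ON_BOOT     /* untestable kernel; treated as skip (assumption) */
};

/* Map an observed outcome to the verdict given to git bisect. */
static int bisect_exit_code(enum outcome o)
{
	switch (o) {
	case WORKS:
		return BISECT_GOOD;
	case HANGS_ON_RESUME:
		return BISECT_BAD;
	default:
		/* Cannot tell whether resume would work: skip the commit. */
		return BISECT_SKIP;
	}
}
```

With two regressions layered on top of each other, most commits in the affected range land in the skip bucket, which is why the number of remaining revisions shrinks so slowly.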

Deepthi, do you have any idea why your patches broke me in the first
place? Perhaps we should fix that regression first before we go on and
try to figure out what changed to let it suspend again, but break later.

Here's my record of the bisect in case anyone's interested:

v3.4-7644-g07acfc2 - hangs on resume
v3.2-9399-g507a03c - hangs on resume
v3.2-3376-g9879326 - hangs on resume
v3.2-rc4-1312-gdfd56b8 - hangs on resume
v3.2-rc2-580-g8054321 - hangs on resume
v3.2-rc2-576-g4c6e869 - hangs on resume!!!!
v3.1-10555-ge0d6511 - works!
v3.1-10181-g32aaeff - works fine!!! - calling good
v3.1-7844-g6987427 - works!
v3.1.0-00009-ge751b75 - works!
v3.1-5470-g4713e96 - works!!
v3.1-9413-ge45a618 - works!
v3.1-10601-g83dbb15 - works!
v3.1-rc3-115-g8d36ffa - works!
v3.0-7177-g3acc847 - works
v3.1-7712-gcd834fa - works!
v3.1-126-g586e46e - works!
v3.1-rc9-17-g8d73521 - works!
v3.1-rc1-546-g86f14df - works!
v3.1-rc9-2-gb02b917 - works!
v3.1-5478-gfb4431b - works!
v3.1-5494-gdd76986 - works!
v3.1-rc9-3-g3047454 - works!
v3.1-7715-g4907602 - works!!
v3.2-rc2-570-ge170d18 - hangs on resume!!!!!!
v3.1-rc3-129-g4dc0152 - works
v3.2-rc1-284-g52e4c2a - 52e4c2a05... - hangs on resume!!!!!!
v3.1-rc9-146-g7928631 - works!
v3.1-7720-g78c87e8 - works!!
v3.1-8575-gc4e2d24 - works!
v3.2-rc1-28-gf30a648 - hangs on suspend
v3.2-rc1-282-g3439a8d - hangs on resume
v3.2-rc1-44-ga7c36fd - hangs on suspend
v3.2-rc1-4-gc0d1831 - hangs on suspend
v3.2-rc1-37-g0007fa2 - hangs on suspend
v3.2-rc1-15-g6688a4d - hangs on suspend
v3.2-rc1-11-g2f451d2 - hangs on suspend
v3.1-rc1-549-g19940b3 - hang during boot!!!!!!!!!!
v3.2-rc1-10-g9e226b4 - hangs on suspend
v3.2-rc1-98-g42a0ddc - hangs on suspend
v3.2-rc1-43-g8f3f1c9 - hangs on suspend
v3.2-rc1-13-gaf6d9fe - hangs on suspend
v3.2-rc1-4-gc288bf2 - hangs on suspend - skip
v3.2-rc1-41-ga4c9e2e - hangs on suspend - skip
v3.2-rc1-3-gee9f7ef - hangs on suspend - skip
v3.2-rc1-1-g816af3b - hangs on suspend - skip
v3.2-rc1-117-gab8fe93 - hangs on suspend - skip
v3.2-rc1-4-g1c8ee73 - hangs on suspend - skip
v3.2-rc1-282-g3439a8d - 3439a8da1... - hangs on suspend - skip
v3.2-rc1-44-ga7c36fd - a7c36fd8c5... - hangs on suspend - skip
v3.2-rc1-189-g87618e0 - hangs on suspend - skip
v3.2-rc1-3-g3b8ce3a - hangs on suspend - skip
v3.2-rc1-190-gf2ee442 - hangs on suspend - skip
v3.2-rc1-2-g272e42b - hangs on suspend - skip
v3.2-rc1-29-g2690e21 - hangs on suspend - skip
v3.2-rc1-142-gf28ad3b - hangs on suspend - skip
v3.2-rc1-12-g091264f - hangs on suspend - skip
v3.2-rc1-3-g3ec7215 - hangs on suspend - skip
v3.2-rc1-6-g2d5fcc9 - hangs on suspend - skip
## recheck v3.2-rc1-281-g5b34b08 - hangs on suspend - skip
v3.2-rc1-6-g55c0008 - hangs on suspend - skip
v3.2-rc1-162-gfe10e6f - hangs on suspend - skip # 216 revisions left
v3.2-rc1-42-gbbe26ff - hangs on suspend - skip # 216 revisions left
v3.1-1-ge978aa7 - hangs on suspend - skip
v3.2-rc1-182-gc1f4246 - hangs on suspend - skip
v3.2-rc1-8-g12b6d9d - hangs on suspend - skip
v3.2-rc1-3-g1b929995 - hangs on suspend - skip
v3.2-rc1-5-g1dd6c07 - hangs on suspend - skip
v3.2-rc1-27-gd890d73 - hangs on suspend - skip
v3.1-10675-ga6f05b9 - hangs on suspend - skip
v3.2-rc1-40-g10b391b - hangs on suspend - skip
v3.2-rc1-5-gf7f9bdf - hangs on suspend - skip
v3.1-16-gefb9058 - hangs on suspend
v3.2-rc1-182-gc1f4246 - hangs on suspend - calling bad
v3.2-rc1-3-g3b8ce3a - hangs on suspend - calling bad
v3.1-10623-g50e6963 - hangs on suspend - calling bad
v3.1-00003-g4202735 - hangs on suspend
v3.2-rc1-142-gf28ad3b - hangs on suspend
v3.2-rc1-17-g4beb116 - hangs on suspend
v3.1-10692-g54a0f91 - hangs on suspend
v3.1-10620-g1944ce6 - hangs on suspend
v3.1-4-g46bcfad - hangs on suspend


2012-06-06 12:51:31

by Deepthi Dharwar

Subject: Re: Suspend/resume regressions on Lenovo S10-3

On 05/28/2012 07:23 AM, Dave Hansen wrote:

> I have a Lenovo S10-3 Atom netbook. It's always had some amount of
> trouble working with the intel_idle driver, so I usually compile that
> out an use the acpi one. However, just after 3.1, suspend/resume broke.
> 'echo mem > /sys/power/state' would hang before suspending. I bisected
> it down to the commits around:
>
> e978aa7d7d57d04eb5f88a7507c4fb98577def77 / v3.1-1-ge978aa7
>
> by Deepthi. But, current mainline (v3.4-07644-g07acfc2) hangs with a
> different symptom: it suspends, but hangs on resume from suspend. I
> think _that_ delta in the behavior was caused by:
>
> 3439a8da16bcad6b0982ece938c9f8299bb53584
>
> ACPI / cpuidle: Remove acpi_idle_suspend (to fix suspend
> regression)
>
> It's a bit of a pain to bisect these two different things in parallel.
> I was trying to tell git bisect 'good' on working suspend/resume, 'bad'
> on the hang during resume, and 'skip' on the hangs _during_ suspend. 83
> kernels in, I'm not sure that's working very well. :)
>
> Deepthi, do you have any idea why your patches broke me in the first
> place? Perhaps we should fix that regression first before we go on and
> try to figure out what changed to let it suspend again, but break later.


Hi Dave,

Sorry about my patches breaking your suspend-resume.

I tried building and booting a 3.1 kernel with my patch set
to reproduce the failure, and I could clearly see that suspend
was not happening. It turns out to be a bug in the first patch
of the global registration series I submitted earlier:

e978aa7d7d57d04eb5f88a7507c4fb98577def77 / v3.1-1-ge978aa7

The following patch fixes the suspend issues seen on my
laptop (a Lenovo T420 booting with acpi_idle enabled) that
were caused by the earlier cpuidle cleanup.
Can you please give this a try
on top of my patch set (without Rafael's fix)
and see if it fixes the problem for you?
I am not reverting the acpi_idle_suspend flag, and
hopefully it should resume fine too.

---

This patch fixes the suspend-resume issue seen in the 3.1 kernel
series when using acpi_idle_driver, which was introduced by the
cpuidle global registration cleanup.
When the acpi_idle_suspend flag was set (during suspend),
interrupts were not being enabled in the acpi_idle_enter_bm()
routine, which caused the system to hang.


Signed-off-by: Deepthi Dharwar <[email protected]>

---
drivers/acpi/processor_idle.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 24fe3af..6e35293 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -895,8 +895,9 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
 	if (unlikely(!pr))
 		return -EINVAL;

-
 	if (acpi_idle_suspend) {
+		local_irq_disable();
+		local_irq_enable();
 		cpu_relax();
 		return -EINVAL;
 	}

Cheers,
Deepthi

2012-06-06 13:37:57

by Rafael J. Wysocki

Subject: Re: Suspend/resume regressions on Lenovo S10-3

On Wednesday, June 06, 2012, Deepthi Dharwar wrote:
> On 05/28/2012 07:23 AM, Dave Hansen wrote:
>
> > I have a Lenovo S10-3 Atom netbook. It's always had some amount of
> > trouble working with the intel_idle driver, so I usually compile that
> > out an use the acpi one. However, just after 3.1, suspend/resume broke.
> > 'echo mem > /sys/power/state' would hang before suspending. I bisected
> > it down to the commits around:
> >
> > e978aa7d7d57d04eb5f88a7507c4fb98577def77 / v3.1-1-ge978aa7
> >
> > by Deepthi. But, current mainline (v3.4-07644-g07acfc2) hangs with a
> > different symptom: it suspends, but hangs on resume from suspend. I
> > think _that_ delta in the behavior was caused by:
> >
> > 3439a8da16bcad6b0982ece938c9f8299bb53584
> >
> > ACPI / cpuidle: Remove acpi_idle_suspend (to fix suspend
> > regression)
> >
> > It's a bit of a pain to bisect these two different things in parallel.
> > I was trying to tell git bisect 'good' on working suspend/resume, 'bad'
> > on the hang during resume, and 'skip' on the hangs _during_ suspend. 83
> > kernels in, I'm not sure that's working very well. :)
> >
> > Deepthi, do you have any idea why your patches broke me in the first
> > place? Perhaps we should fix that regression first before we go on and
> > try to figure out what changed to let it suspend again, but break later.
>
>
> Hi Dave,
>
> Sorry about my patches breaking your suspend-resume.
>
> I, basically tried out building and booting 3.1 kernel with
> my patch set to reproduce the failure. I could clearly
> see suspend not happening. It turns out to be
> a bug with my first patch in global registration
> series submitted earlier.
>
> e978aa7d7d57d04eb5f88a7507c4fb98577def77 / v3.1-1-ge978aa7
>
> The following patch, fixes the suspend issues
> seen on my laptop due to earlier cpuidle cleanup
> (Lenevo T420 booting with acpi_idle enabled).
> Can you please give this a try
> on top of my patch set (without Rafael's fix)
> and see if it fixes the problem for you.
> I am not reverting acpi_idle_suspend flag and
> hopefully it should resume fine too.
>
> ---
>
> This patch fixes suspend-resume issue seen in the kernel 3.1
> series using acpi_idle_driver because of cpuidle global
> registration cleanup.
> Here, when acpi_idle_suspend flag was set ( during suspend)
> the interrupts were not getting enabled in acpi_idle_enter_bm()
> routine which was causing the system to hang.
>
>
> Signed-off-by: Deepthi Dharwar <[email protected]>
>
> ---
> drivers/acpi/processor_idle.c | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index 24fe3af..6e35293 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -895,8 +895,9 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
> if (unlikely(!pr))
> return -EINVAL;
>
> -
> if (acpi_idle_suspend) {
> + local_irq_disable();
> + local_irq_enable();
> cpu_relax();
> return -EINVAL;
> }

May I say this is ugly? Why can't we track the status of interrupts
properly here?

Rafael

2012-06-06 15:09:30

by Alan Stern

Subject: Re: [linux-pm] Suspend/resume regressions on Lenovo S10-3

On Wed, 6 Jun 2012, Rafael J. Wysocki wrote:

> > --- a/drivers/acpi/processor_idle.c
> > +++ b/drivers/acpi/processor_idle.c
> > @@ -895,8 +895,9 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
> > if (unlikely(!pr))
> > return -EINVAL;
> >
> > -
> > if (acpi_idle_suspend) {
> > + local_irq_disable();
> > + local_irq_enable();
> > cpu_relax();
> > return -EINVAL;
> > }
>
> May I say this is ugly? Why can't we track the status of interrupts
> properly here?

It's not just ugly; it's illogical. What reason could there possibly
be for disabling interrupts and then enabling them again without doing
anything in between?

Alan Stern

2012-06-06 15:12:55

by Dave Hansen

Subject: Re: Suspend/resume regressions on Lenovo S10-3

On 06/06/2012 05:51 AM, Deepthi Dharwar wrote:
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index 24fe3af..6e35293 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -895,8 +895,9 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
> if (unlikely(!pr))
> return -EINVAL;
>
> -
> if (acpi_idle_suspend) {
> + local_irq_disable();
> + local_irq_enable();
> cpu_relax();
> return -EINVAL;
> }

Heh, that is quite the hack. :)

However, it does at least work around my problem: I can suspend again
with 46bcfad + your patch.

2012-06-06 16:45:54

by Deepthi Dharwar

Subject: Re: Suspend/resume regressions on Lenovo S10-3

On 06/06/2012 07:13 PM, Rafael J. Wysocki wrote:

> On Wednesday, June 06, 2012, Deepthi Dharwar wrote:
>> On 05/28/2012 07:23 AM, Dave Hansen wrote:
>>
>>> I have a Lenovo S10-3 Atom netbook. It's always had some amount of
>>> trouble working with the intel_idle driver, so I usually compile that
>>> out an use the acpi one. However, just after 3.1, suspend/resume broke.
>>> 'echo mem > /sys/power/state' would hang before suspending. I bisected
>>> it down to the commits around:
>>>
>>> e978aa7d7d57d04eb5f88a7507c4fb98577def77 / v3.1-1-ge978aa7
>>>
>>> by Deepthi. But, current mainline (v3.4-07644-g07acfc2) hangs with a
>>> different symptom: it suspends, but hangs on resume from suspend. I
>>> think _that_ delta in the behavior was caused by:
>>>
>>> 3439a8da16bcad6b0982ece938c9f8299bb53584
>>>
>>> ACPI / cpuidle: Remove acpi_idle_suspend (to fix suspend
>>> regression)
>>>
>>> It's a bit of a pain to bisect these two different things in parallel.
>>> I was trying to tell git bisect 'good' on working suspend/resume, 'bad'
>>> on the hang during resume, and 'skip' on the hangs _during_ suspend. 83
>>> kernels in, I'm not sure that's working very well. :)
>>>
>>> Deepthi, do you have any idea why your patches broke me in the first
>>> place? Perhaps we should fix that regression first before we go on and
>>> try to figure out what changed to let it suspend again, but break later.
>>
>>
>> Hi Dave,
>>
>> Sorry about my patches breaking your suspend-resume.
>>
>> I, basically tried out building and booting 3.1 kernel with
>> my patch set to reproduce the failure. I could clearly
>> see suspend not happening. It turns out to be
>> a bug with my first patch in global registration
>> series submitted earlier.
>>
>> e978aa7d7d57d04eb5f88a7507c4fb98577def77 / v3.1-1-ge978aa7
>>
>> The following patch, fixes the suspend issues
>> seen on my laptop due to earlier cpuidle cleanup
>> (Lenevo T420 booting with acpi_idle enabled).
>> Can you please give this a try
>> on top of my patch set (without Rafael's fix)
>> and see if it fixes the problem for you.
>> I am not reverting acpi_idle_suspend flag and
>> hopefully it should resume fine too.
>>
>> ---
>>
>> This patch fixes suspend-resume issue seen in the kernel 3.1
>> series using acpi_idle_driver because of cpuidle global
>> registration cleanup.
>> Here, when acpi_idle_suspend flag was set ( during suspend)
>> the interrupts were not getting enabled in acpi_idle_enter_bm()
>> routine which was causing the system to hang.
>>
>>
>> Signed-off-by: Deepthi Dharwar <[email protected]>
>>
>> ---
>> drivers/acpi/processor_idle.c | 3 ++-
>> 1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
>> index 24fe3af..6e35293 100644
>> --- a/drivers/acpi/processor_idle.c
>> +++ b/drivers/acpi/processor_idle.c
>> @@ -895,8 +895,9 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
>> if (unlikely(!pr))
>> return -EINVAL;
>>
>> -
>> if (acpi_idle_suspend) {
>> + local_irq_disable();
>> + local_irq_enable();
>> cpu_relax();
>> return -EINVAL;
>> }
>
> May I say this is ugly? Why can't we track the status of interrupts
> properly here?



I agree. Just the local_irq_enable() call should do the trick.
Once the CPU enters idle via the cpu_idle() call, IRQs are disabled. We
need to enable them if acpi_idle_suspend is set.
Otherwise we'll see a hang during suspend.

The patch that was posted was just a workaround.
I shall do some more testing with just the local_irq_enable() call
and then post a fix that can be considered
for inclusion.
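
A minimal user-space sketch of the interrupt-state contract described above, assuming the idle loop calls into the driver with IRQs disabled and expects them enabled on return. Every `model_*` name is a hypothetical stand-in rather than a kernel symbol, and the normal (non-suspend) path is simplified to a single re-enable; this illustrates the reasoning and is not the patch that will be posted.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-ins for kernel state (hypothetical names, user space only). */
static bool model_irqs_enabled;
static bool model_acpi_idle_suspend;

static void model_local_irq_disable(void) { model_irqs_enabled = false; }
static void model_local_irq_enable(void)  { model_irqs_enabled = true; }

/*
 * Models the early-return path of an idle-enter routine.
 * enable_on_suspend_path selects between the buggy behaviour (false)
 * and the enable-only fix discussed in this thread (true).
 */
static int model_idle_enter(bool enable_on_suspend_path)
{
	/* The idle loop enters the driver with interrupts disabled. */
	model_local_irq_disable();

	if (model_acpi_idle_suspend) {
		if (enable_on_suspend_path)
			model_local_irq_enable();
		/* Without the enable above, IRQs stay off after return. */
		return -EINVAL;
	}

	/* Normal path (simplified): idle, then re-enable on the way out. */
	model_local_irq_enable();
	return 0;
}
```

Calling `model_idle_enter(false)` with `model_acpi_idle_suspend` set leaves `model_irqs_enabled` false on return, which corresponds to the state that hangs suspend; `model_idle_enter(true)` returns the same error code with interrupts enabled.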

Cheers
Deepthi

2012-06-08 06:25:21

by Rafael J. Wysocki

Subject: Re: Suspend/resume regressions on Lenovo S10-3

On Wednesday, June 06, 2012, Deepthi Dharwar wrote:
> On 06/06/2012 07:13 PM, Rafael J. Wysocki wrote:
>
> > On Wednesday, June 06, 2012, Deepthi Dharwar wrote:
> >> On 05/28/2012 07:23 AM, Dave Hansen wrote:
> >>
> >>> I have a Lenovo S10-3 Atom netbook. It's always had some amount of
> >>> trouble working with the intel_idle driver, so I usually compile that
> >>> out an use the acpi one. However, just after 3.1, suspend/resume broke.
> >>> 'echo mem > /sys/power/state' would hang before suspending. I bisected
> >>> it down to the commits around:
> >>>
> >>> e978aa7d7d57d04eb5f88a7507c4fb98577def77 / v3.1-1-ge978aa7
> >>>
> >>> by Deepthi. But, current mainline (v3.4-07644-g07acfc2) hangs with a
> >>> different symptom: it suspends, but hangs on resume from suspend. I
> >>> think _that_ delta in the behavior was caused by:
> >>>
> >>> 3439a8da16bcad6b0982ece938c9f8299bb53584
> >>>
> >>> ACPI / cpuidle: Remove acpi_idle_suspend (to fix suspend
> >>> regression)
> >>>
> >>> It's a bit of a pain to bisect these two different things in parallel.
> >>> I was trying to tell git bisect 'good' on working suspend/resume, 'bad'
> >>> on the hang during resume, and 'skip' on the hangs _during_ suspend. 83
> >>> kernels in, I'm not sure that's working very well. :)
> >>>
> >>> Deepthi, do you have any idea why your patches broke me in the first
> >>> place? Perhaps we should fix that regression first before we go on and
> >>> try to figure out what changed to let it suspend again, but break later.
> >>
> >>
> >> Hi Dave,
> >>
> >> Sorry about my patches breaking your suspend-resume.
> >>
> >> I, basically tried out building and booting 3.1 kernel with
> >> my patch set to reproduce the failure. I could clearly
> >> see suspend not happening. It turns out to be
> >> a bug with my first patch in global registration
> >> series submitted earlier.
> >>
> >> e978aa7d7d57d04eb5f88a7507c4fb98577def77 / v3.1-1-ge978aa7
> >>
> >> The following patch, fixes the suspend issues
> >> seen on my laptop due to earlier cpuidle cleanup
> >> (Lenevo T420 booting with acpi_idle enabled).
> >> Can you please give this a try
> >> on top of my patch set (without Rafael's fix)
> >> and see if it fixes the problem for you.
> >> I am not reverting acpi_idle_suspend flag and
> >> hopefully it should resume fine too.
> >>
> >> ---
> >>
> >> This patch fixes suspend-resume issue seen in the kernel 3.1
> >> series using acpi_idle_driver because of cpuidle global
> >> registration cleanup.
> >> Here, when acpi_idle_suspend flag was set ( during suspend)
> >> the interrupts were not getting enabled in acpi_idle_enter_bm()
> >> routine which was causing the system to hang.
> >>
> >>
> >> Signed-off-by: Deepthi Dharwar <[email protected]>
> >>
> >> ---
> >> drivers/acpi/processor_idle.c | 3 ++-
> >> 1 files changed, 2 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> >> index 24fe3af..6e35293 100644
> >> --- a/drivers/acpi/processor_idle.c
> >> +++ b/drivers/acpi/processor_idle.c
> >> @@ -895,8 +895,9 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
> >> if (unlikely(!pr))
> >> return -EINVAL;
> >>
> >> -
> >> if (acpi_idle_suspend) {
> >> + local_irq_disable();
> >> + local_irq_enable();
> >> cpu_relax();
> >> return -EINVAL;
> >> }
> >
> > May I say this is ugly? Why can't we track the status of interrupts
> > properly here?
>
>
>
> I agree. Just the irq_enable call should do the trick.
> Once the cpu enters idle via cpu_idle call, irqs are disabled. We
> need to enable them if acpi_idle_suspend is set.
> Else we'll see a hang during suspend.
>
> The patch that was posted was just for workaround patch.
> I shall do some more testing with just irq_enable call
> and then shall post the fix which can be considered
> for inclusion.

I see. OK, then, looking forward to seeing the final patch. :-)

Thanks,
Rafael

2012-06-08 06:42:24

by Srivatsa S. Bhat

Subject: Re: Suspend/resume regressions on Lenovo S10-3

On 06/06/2012 07:13 PM, Rafael J. Wysocki wrote:

> On Wednesday, June 06, 2012, Deepthi Dharwar wrote:
>> On 05/28/2012 07:23 AM, Dave Hansen wrote:
>>
>>> I have a Lenovo S10-3 Atom netbook. It's always had some amount of
>>> trouble working with the intel_idle driver, so I usually compile that
>>> out an use the acpi one. However, just after 3.1, suspend/resume broke.
>>> 'echo mem > /sys/power/state' would hang before suspending. I bisected
>>> it down to the commits around:
>>>
>>> e978aa7d7d57d04eb5f88a7507c4fb98577def77 / v3.1-1-ge978aa7
>>>
>>> by Deepthi. But, current mainline (v3.4-07644-g07acfc2) hangs with a
>>> different symptom: it suspends, but hangs on resume from suspend. I
>>> think _that_ delta in the behavior was caused by:
>>>
>>> 3439a8da16bcad6b0982ece938c9f8299bb53584
>>>
>>> ACPI / cpuidle: Remove acpi_idle_suspend (to fix suspend
>>> regression)
>>>
>>> It's a bit of a pain to bisect these two different things in parallel.
>>> I was trying to tell git bisect 'good' on working suspend/resume, 'bad'
>>> on the hang during resume, and 'skip' on the hangs _during_ suspend. 83
>>> kernels in, I'm not sure that's working very well. :)
>>>
>>> Deepthi, do you have any idea why your patches broke me in the first
>>> place? Perhaps we should fix that regression first before we go on and
>>> try to figure out what changed to let it suspend again, but break later.
>>
>>
>> Hi Dave,
>>
>> Sorry about my patches breaking your suspend-resume.
>>
>> I, basically tried out building and booting 3.1 kernel with
>> my patch set to reproduce the failure. I could clearly
>> see suspend not happening. It turns out to be
>> a bug with my first patch in global registration
>> series submitted earlier.
>>
>> e978aa7d7d57d04eb5f88a7507c4fb98577def77 / v3.1-1-ge978aa7
>>
>> The following patch, fixes the suspend issues
>> seen on my laptop due to earlier cpuidle cleanup
>> (Lenevo T420 booting with acpi_idle enabled).
>> Can you please give this a try
>> on top of my patch set (without Rafael's fix)
>> and see if it fixes the problem for you.
>> I am not reverting acpi_idle_suspend flag and
>> hopefully it should resume fine too.
>>
>> ---
>>
>> This patch fixes suspend-resume issue seen in the kernel 3.1
>> series using acpi_idle_driver because of cpuidle global
>> registration cleanup.
>> Here, when acpi_idle_suspend flag was set ( during suspend)
>> the interrupts were not getting enabled in acpi_idle_enter_bm()
>> routine which was causing the system to hang.
>>
>>
>> Signed-off-by: Deepthi Dharwar <[email protected]>
>>
>> ---
>> drivers/acpi/processor_idle.c | 3 ++-
>> 1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
>> index 24fe3af..6e35293 100644
>> --- a/drivers/acpi/processor_idle.c
>> +++ b/drivers/acpi/processor_idle.c
>> @@ -895,8 +895,9 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
>> if (unlikely(!pr))
>> return -EINVAL;
>>
>> -
>> if (acpi_idle_suspend) {
>> + local_irq_disable();
>> + local_irq_enable();
>> cpu_relax();
>> return -EINVAL;
>> }
>
> May I say this is ugly? Why can't we track the status of interrupts
> properly here?
>


Btw, Deepthi, when you are modifying this to keep track of interrupt enabled/
disabled status, I think it would be worthwhile to also add a WARN_ON() in
cpu_idle() inside arch/x86/kernel/process.c, just like how ARM and sh do it.

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 735279e..1ca7e1a 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -459,6 +459,9 @@ void cpu_idle(void)
 			if (cpuidle_idle_call())
 				pm_idle();

+			/* The idle routine must return with IRQs enabled. */
+			WARN_ON(irqs_disabled());
+
 			rcu_idle_exit();
 			start_critical_timings();

[If we had done this earlier, we could have caught the bug right when the
patch went in :-)]
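
To illustrate how such a check catches the problem early, here is a small user-space model, assuming an idle loop that disables interrupts before calling the idle routine. `SIM_WARN_ON` and the other `sim_*` names are hypothetical stand-ins for the kernel's `WARN_ON()`, `irqs_disabled()` and the idle routines; none of this is kernel code.

```c
#include <stdbool.h>
#include <stdio.h>

static bool sim_irqs_enabled;
static int sim_warn_count;

/* Stand-in for WARN_ON(): report and count, but keep running. */
#define SIM_WARN_ON(cond) \
	((cond) ? (sim_warn_count++, \
		   fprintf(stderr, "WARNING: %s\n", #cond), 1) : 0)

static bool sim_irqs_disabled(void) { return !sim_irqs_enabled; }

/* A well-behaved idle routine re-enables interrupts before returning. */
static void sim_idle_good(void)  { sim_irqs_enabled = true; }

/* A buggy idle routine returns early and forgets to re-enable them. */
static void sim_idle_buggy(void) { }

/* One pass of the idle loop; returns nonzero if the check fired. */
static int sim_cpu_idle_once(void (*idle)(void))
{
	sim_irqs_enabled = false;	/* the loop disables IRQs first */
	idle();
	/* The idle routine must return with IRQs enabled. */
	return SIM_WARN_ON(sim_irqs_disabled());
}
```

Running `sim_cpu_idle_once(sim_idle_buggy)` trips the warning on the first pass, whereas without the check the only symptom would be a hang some time later.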

Regards,

Srivatsa S. Bhat

2012-06-11 13:32:22

by Srivatsa S. Bhat

Subject: Re: Suspend/resume regressions on Lenovo S10-3

Hi Dave,

On 06/06/2012 06:21 PM, Deepthi Dharwar wrote:

> On 05/28/2012 07:23 AM, Dave Hansen wrote:
>
>> I have a Lenovo S10-3 Atom netbook. It's always had some amount of
>> trouble working with the intel_idle driver, so I usually compile that
>> out an use the acpi one.


What problem did you face with the intel_idle driver? Is it suspend/resume
related?

I see a comment in drivers/idle/intel_idle.c such as:

/*
* Known limitations
* [...]
*
* ACPI has a .suspend hack to turn off deep c-statees during suspend
* to avoid complications with the lapic timer workaround.
* Have not seen issues with suspend, but may need same workaround here.
*
*/

So, if you are facing suspend issues with the intel_idle driver, we probably
need to add that same workaround here as well, to make it work.

Please let us know what problem you are facing.

Regards,
Srivatsa S. Bhat

--
Regards,
Srivatsa S. Bhat
IBM Linux Technology Center

2012-06-11 14:58:02

by Dave Hansen

Subject: Re: Suspend/resume regressions on Lenovo S10-3

On 06/11/2012 06:31 AM, Srivatsa S. Bhat wrote:
> Hi Dave,
>
> On 06/06/2012 06:21 PM, Deepthi Dharwar wrote:
>
>> On 05/28/2012 07:23 AM, Dave Hansen wrote:
>>
>>> I have a Lenovo S10-3 Atom netbook. It's always had some amount of
>>> trouble working with the intel_idle driver, so I usually compile that
>>> out an use the acpi one.
>
>
> What problem did you face with intel_idle driver? Is it suspend/resume
> related?

With intel_idle, sometimes it won't boot; other times it won't suspend/resume:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/674075

I'm actually not sure how it behaves on current kernels.