2008-10-16 07:51:39

by Zhao, Yakui

Subject: Suspend/resume regression between 2.6.26 and 2.6.27-rc1

On Fri, 2008-10-10 at 14:41 +0200, Rafael J. Wysocki wrote:
> On Friday, 10 of October 2008, Zhang Rui wrote:
> > Hi, len,
> >
> > this is the ACPI regression test result based on the latest ACPI test branch.
> >
> > 1. on Acer:(AMD CPU, VIA chipset. 64 bit kernel)
> > When doing S3 test, after pressing the power button, the system
> > reboots instead of resuming. But if S3 is done after S4, the system can
> > resume very well after pressing power button or using the RTC
> > alarm.
> > Note that this is an upstream regression as it can be reproduced on
> > linus' tree.
> > Yakui is investigating this issue.
>
> We had some reports of the second suspend (S3) failure too, where the second
> attempt to suspend to RAM (or to resume from it) failed after a successful
> one. I wonder if that's related.
We ran some suspend/resume tests on an Acer laptop (AMD CPU, VIA
chipset, 64-bit kernel).
The system reboots instead of resuming when the power button is pressed
after the box enters the S3 state. But if S3 is done after S4, the system
resumes fine when the power button is pressed. This issue can be
reproduced on the upstream kernel.

After further testing we can confirm that this is a regression. The
2.6.26 kernel works well on this box, but 2.6.27-rc1 fails.

Using git-bisect, we confirmed that the commit
736f12bff9d9e7b4e895c64f73b190c8383fc2a1 is good.
>commit 736f12bff9d9e7b4e895c64f73b190c8383fc2a1
>Author: Glauber Costa <[email protected]>
> Date: Tue May 27 20:14:51 2008 -0700
>x86: don't use gdt_page openly.

And the commit
55f262391a2365d657a00ed68edd1a51bca66af5 is bad.
>commit 55f262391a2365d657a00ed68edd1a51bca66af5
>Author: Yinghai Lu <[email protected]>
>Date: Wed Jun 25 17:54:23 2008 -0700
>x86: rename setup_32.c to setup.c

The patches between the above two commits are x86-related. When
bisecting between these two commits, we get either compile errors
(for example, some files don't exist) or a kernel panic. So we can't
continue using git-bisect to identify which commit caused the
regression.
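
For reference, a range that contains unbuildable commits can sometimes still
be narrowed by marking those commits as untestable, which is what
`git bisect skip` does (a script driven by `git bisect run` signals the same
thing with exit code 125). The sketch below is only an illustration of that
idea, not git's actual algorithm: the function name `bisect_with_skip`, the
verdict strings and the commit labels are all hypothetical, and it assumes a
linear history in which every commit from the first bad one onward is bad.

```python
from typing import Callable, List

GOOD, BAD, SKIP = "good", "bad", "skip"  # hypothetical verdict labels


def bisect_with_skip(commits: List[str], test: Callable[[str], str]) -> List[str]:
    """Narrow down the first bad commit when some commits cannot be tested.

    `commits` is ordered oldest to newest. The parent of commits[0] is
    assumed to be known good and commits[-1] is assumed to be known bad.
    Returns the remaining candidates for the first bad commit.
    """
    lo = -1                 # index of the newest commit known to be good
    hi = len(commits) - 1   # index of the oldest commit known to be bad
    skipped = set()         # indices that could not be built or booted

    while True:
        # Commits strictly between the good and bad bounds that we can still try.
        untested = [i for i in range(lo + 1, hi) if i not in skipped]
        if not untested:
            break
        mid = (lo + hi) // 2
        # Prefer the testable commit closest to the midpoint.
        idx = min(untested, key=lambda i: abs(i - mid))
        verdict = test(commits[idx])
        if verdict == GOOD:
            lo = idx
        elif verdict == BAD:
            hi = idx
        else:
            skipped.add(idx)

    # Everything after the newest good commit, up to and including the
    # oldest bad one, is still a candidate.
    return commits[lo + 1 : hi + 1]
```

When untestable commits sit next to the real culprit, the result is a short
list of candidates rather than a single commit, which is also how git reports
it when skipped commits prevent a unique answer.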

Best regards.
Yakui.

> >


2008-10-16 09:40:12

by Alan Jenkins

Subject: Re: Suspend/resume regression between 2.6.26 and 2.6.27-rc1

Zhao Yakui wrote:
> On Fri, 2008-10-10 at 14:41 +0200, Rafael J. Wysocki wrote:
>> On Friday, 10 of October 2008, Zhang Rui wrote:
>>> Hi, len,
>>>
>>> this is the ACPI regression test result based on the latest ACPI test branch.
>>>
>>> 1. on Acer:(AMD CPU, VIA chipset. 64 bit kernel)
>>> When doing S3 test, after pressing the power button, the system
>>> reboots instead of resuming. But if S3 is done after S4, the system can
>>> resume very well after pressing power button or using the RTC
>>> alarm.
>>> Note that this is an upstream regression as it can be reproduced on
>>> linus' tree.
>>> Yakui is investigating this issue.
>> We had some reports of the second suspend (S3) failure too, where the second
>> attempt to suspend to RAM (or to resume from it) failed after a successful
>> one. I wonder if that's related.
> We ran some suspend/resume tests on an Acer laptop (AMD CPU, VIA
> chipset, 64-bit kernel).
> The system reboots instead of resuming when the power button is pressed
> after the box enters the S3 state. But if S3 is done after S4, the system
> resumes fine when the power button is pressed. This issue can be
> reproduced on the upstream kernel.
>
> After further testing we can confirm that this is a regression. The
> 2.6.26 kernel works well on this box, but 2.6.27-rc1 fails.
>
> Using git-bisect, we confirmed that the commit
> 736f12bff9d9e7b4e895c64f73b190c8383fc2a1 is good.
> >commit 736f12bff9d9e7b4e895c64f73b190c8383fc2a1
> >Author: Glauber Costa <[email protected]>
> > Date: Tue May 27 20:14:51 2008 -0700
> >x86: don't use gdt_page openly.
>
> And the commit
> 55f262391a2365d657a00ed68edd1a51bca66af5 is bad.
> >commit 55f262391a2365d657a00ed68edd1a51bca66af5
> >Author: Yinghai Lu <[email protected]>
> >Date: Wed Jun 25 17:54:23 2008 -0700
> >x86: rename setup_32.c to setup.c
>
> The patches between the above two commits are x86-related. When
> bisecting between these two commits, we get either compile errors
> (for example, some files don't exist) or a kernel panic. So we can't
> continue using git-bisect to identify which commit caused the
> regression.

There's a known fix for the kernel panic. It's referenced at <http://bugzilla.kernel.org/show_bug.cgi?id=11237#c25>. That should help you bisect down to a smaller range. Hopefully you can rule out the commit that caused^Wexposed Bug #11237, which is really a nasty BIOS bug.

HTH
Alan

2008-10-16 14:17:56

by Rafael J. Wysocki

Subject: Re: Suspend/resume regression between 2.6.26 and 2.6.27-rc1

On Thursday, 16 of October 2008, Zhao Yakui wrote:
> On Fri, 2008-10-10 at 14:41 +0200, Rafael J. Wysocki wrote:
> > On Friday, 10 of October 2008, Zhang Rui wrote:
> > > Hi, len,
> > >
> > > this is the ACPI regression test result based on the latest ACPI test branch.
> > >
> > > 1. on Acer:(AMD CPU, VIA chipset. 64 bit kernel)
> > > When doing S3 test, after pressing the power button, the system
> > > reboots instead of resuming. But if S3 is done after S4, the system can
> > > resume very well after pressing power button or using the RTC
> > > alarm.
> > > Note that this is an upstream regression as it can be reproduced on
> > > linus' tree.
> > > Yakui is investigating this issue.
> >
> > We had some reports of the second suspend (S3) failure too, where the second
> > attempt to suspend to RAM (or to resume from it) failed after a successful
> > one. I wonder if that's related.
> We ran some suspend/resume tests on an Acer laptop (AMD CPU, VIA
> chipset, 64-bit kernel).
> The system reboots instead of resuming when the power button is pressed
> after the box enters the S3 state. But if S3 is done after S4, the system
> resumes fine when the power button is pressed. This issue can be
> reproduced on the upstream kernel.
>
> After further testing we can confirm that this is a regression. The
> 2.6.26 kernel works well on this box, but 2.6.27-rc1 fails.
>
> Using git-bisect, we confirmed that the commit
> 736f12bff9d9e7b4e895c64f73b190c8383fc2a1 is good.
> >commit 736f12bff9d9e7b4e895c64f73b190c8383fc2a1
> >Author: Glauber Costa <[email protected]>
> > Date: Tue May 27 20:14:51 2008 -0700
> >x86: don't use gdt_page openly.
>
> And the commit
> 55f262391a2365d657a00ed68edd1a51bca66af5 is bad.
> >commit 55f262391a2365d657a00ed68edd1a51bca66af5
> >Author: Yinghai Lu <[email protected]>
> >Date: Wed Jun 25 17:54:23 2008 -0700
> >x86: rename setup_32.c to setup.c
>
> The patches between the above two commits are x86-related. When
> bisecting between these two commits, we get either compile errors
> (for example, some files don't exist) or a kernel panic. So we can't
> continue using git-bisect to identify which commit caused the
> regression.

If you test SMP kernels on a non-SMP box, that may be
http://bugzilla.kernel.org/show_bug.cgi?id=11568. In which case, please test
the patch from http://bugzilla.kernel.org/show_bug.cgi?id=11568#c46 .

Thanks,
Rafael

2008-10-17 00:40:59

by Ming Lin

Subject: Re: Suspend/resume regression between 2.6.26 and 2.6.27-rc1

On Thu, Oct 16, 2008 at 5:39 PM, Alan Jenkins
<[email protected]> wrote:
> There's a known fix for the kernel panic. It's referenced at <http://bugzilla.kernel.org/show_bug.cgi?id=11237#c25>. That should help you bisect down to a smaller range. Hopefully you can rule out the commit that caused^Wexposed Bug #11237, which is really a nasty BIOS bug.

That is already the result with the patch you mentioned applied.
We cannot bisect any further because of compile errors caused by the i386/x86_64 merge.

Thanks,
Lin Ming

2008-10-17 01:22:23

by Zhao, Yakui

Subject: Re: Suspend/resume regression between 2.6.26 and 2.6.27-rc1

On Thu, 2008-10-16 at 16:21 +0200, Rafael J. Wysocki wrote:
> On Thursday, 16 of October 2008, Zhao Yakui wrote:
> > On Fri, 2008-10-10 at 14:41 +0200, Rafael J. Wysocki wrote:

> >
> > The patches between the above two commits are x86-related. When
> > bisecting between these two commits, we get either compile errors
> > (for example, some files don't exist) or a kernel panic. So we can't
> > continue using git-bisect to identify which commit caused the
> > regression.
>
> If you test SMP kernels on a non-SMP box, that may be
> http://bugzilla.kernel.org/show_bug.cgi?id=11568. In which case, please test
> the patch from http://bugzilla.kernel.org/show_bug.cgi?id=11568#c46 .
>
Thanks for the quick response. The SMP kernel was indeed tested on a
non-SMP box.
After applying the patch from
http://bugzilla.kernel.org/show_bug.cgi?id=11568#c46,
the box resumes fine on the latest kernel.

Thanks.

> Thanks,
> Rafael

2008-10-17 12:01:35

by Rafael J. Wysocki

Subject: Re: Suspend/resume regression between 2.6.26 and 2.6.27-rc1

On Friday, 17 of October 2008, Zhao Yakui wrote:
> On Thu, 2008-10-16 at 16:21 +0200, Rafael J. Wysocki wrote:
> > On Thursday, 16 of October 2008, Zhao Yakui wrote:
> > > On Fri, 2008-10-10 at 14:41 +0200, Rafael J. Wysocki wrote:
>
> > >
> > > The patches between the above two commits are x86-related. When
> > > bisecting between these two commits, we get either compile errors
> > > (for example, some files don't exist) or a kernel panic. So we can't
> > > continue using git-bisect to identify which commit caused the
> > > regression.
> >
> > If you test SMP kernels on a non-SMP box, that may be
> > http://bugzilla.kernel.org/show_bug.cgi?id=11568. In which case, please test
> > the patch from http://bugzilla.kernel.org/show_bug.cgi?id=11568#c46 .
> >
> Thanks for the quick response. The SMP kernel was indeed tested on a
> non-SMP box.
> After applying the patch from
> http://bugzilla.kernel.org/show_bug.cgi?id=11568#c46,
> the box resumes fine on the latest kernel.

Thanks for testing.

The patch has been posted for mainline inclusion:
http://marc.info/?l=linux-kernel&m=122419938208201&w=4

Thanks,
Rafael