2009-11-18 14:06:07

by Ferenc Wagner

Subject: Re: [linux-pm] intermittent suspend problem again

Ferenc Wagner <[email protected]> writes:

> Since I've instrumented s2disk and the hibernation path, no freeze
> happened during hibernating the machine.

Not until I removed the delays from hibernation_platform_enter(), which
were put there previously to get step-by-step feedback. Removing them
again resulted in a freeze in short course, maybe just two hibernations
later. The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
Does it mean that some device driver is at fault? I'll check if it
always fails at the same point (although tracing into dpm_suspend_start
isn't pure fun because of the multitude of devices it loops over). Is
there any way to get printk output from that phase?

Side question: If I run s2disk from the init=/bin/bash prompt, the
instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
fires before the "Snapshotting system" phase, but it does not fire if I
hibernate from the full running desktop. (That instrumentation was put
there to investigate the KMS-triggered STR freeze.) What could explain
this?
--
Thanks,
Feri.


2009-11-18 22:12:00

by Rafael J. Wysocki

Subject: Re: [linux-pm] intermittent suspend problem again

On Wednesday 18 November 2009, Ferenc Wagner wrote:
> Ferenc Wagner <[email protected]> writes:
>
> > Since I've instrumented s2disk and the hibernation path, no freeze
> > happened during hibernating the machine.
>
> Not until I removed the delays from hibernation_platform_enter(), which
> were put there previously to get step-by-step feedback. Removing them
> again resulted in a freeze in short course, maybe just two hibernations
> later. The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
> Does it mean that some device driver is at fault?

A driver or one of the platform hooks.

> I'll check if it always fails at the same point (although tracing into
> dpm_suspend_start isn't pure fun because of the multitude of devices it
> loops over). Is there any way to get printk output from that phase?

Compile with CONFIG_PM_VERBOSE (it does mean exactly that).

> Side question: If I run s2disk from the init=/bin/bash prompt, the
> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
> fires before the "Snapshotting system" phase, but it does not fire if I
> hibernate from the full running desktop. (That instrumentation was put
> there to investigate the KMS-triggered STR freeze.) What could explain
> this?

It looks like it uses the "shutdown" method when run with init=/bin/bash, but
I don't know why exactly.

Thanks,
Rafael

2009-11-18 22:54:24

by Ferenc Wagner

Subject: Re: [linux-pm] intermittent suspend problem again

"Rafael J. Wysocki" <[email protected]> writes:

> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>> Ferenc Wagner <[email protected]> writes:
>>
>> > Since I've instrumented s2disk and the hibernation path, no freeze
>> > happened during hibernating the machine.
>>
>> Not until I removed the delays from hibernation_platform_enter(), which
>> were put there previously to get step-by-step feedback. Removing them
>> again resulted in a freeze in short course, maybe just two hibernations
>> later. The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
>> Does it mean that some device driver is at fault?
>
> A driver or one of the platform hooks.
>
>> I'll check if it always fails at the same point (although tracing into
>> dpm_suspend_start isn't pure fun because of the multitude of devices it
>> loops over). Is there any way to get printk output from that phase?
>
> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).

I've been running with CONFIG_PM_VERBOSE=y for a good while, but that
didn't help get, for example, the output of the following printks onto
the VGA console (0x3bc is the parallel port):

@@ -445,34 +446,66 @@ int hibernation_platform_enter(void)
 	 * hibernation_ops->finish() before saving the image, so we should let
 	 * the firmware know that we're going to enter the sleep state after all
 	 */
+	printk ("hibernation_ops->begin()...\n");
+	outb(16, 0x3bc);
 	error = hibernation_ops->begin();
+	outb(17, 0x3bc);
+	printk ("hibernation_ops->begin(): %d\n", error);
 	if (error)
 		goto Close;

However, my dmesg is full of lines like

agpgart-intel 0000:00:00.0: preparing freeze
pci 0000:00:00.1: preparing freeze
pci 0000:00:00.3: preparing freeze

etc., I'll check if they are the same all the time. Anyway, the above
printk strings aren't present in dmesg after a successful resume even,
so I must be doing something wrong... The parport pins do change, though.
Maybe explicit levels would work better? I can't see any other
difference from the pm_dev_dbg macro producing the above lines.
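
On the question of explicit levels: a printk() call without a KERN_*
prefix is logged at the kernel's default message level, and it reaches
the console only if that level is numerically lower than the current
console loglevel. Both values can be read from /proc/sys/kernel/printk.
The following is a minimal shell sketch for inspecting them; the function
name show_printk_levels and its optional file argument are made up for
illustration and are not part of any tool mentioned in this thread.

```shell
# Print the kernel's printk thresholds in readable form.
# The file argument defaults to /proc/sys/kernel/printk; it is a parameter
# only so the function can be tried on a sample file (illustrative choice).
show_printk_levels() {
    file=${1:-/proc/sys/kernel/printk}
    # The file holds four integers: current console loglevel, default
    # message loglevel, minimum console loglevel, default console loglevel.
    read -r console default minimum boot_default < "$file" || return 1
    echo "console_loglevel=$console"
    echo "default_message_loglevel=$default"
    # A message is shown on the console only if its level is numerically
    # lower than the console loglevel.
    if [ "$default" -lt "$console" ]; then
        echo "unannotated printk: shown on console"
    else
        echo "unannotated printk: log buffer only"
    fi
}
```

Calling show_printk_levels with no argument reads the live values. This
only tells whether such messages are eligible for the console; it does not
explain why they would be missing from dmesg after resume.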

>> Side question: If I run s2disk from the init=/bin/bash prompt, the
>> instrumentation in acpi_enter_sleep_state_prep in drivers/acpi/acpica/hwsleep.c
>> fires before the "Snapshotting system" phase, but it does not fire if I
>> hibernate from the full running desktop. (That instrumentation was put
>> there to investigate the KMS-triggered STR freeze.) What could explain
>> this?
>
> It looks like it uses the "shutdown" method when run with init=/bin/bash, but
> I don't know why exactly.

Thanks for the tip, I'll check this too.
--
Regards,
Feri.

2009-11-28 19:01:28

by Ferenc Wagner

Subject: Re: [linux-pm] intermittent suspend problem again

"Rafael J. Wysocki" <[email protected]> writes:

> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>
>> Ferenc Wagner <[email protected]> writes:
>>
>>> Since I've instrumented s2disk and the hibernation path, no freeze
>>> happened during hibernating the machine.
>>
>> Not until I removed the delays from hibernation_platform_enter(), which
>> were put there previously to get step-by-step feedback. Removing them
>> again resulted in a freeze in short course, maybe just two hibernations
>> later. The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
>> Does it mean that some device driver is at fault?
>
> A driver or one of the platform hooks.
>
>> I'll check if it always fails at the same point (although tracing into
>> dpm_suspend_start isn't pure fun because of the multitude of devices it
>> loops over). Is there any way to get printk output from that phase?
>
> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).

The last message now was:

e100: 0000:02:08.0: hibernate, may wakeup

Looks like hibernating the e100 driver is unstable.
--
Regards,
Feri.

2009-11-29 00:29:04

by Rafael J. Wysocki

Subject: Re: [linux-pm] intermittent suspend problem again

On Saturday 28 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <[email protected]> writes:
>
> > On Wednesday 18 November 2009, Ferenc Wagner wrote:
> >
> >> Ferenc Wagner <[email protected]> writes:
> >>
> >>> Since I've instrumented s2disk and the hibernation path, no freeze
> >>> happened during hibernating the machine.
> >>
> >> Not until I removed the delays from hibernation_platform_enter(), which
> >> were put there previously to get step-by-step feedback. Removing them
> >> again resulted in a freeze in short course, maybe just two hibernations
> >> later. The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
> >> Does it mean that some device driver is at fault?
> >
> > A driver or one of the platform hooks.
> >
> >> I'll check if it always fails at the same point (although tracing into
> >> dpm_suspend_start isn't pure fun because of the multitude of devices it
> >> loops over). Is there any way to get printk output from that phase?
> >
> > Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
>
> The last message now was:
>
> e100: 0000:02:08.0: hibernate, may wakeup
>
> Looks like hibernating the e100 driver is unstable.

Can you verify that by trying to hibernate without the e100 driver?

Rafael

2009-11-29 10:12:20

by Ferenc Wagner

Subject: Re: [linux-pm] intermittent suspend problem again

"Rafael J. Wysocki" <[email protected]> writes:

> On Saturday 28 November 2009, Ferenc Wagner wrote:
>> "Rafael J. Wysocki" <[email protected]> writes:
>>
>>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
>>>
>>>> Ferenc Wagner <[email protected]> writes:
>>>>
>>>>> Since I've instrumented s2disk and the hibernation path, no freeze
>>>>> happened during hibernating the machine.
>>>>
>>>> Not until I removed the delays from hibernation_platform_enter(), which
>>>> were put there previously to get step-by-step feedback. Removing them
>>>> again resulted in a freeze in short course, maybe just two hibernations
>>>> later. The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
>>>> Does it mean that some device driver is at fault?
>>>
>>> A driver or one of the platform hooks.
>>>
>>>> I'll check if it always fails at the same point (although tracing into
>>>> dpm_suspend_start isn't pure fun because of the multitude of devices it
>>>> loops over). Is there any way to get printk output from that phase?
>>>
>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
>>
>> The last message now was:
>>
>> e100: 0000:02:08.0: hibernate, may wakeup
>>
>> Looks like hibernating the e100 driver is unstable.
>
> Can you verify that by trying to hibernate without the e100 driver?

Not really, as I still can't reliably reproduce the issue. Since I'm
running with suspend loglevel = 8, it's happened only twice (in a row),
with seemingly the exact same console output. Some earlier freezes also
happened in dpm_suspend_start, at least. However, I can certainly add
e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
with that. Or I can try stress-testing the module, but I'm not sure how.
Interestingly, git log v2.6.31.. -- e100.c is tiny, but 8fbd962e affects
the suspend/resume routines through e100_up. This could explain the
timing-sensitive nature of the issue. I took the liberty to change the
Cc list, maybe linux-netdev can lend us a hand.
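
For reference, a pm-utils drop-in of the kind described above could look
like the fragment below. It is only a sketch: the file name is arbitrary
and hypothetical, and the setting takes effect only when hibernation is
started through the pm-utils scripts (for example pm-hibernate), not when
s2disk is invoked directly.

```shell
# /etc/pm/config.d/unload-e100   (hypothetical file name, for illustration)
# pm-utils sources every file in /etc/pm/config.d as shell.
# Modules listed in SUSPEND_MODULES are unloaded before suspend/hibernate
# and loaded again after resume.
SUSPEND_MODULES="e100"
```

Several modules can be given as a space-separated list inside the quotes.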
--
Regards,
Feri.

2009-11-29 15:06:47

by Rafael J. Wysocki

Subject: Re: [linux-pm] intermittent suspend problem again

On Sunday 29 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <[email protected]> writes:
>
> > On Saturday 28 November 2009, Ferenc Wagner wrote:
> >> "Rafael J. Wysocki" <[email protected]> writes:
> >>
> >>> On Wednesday 18 November 2009, Ferenc Wagner wrote:
> >>>
> >>>> Ferenc Wagner <[email protected]> writes:
> >>>>
> >>>>> Since I've instrumented s2disk and the hibernation path, no freeze
> >>>>> happened during hibernating the machine.
> >>>>
> >>>> Not until I removed the delays from hibernation_platform_enter(), which
> >>>> were put there previously to get step-by-step feedback. Removing them
> >>>> again resulted in a freeze in short course, maybe just two hibernations
> >>>> later. The instrumentation shows it stuck in dpm_suspend_start(PMSG_HIBERNATE).
> >>>> Does it mean that some device driver is at fault?
> >>>
> >>> A driver or one of the platform hooks.
> >>>
> >>>> I'll check if it always fails at the same point (although tracing into
> >>>> dpm_suspend_start isn't pure fun because of the multitude of devices it
> >>>> loops over). Is there any way to get printk output from that phase?
> >>>
> >>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
> >>
> >> The last message now was:
> >>
> >> e100: 0000:02:08.0: hibernate, may wakeup
> >>
> >> Looks like hibernating the e100 driver is unstable.
> >
> > Can you verify that by trying to hibernate without the e100 driver?
>
> Not really, as I still can't reliably reproduce the issue. Since I'm
> running with suspend loglevel = 8, it's happened only twice (in a row),
> with seemingly exact same console output. Some earlier freezes also
> happened in dpm_suspend_start, at least. However, I can certainly add
> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
> with that.

That's what I'd do. In addition to that, you can run multiple
hibernation/resume cycles in a tight loop using the RTC wakealarm.
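
A loop of that kind could be sketched in shell roughly as follows. This is
an illustration under assumptions, not a tested recipe for this machine:
it assumes the RTC is exposed as /sys/class/rtc/rtc0, that the kernel
accepts the relative "+seconds" form when writing to wakealarm, and that
the platform can wake from hibernation on an RTC alarm. The function
names and the RTC_DIR and HIBERNATE_CMD variables are hypothetical.

```shell
# Arm the RTC alarm to fire SECS seconds from now.
# RTC_DIR is overridable only so the function can be tried on a scratch
# directory instead of the real sysfs node.
arm_wakealarm() {
    secs=$1
    rtc=${RTC_DIR:-/sys/class/rtc/rtc0}
    # Clear any pending alarm first, then set the new one.
    echo 0 > "$rtc/wakealarm" || return 1
    # The "+N" form is taken relative to the current RTC time.
    echo "+$secs" > "$rtc/wakealarm"
}

# Run CYCLES hibernate/resume cycles, waking SECS seconds after each
# alarm is armed. HIBERNATE_CMD defaults to s2disk.
hibernate_loop() {
    cycles=$1
    secs=$2
    cmd=${HIBERNATE_CMD:-s2disk}
    i=1
    while [ "$i" -le "$cycles" ]; do
        arm_wakealarm "$secs" || return 1
        echo "cycle $i of $cycles"
        # The command returns only after the machine has resumed.
        $cmd || return 1
        i=$((i + 1))
    done
}
```

Run as root, something like "hibernate_loop 100 120" would attempt 100
cycles. The delay has to be long enough for the image to be written and
the machine to power off before the alarm fires.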

Thanks,
Rafael

2009-12-01 10:30:32

by Ferenc Wagner

Subject: Re: [linux-pm] intermittent suspend problem again

"Rafael J. Wysocki" <[email protected]> writes:

> On Sunday 29 November 2009, Ferenc Wagner wrote:
>
>> "Rafael J. Wysocki" <[email protected]> writes:
>>
>>> On Saturday 28 November 2009, Ferenc Wagner wrote:
>>>
>>>> "Rafael J. Wysocki" <[email protected]> writes:
>>>>
>>>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
>>>>
>>>> The last message now was:
>>>>
>>>> e100: 0000:02:08.0: hibernate, may wakeup
>>>>
>>>> Looks like hibernating the e100 driver is unstable.
>>>
>>> Can you verify that by trying to hibernate without the e100 driver?
>>
>> Not really, as I still can't reliably reproduce the issue. Since I'm
>> running with suspend loglevel = 8, it's happened only twice (in a row),
>> with seemingly exact same console output. Some earlier freezes also
>> happened in dpm_suspend_start, at least. However, I can certainly add
>> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
>> with that.
>
> That's what I'd do.

That worked out mostly OK (no freeze in quite some hibernation cycles),
but I'm continuing testing it.

On the other hand, I reverted 8fbd962e3, recompiled and replaced the
module, and got the freeze during hibernation. And that was the bulk of
the changes since 2.6.31... I'll revert the rest and test again, but
that seems purely cosmetic, so no high hopes.

> In addition to that, you can run multiple hibernation/resume cycles in
> a tight loop using the RTC wakealarm.

I'll do so, as soon as I find a way to automatically supply the dm-crypt
passphrase... or even better, learn to hibernate to ramdisk from the
initramfs. :)
--
Cheers,
Feri.

2009-12-01 12:27:54

by Rafael J. Wysocki

Subject: Re: [linux-pm] intermittent suspend problem again

On Tuesday 01 December 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <[email protected]> writes:
>
> > On Sunday 29 November 2009, Ferenc Wagner wrote:
> >
> >> "Rafael J. Wysocki" <[email protected]> writes:
> >>
> >>> On Saturday 28 November 2009, Ferenc Wagner wrote:
> >>>
> >>>> "Rafael J. Wysocki" <[email protected]> writes:
> >>>>
> >>>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
> >>>>
> >>>> The last message now was:
> >>>>
> >>>> e100: 0000:02:08.0: hibernate, may wakeup
> >>>>
> >>>> Looks like hibernating the e100 driver is unstable.
> >>>
> >>> Can you verify that by trying to hibernate without the e100 driver?
> >>
> >> Not really, as I still can't reliably reproduce the issue. Since I'm
> >> running with suspend loglevel = 8, it's happened only twice (in a row),
> >> with seemingly exact same console output. Some earlier freezes also
> >> happened in dpm_suspend_start, at least. However, I can certainly add
> >> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
> >> with that.
> >
> > That's what I'd do.
>
> That worked out mostly OK (no freeze in quite some hibernation cycles),
> but I'm continuing testing it.

Great, please let me know how it works out.

> On the other hand, I reverted 8fbd962e3, recompiled and replaced the
> module, and got the freeze during hibernation. And that was the bulk of
> the changes since 2.6.31... I'll revert the rest and test again, but
> that seems purely cosmetic, so no high hopes.
>
> > In addition to that, you can run multiple hibernation/resume cycles in
> > a tight loop using the RTC wakealarm.
>
> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> passphrase... or even better, learn to hibernate to ramdisk from the
> initramfs. :)

Well, you don't need to use swap encryption for _testing_. :-)

Thanks,
Rafael

2009-12-01 17:47:09

by Ferenc Wagner

Subject: Re: [linux-pm] intermittent suspend problem again

"Rafael J. Wysocki" <[email protected]> writes:

> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>> "Rafael J. Wysocki" <[email protected]> writes:
>>
>>> On Sunday 29 November 2009, Ferenc Wagner wrote:
>>>
>>>> "Rafael J. Wysocki" <[email protected]> writes:
>>>>
>>>>> On Saturday 28 November 2009, Ferenc Wagner wrote:
>>>>>
>>>>>> "Rafael J. Wysocki" <[email protected]> writes:
>>>>>>
>>>>>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
>>>>>>
>>>>>> The last message now was:
>>>>>>
>>>>>> e100: 0000:02:08.0: hibernate, may wakeup
>>>>>>
>>>>>> Looks like hibernating the e100 driver is unstable.
>>>>>
>>>>> Can you verify that by trying to hibernate without the e100 driver?
>>>>
>>>> Not really, as I still can't reliably reproduce the issue. Since I'm
>>>> running with suspend loglevel = 8, it's happened only twice (in a row),
>>>> with seemingly exact same console output. Some earlier freezes also
>>>> happened in dpm_suspend_start, at least. However, I can certainly add
>>>> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
>>>> with that.
>>>
>>> That's what I'd do.
>>
>> That worked out mostly OK (no freeze in quite some hibernation cycles),
>> but I'm continuing testing it.
>
> Great, please let me know how it works out.

Will do. On the negative side, this tends to confuse NetworkManager.

>> On the other hand, I reverted 8fbd962e3, recompiled and replaced the
>> module, and got the freeze during hibernation. And that was the bulk of
>> the changes since 2.6.31... I'll revert the rest and test again, but
>> that seems purely cosmetic, so no high hopes.
>>
>>> In addition to that, you can run multiple hibernation/resume cycles in
>>> a tight loop using the RTC wakealarm.
>>
>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
>> passphrase... or even better, learn to hibernate to ramdisk from the
>> initramfs. :)
>
> Well, you don't need to use swap encryption for _testing_. :-)

I use partition encryption, everything except for /boot is encrypted.
Apropos: does s2disk perform encryption with a temporary key even if I
don't supply an RSA key, to protect mlocked application data from being
present in the swap after restore?
--
Thanks,
Feri.

2009-12-01 21:32:25

by Rafael J. Wysocki

Subject: Re: [linux-pm] intermittent suspend problem again

On Tuesday 01 December 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <[email protected]> writes:
>
> > On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >> "Rafael J. Wysocki" <[email protected]> writes:
> >>
> >>> On Sunday 29 November 2009, Ferenc Wagner wrote:
> >>>
> >>>> "Rafael J. Wysocki" <[email protected]> writes:
> >>>>
> >>>>> On Saturday 28 November 2009, Ferenc Wagner wrote:
> >>>>>
> >>>>>> "Rafael J. Wysocki" <[email protected]> writes:
> >>>>>>
> >>>>>>> Compile with CONFIG_PM_VERBOSE (it does mean exactly that).
> >>>>>>
> >>>>>> The last message now was:
> >>>>>>
> >>>>>> e100: 0000:02:08.0: hibernate, may wakeup
> >>>>>>
> >>>>>> Looks like hibernating the e100 driver is unstable.
> >>>>>
> >>>>> Can you verify that by trying to hibernate without the e100 driver?
> >>>>
> >>>> Not really, as I still can't reliably reproduce the issue. Since I'm
> >>>> running with suspend loglevel = 8, it's happened only twice (in a row),
> >>>> with seemingly exact same console output. Some earlier freezes also
> >>>> happened in dpm_suspend_start, at least. However, I can certainly add
> >>>> e100 to SUSPEND_MODULES under /etc/pm/config.d, and continue running
> >>>> with that.
> >>>
> >>> That's what I'd do.
> >>
> >> That worked out mostly OK (no freeze in quite some hibernation cycles),
> >> but I'm continuing testing it.
> >
> > Great, please let me know how it works out.
>
> Will do. On the negative side, this tends to confuse NetworkManager.
>
> >> On the other hand, I reverted 8fbd962e3, recompiled and replaced the
> >> module, and got the freeze during hibernation. And that was the bulk of
> >> the changes since 2.6.31... I'll revert the rest and test again, but
> >> that seems purely cosmetic, so no high hopes.
> >>
> >>> In addition to that, you can run multiple hibernation/resume cycles in
> >>> a tight loop using the RTC wakealarm.
> >>
> >> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> >> passphrase... or even better, learn to hibernate to ramdisk from the
> >> initramfs. :)
> >
> > Well, you don't need to use swap encryption for _testing_. :-)
>
> I use partition encryption, everything except for /boot is encrypted.

If /boot is big enough, you could use a swap file in /boot for the testing.
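
Setting up such a swap file might look like the sketch below. The path
and size are placeholders, the function name and the RUN variable are
hypothetical, and s2disk would additionally have to be told where to find
the image (resume device and the swap file's offset), which is not shown
here.

```shell
# Create and enable a swap file for hibernation testing.
# Set RUN=echo to print the commands instead of executing them.
make_test_swapfile() {
    path=$1
    size_mb=$2
    run=${RUN:-}
    # Allocate the file with real blocks; swap files must not be sparse.
    $run dd if=/dev/zero of="$path" bs=1M count="$size_mb" || return 1
    # Keep the file readable by root only.
    $run chmod 600 "$path" || return 1
    # Write the swap signature, then enable the file as swap.
    $run mkswap "$path" || return 1
    $run swapon "$path"
}
```

For example, "RUN=echo; make_test_swapfile /boot/swapfile 512" lists the
four commands without touching the disk. The file has to be at least as
large as the hibernation image.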

> Apropos: does s2disk perform encryption with a temporary key even if I
> don't supply an RSA key, to protect mlocked application data from being
> present in the swap after restore?

It can do that, but you need to provide a key during suspend and resume.

Otherwise it doesn't use a random key, because it would have to store it in
the clear in the image header.

Thanks,
Rafael

2009-12-02 01:58:47

by Ferenc Wagner

Subject: Re: [linux-pm] intermittent suspend problem again

"Rafael J. Wysocki" <[email protected]> writes:

> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>
>> "Rafael J. Wysocki" <[email protected]> writes:
>>
>>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>>>
>>>> "Rafael J. Wysocki" <[email protected]> writes:
>>>>
>>>>> In addition to that, you can run multiple hibernation/resume cycles in
>>>>> a tight loop using the RTC wakealarm.
>>>>
>>>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
>>>> passphrase... or even better, learn to hibernate to ramdisk from the
>>>> initramfs. :)
>>>
>>> Well, you don't need to use swap encryption for _testing_. :-)
>>
>> I use partition encryption, everything except for /boot is encrypted.
>
> If /boot is big enough, you could use a swap file in /boot for the testing.

Ramdisk worked good. Maybe too good, because I left the machine doing
s2disks while I was having dinner, and it achieved some 120 suspends
without a freeze. Only the e100 and the mii modules were loaded.

After some script munging I got the machine to boot automatically with an
alternate passphrase, so in vivo testing is possible now. I mean,
tomorrow.

Btw. s2disk has a strange effect of simulating enters during suspend.
It looks like this in a terminal:

$ sudo s2disk



















$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$
$ <cursor is here>

Can you also see this?

>> Apropos: does s2disk perform encryption with a temporary key even if I
>> don't supply an RSA key, to protect mlocked application data from being
>> present in the swap after restore?
>
> It can do that, but you need to provide a key during suspend and resume.
>
> Otherwise it doesn't use a random key, because it would have to store it in
> the clear in the image header.

So you don't feel like the "What is this 'Encrypt suspend image' for?"
Q&A in Documentation/swsusp.txt describes a real threat, do you? If
an "application" has direct access to swap, then it's game over anyway.
--
Thanks,
Feri.

2009-12-02 10:55:18

by Ferenc Wagner

Subject: Re: [linux-pm] intermittent suspend problem again

Ferenc Wagner <[email protected]> writes:

> "Rafael J. Wysocki" <[email protected]> writes:
>
>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>>
>>> "Rafael J. Wysocki" <[email protected]> writes:
>>>
>>>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
>>>>
>>>>> "Rafael J. Wysocki" <[email protected]> writes:
>>>>>
>>>>>> In addition to that, you can run multiple hibernation/resume cycles in
>>>>>> a tight loop using the RTC wakealarm.
>>>>>
>>>>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
>>>>> passphrase... or even better, learn to hibernate to ramdisk from the
>>>>> initramfs. :)
>>>>
>>>> Well, you don't need to use swap encryption for _testing_. :-)
>>>
>>> I use partition encryption, everything except for /boot is encrypted.
>>
>> If /boot is big enough, you could use a swap file in /boot for the testing.
>
> Ramdisk worked good. Maybe too good, because I left the machine doing
> s2disks while I was having dinner, and it achieved some 120 suspends
> without a freeze. Only the e100 and the mii modules were loaded.
>
> After some script munging I got the machine automatically boot with an
> alternate passphrase, so in vivo testing is possible now. I mean,
> tomorrow.

After almost 100 hibernate/resume cycles, I have to say that this issue
can't be reproduced by suspending in a tight loop. I tried that while
flood pinging my gateway and also with no network activity. The rc8
e100 module was loaded all the time.

> Btw. s2disk has a strange effect of simulating enters during suspend.
> [...]
> Can you also see this?

It can't be seen from the "sleep 1; s2disk" command, so it's probably an
artifact from X, when s2disk starts before Enter is released.
--
Regards,
Feri.

2009-12-02 12:27:22

by Stefan Seyfried

Subject: Re: [linux-pm] intermittent suspend problem again

On Wed, 02 Dec 2009 02:58:41 +0100
Ferenc Wagner <[email protected]> wrote:

> Btw. s2disk has a strange effect of simulating enters during suspend.
> It looks like this in a terminal:
>
> $ sudo s2disk

...

>
>
>
>
>
>
> $
> $
> $

...

> $
> $
> $
> $
> $ <cursor is here>
>
> Can you also see this?

That's an old "bug" in X (but fixed for me since quite some time) IIUC:
you type s2disk <enter>, s2disk switches to console 1, while enter is
still pressed.
=> X does not know that enter got "released" and will only notice after
resume finished. The X internal key autorepeat does the rest.

Does not happen for me since at least one year, I'd guess, but I am
always running the latest and greatest bleeding edge of everything ;)

HTH,

seife
--
Stefan Seyfried

"Any ideas, John?"
"Well, surrounding them's out."

2009-12-02 21:32:51

by Rafael J. Wysocki

Subject: Re: [linux-pm] intermittent suspend problem again

On Wednesday 02 December 2009, Ferenc Wagner wrote:
> Ferenc Wagner <[email protected]> writes:
>
> > "Rafael J. Wysocki" <[email protected]> writes:
> >
> >> On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >>
> >>> "Rafael J. Wysocki" <[email protected]> writes:
> >>>
> >>>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >>>>
> >>>>> "Rafael J. Wysocki" <[email protected]> writes:
> >>>>>
> >>>>>> In addition to that, you can run multiple hibernation/resume cycles in
> >>>>>> a tight loop using the RTC wakealarm.
> >>>>>
> >>>>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> >>>>> passphrase... or even better, learn to hibernate to ramdisk from the
> >>>>> initramfs. :)
> >>>>
> >>>> Well, you don't need to use swap encryption for _testing_. :-)
> >>>
> >>> I use partition encryption, everything except for /boot is encrypted.
> >>
> >> If /boot is big enough, you could use a swap file in /boot for the testing.
> >
> > Ramdisk worked good. Maybe too good, because I left the machine doing
> > s2disks while I was having dinner, and it achieved some 120 suspends
> > without a freeze. Only the e100 and the mii modules were loaded.
> >
> > After some script munging I got the machine automatically boot with an
> > alternate passphrase, so in vivo testing is possible now. I mean,
> > tomorrow.
>
> After almost 100 hibernate/resume cycles, I have to say that this issue
> can't be reproduced by suspending in a tight loop. I tried that while
> flood pinging my gateway and also with no network activity. The rc8
> e100 module was loaded all the time.

Then I guess we do something that confuses your machine's BIOS or it's a
timing-related issue (ie. there has to be a substantial delay between the
hibernation and restore to trigger the problem). Or both.

Thanks,
Rafael

2009-12-02 21:48:41

by Mikael Abrahamsson

Subject: Re: [linux-pm] intermittent suspend problem again

On Wed, 2 Dec 2009, Rafael J. Wysocki wrote:

> Then I guess we do something that confuses your machine's BIOS or it's a
> timing-related issue (ie. there has to be a substantial delay between
> the hibernation and restore to trigger the problem). Or both.

I don't know if it's related, but on 2.6.31.something (ubuntu 9.10 stock
kernel) I have intermittent suspend/resume problems on my Thinkpad X200,
and I get the problems much more frequent when running on batteries than
when I'm running with the power cable plugged in. Just saying that this
might be something to test as well.

Link to "my" bug on launchpad:

<https://bugs.launchpad.net/ubuntu/+source/linux/+bug/473876>

--
Mikael Abrahamsson email: [email protected]

2009-12-02 21:48:24

by Rafael J. Wysocki

Subject: Re: [linux-pm] intermittent suspend problem again

On Wednesday 02 December 2009, Ferenc Wagner wrote:
> Ferenc Wagner <[email protected]> writes:
>
> > "Rafael J. Wysocki" <[email protected]> writes:
> >
> >> On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >>
> >>> "Rafael J. Wysocki" <[email protected]> writes:
> >>>
> >>>> On Tuesday 01 December 2009, Ferenc Wagner wrote:
> >>>>
> >>>>> "Rafael J. Wysocki" <[email protected]> writes:
> >>>>>
> >>>>>> In addition to that, you can run multiple hibernation/resume cycles in
> >>>>>> a tight loop using the RTC wakealarm.
> >>>>>
> >>>>> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> >>>>> passphrase... or even better, learn to hibernate to ramdisk from the
> >>>>> initramfs. :)
> >>>>
> >>>> Well, you don't need to use swap encryption for _testing_. :-)
> >>>
> >>> I use partition encryption, everything except for /boot is encrypted.
> >>
> >> If /boot is big enough, you could use a swap file in /boot for the testing.
> >
> > Ramdisk worked good. Maybe too good, because I left the machine doing
> > s2disks while I was having dinner, and it achieved some 120 suspends
> > without a freeze. Only the e100 and the mii modules were loaded.
> >
> > After some script munging I got the machine automatically boot with an
> > alternate passphrase, so in vivo testing is possible now. I mean,
> > tomorrow.
>
> After almost 100 hibernate/resume cycles, I have to say that this issue
> can't be reproduced by suspending in a tight loop. I tried that while
> flood pinging my gateway and also with no network activity. The rc8
> e100 module was loaded all the time.

I wonder if this patch:

http://patchwork.kernel.org/patch/64276/

helps in your case.

Thanks,
Rafael

2009-12-13 01:41:56

by Pavel Machek

Subject: s2disk encryption was Re: [linux-pm] intermittent suspend problem again

Hi!

> > >> On the other hand, I reverted 8fbd962e3, recompiled and replaced the
> > >> module, and got the freeze during hibernation. And that was the bulk of
> > >> the changes since 2.6.31... I'll revert the rest and test again, but
> > >> that seems purely cosmetic, so no high hopes.
> > >>
> > >>> In addition to that, you can run multiple hibernation/resume cycles in
> > >>> a tight loop using the RTC wakealarm.
> > >>
> > >> I'll do so, as soon as I find a way to automatically supply the dm-crypt
> > >> passphrase... or even better, learn to hibernate to ramdisk from the
> > >> initramfs. :)
> > >
> > > Well, you don't need to use swap encryption for _testing_. :-)
> >
> > I use partition encryption, everything except for /boot is encrypted.
>
> If /boot is big enough, you could use a swap file in /boot for the testing.
>
> > Apropos: does s2disk perform encryption with a temporary key even if I
> > don't supply an RSA key, to protect mlocked application data from being
> > present in the swap after restore?
>
> It can do that, but you need to provide a key during suspend and resume.
>
> Otherwise it doesn't use a random key, because it would have to store it in
> the clear in the image header.

I believe it can use random key, stored in clear in image
header. Reason is... image header is easier to overwrite than removing
whole image.

That was original motivation for encryption... not having to overwrite
swap data with zeros.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2009-12-13 01:41:55

by Pavel Machek

Subject: Re: [linux-pm] intermittent suspend problem again

Hi!

> > Btw. s2disk has a strange effect of simulating enters during suspend.
> > [...]
> > Can you also see this?
>
> It can't be seen from the "sleep 1; s2disk" command, so it's probably an
> artifact from X, when s2disk starts before Enter is released.

Yes, I see that, too, and believe it is X artifact.

(X keyboard handling is really broken: it does its own
autorepeat. That does not really work when big latencies are around.)
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html