LinuxLists.cc - Re: [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

2011-02-07 09:35:50

Subject: Re: [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Monday, February 07, 2011, SUZUKI, Kazuhiro wrote:
> Hi,

Hi,

> The following patch series fixes hangup after creating checkpoint on
> Xen. The Linux Xen guest can be saved the state to restore later, and
> also created snapshot like checkpoint via the hypervisor.
> But, when the snapshot is created for the PV guest, it will hangup.
>
> We added 'PMSG_CANCEL' message and 'cancel' handler in dev_pm_ops
> struct in the pm-linux part.

Please don't do that, unless you can convince me there's no other way to fix
the problem you're trying to address.

In my opinion it's highly unrealistic to assume that device drivers
(or even subsystems) will implement the ->cancel() callback just for the
benefit of Xen.

And if the only subsystem that needs to implement ->cancel() is Xen, then the
issue should be addressed without modifying the device core code, in a different
way.

> In creating checkpoint mode, the resume handler of xenbus should not
> be called. In this case, it is recognized that the suspend was canceled
> in drivers/xen/manage.c and call dpm_resume_end() with PMSG_CANCEL.
> If the 'cancel' handler is defined, it is called instead of resume().
>
> [1/2] - Fix hangup after creating checkpoint on Xen -- pm-linux part.
> [2/2] - Fix hangup after creating checkpoint on Xen -- Xen part.

Thanks,
Rafael

2011-02-08 11:23:13

by Ian Campbell

[permalink] [raw]

Subject: Re: [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Mon, 2011-02-07 at 10:35 +0100, Rafael J. Wysocki wrote:
> On Monday, February 07, 2011, SUZUKI, Kazuhiro wrote:
> > Hi,
>
> Hi,
>
> > The following patch series fixes hangup after creating checkpoint on
> > Xen. The Linux Xen guest can be saved the state to restore later, and
> > also created snapshot like checkpoint via the hypervisor.
> > But, when the snapshot is created for the PV guest, it will hangup.
> >
> > We added 'PMSG_CANCEL' message and 'cancel' handler in dev_pm_ops
> > struct in the pm-linux part.
>
> Please don't do that, unless you can convince me there's no other way to fix
> the problem you're trying to address.

Sorry, it was my advise to Kazuhiro that solving the underlying issue by
extending the core would be preferable to making Xen specific hacks.

The problem is that currently we have:

dpm_suspend_start(PMSG_SUSPEND);

dpm_suspend_noirq(PMSG_SUSPEND);

sysdev_suspend(PMSG_SUSPEND);
/* suspend hypercall */
sysdev_resume();

dpm_resume_noirq(PMSG_RESUME);

dpm_resume_end(PMSG_RESUME);

However the suspend hypercall can return a value indicating that the
suspend didn't actually happen (e.g. was cancelled). This is used e.g.
when checkpointing guests, because in that case you want the original
guest to continue. When the suspend didn't happen the drivers need to
recover differently from if it did.

The originally proposed solution was to only call dpm_resume_end if the
suspend was not cancelled. My concern with this was that unbalancing the
dpm_suspend_* and dpm_resume_* did not seem like a correct interaction
with the core. For example dpm_suspend_* adds stuff to
dpm_suspended_list and if dpm_resume_* is not called presumably this all
gets out of sync for next time. Hence my suggestion to add a cancel
message type.

> In my opinion it's highly unrealistic to assume that device drivers
> (or even subsystems) will implement the ->cancel() callback just for the
> benefit of Xen.

I did vague wonder if a similar message might be of interest to e.g.
cancelling hibernations or similar.

> And if the only subsystem that needs to implement ->cancel() is Xen, then the
> issue should be addressed without modifying the device core code, in a different
> way.

I thought it would be preferable to make use of/extend core
functionality where possible but if that's not the case we can find
another way.

Do you have any suggestions for how to correctly interact with the core
functions?

Is adding a suspend_cancel operation to just at the struct xenbus_driver
level and introducing a xen specific function to walk to the bus the
sort of thing you were thinking of? (it seems reasonable). Should we be
doing anything with dpm_*_list in that case?

(FWIW original thread is on xen-devel at
http://thread.gmane.org/gmane.comp.emulators.xen.devel/95265/)

Ian.

--
Ian Campbell
Current Noise: Crowbar - Dead Sun

Why on earth do people buy old bottles of wine when they can get a
fresh one for a quarter of the price?

2011-02-08 16:46:38

by Alan Stern

[permalink] [raw]

Subject: Re: [linux-pm] [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Tue, 8 Feb 2011, Ian Campbell wrote:

> The problem is that currently we have:
>
> dpm_suspend_start(PMSG_SUSPEND);
>
> dpm_suspend_noirq(PMSG_SUSPEND);
>
> sysdev_suspend(PMSG_SUSPEND);
> /* suspend hypercall */
> sysdev_resume();
>
> dpm_resume_noirq(PMSG_RESUME);
>
> dpm_resume_end(PMSG_RESUME);
>
> However the suspend hypercall can return a value indicating that the
> suspend didn't actually happen (e.g. was cancelled). This is used e.g.
> when checkpointing guests, because in that case you want the original
> guest to continue. When the suspend didn't happen the drivers need to
> recover differently from if it did.

That is odd, and it is quite different from the intended design of the
PM core. Drivers are supposed to put their devices into a known
suspended state; then afterwards they put the devices back into an
operational state. What happens while the devices are in the suspended
state isn't supposed to matter -- the system transition can fail, but
devices get treated exactly the same way as if it succeeded.

Why do your drivers need to recover differently based on the success of
the hypercall?

Alan Stern

2011-02-08 17:36:09

by Ian Campbell

[permalink] [raw]

Subject: Re: [linux-pm] [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Tue, 2011-02-08 at 11:46 -0500, Alan Stern wrote:
> On Tue, 8 Feb 2011, Ian Campbell wrote:
>
> > The problem is that currently we have:
> >
> > dpm_suspend_start(PMSG_SUSPEND);
> >
> > dpm_suspend_noirq(PMSG_SUSPEND);
> >
> > sysdev_suspend(PMSG_SUSPEND);
> > /* suspend hypercall */
> > sysdev_resume();
> >
> > dpm_resume_noirq(PMSG_RESUME);
> >
> > dpm_resume_end(PMSG_RESUME);
> >
> > However the suspend hypercall can return a value indicating that the
> > suspend didn't actually happen (e.g. was cancelled). This is used e.g.
> > when checkpointing guests, because in that case you want the original
> > guest to continue. When the suspend didn't happen the drivers need to
> > recover differently from if it did.
>
> That is odd, and it is quite different from the intended design of the
> PM core. Drivers are supposed to put their devices into a known
> suspended state; then afterwards they put the devices back into an
> operational state. What happens while the devices are in the suspended
> state isn't supposed to matter -- the system transition can fail, but
> devices get treated exactly the same way as if it succeeded.
>
> Why do your drivers need to recover differently based on the success of
> the hypercall?

checkpointing isn't really my area but AIUI you don't want to do a full
device teardown and reconnect like you would with a proper suspend
because of the time that takes which prevents you from doing continuous
rolling checkpoints at granularity which people want to implement
various disaster recovery schemes.

Hopefully one of the Xen checkpointing folks will chime in and explain
why this is not possible to achieve at the individual driver level (or,
even better, with a patch which does it that way ;-)).

Ian.

--
Ian Campbell
Current Noise: Buckcherry - King Of Kings

Beauty? What's that?
-- Larry Wall in <[email protected]>

2011-02-09 23:23:14

by Brendan Cully

[permalink] [raw]

Subject: Re: [linux-pm] [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Tuesday, 08 February 2011 at 17:35, Ian Campbell wrote:
> On Tue, 2011-02-08 at 11:46 -0500, Alan Stern wrote:
> > On Tue, 8 Feb 2011, Ian Campbell wrote:
> >
> > > The problem is that currently we have:
> > >
> > > dpm_suspend_start(PMSG_SUSPEND);
> > >
> > > dpm_suspend_noirq(PMSG_SUSPEND);
> > >
> > > sysdev_suspend(PMSG_SUSPEND);
> > > /* suspend hypercall */
> > > sysdev_resume();
> > >
> > > dpm_resume_noirq(PMSG_RESUME);
> > >
> > > dpm_resume_end(PMSG_RESUME);
> > >
> > > However the suspend hypercall can return a value indicating that the
> > > suspend didn't actually happen (e.g. was cancelled). This is used e.g.
> > > when checkpointing guests, because in that case you want the original
> > > guest to continue. When the suspend didn't happen the drivers need to
> > > recover differently from if it did.
> >
> > That is odd, and it is quite different from the intended design of the
> > PM core. Drivers are supposed to put their devices into a known
> > suspended state; then afterwards they put the devices back into an
> > operational state. What happens while the devices are in the suspended
> > state isn't supposed to matter -- the system transition can fail, but
> > devices get treated exactly the same way as if it succeeded.
> >
> > Why do your drivers need to recover differently based on the success of
> > the hypercall?
>
> checkpointing isn't really my area but AIUI you don't want to do a full
> device teardown and reconnect like you would with a proper suspend
> because of the time that takes which prevents you from doing continuous
> rolling checkpoints at granularity which people want to implement
> various disaster recovery schemes.
>
> Hopefully one of the Xen checkpointing folks will chime in and explain
> why this is not possible to achieve at the individual driver level (or,
> even better, with a patch which does it that way ;-)).

As Ian says, Xen has suspend_cancel because while the normal full
suspend/resume path works, it is much slower, and the work done during
resume is redundant. I don't remember the numbers offhand, but when we
added suspend_cancel I think we could do full suspend/resume in under
100us, which was maybe a couple of orders of magnitude faster than
full resume (largely due to slow xenstore handshaking on resume,
IIRC). It made a big difference for our Remus HA project, which was
checkpointing tens of times per second.

I'd like to keep the fast resume option, and expect that it can be
contained entirely in Xen-specific code. I'll try to get someone to
look into it here.

I think fast resume is somewhat orthogonal to the problem of hanging
on resume, which just sounds like a xen-specific bug in the slow
path.

2011-02-09 23:42:55

by Alan Stern

[permalink] [raw]

Subject: Re: [linux-pm] [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Wed, 9 Feb 2011, Brendan Cully wrote:

> > > That is odd, and it is quite different from the intended design of the
> > > PM core. Drivers are supposed to put their devices into a known
> > > suspended state; then afterwards they put the devices back into an
> > > operational state. What happens while the devices are in the suspended
> > > state isn't supposed to matter -- the system transition can fail, but
> > > devices get treated exactly the same way as if it succeeded.
> > >
> > > Why do your drivers need to recover differently based on the success of
> > > the hypercall?
> >
> > checkpointing isn't really my area but AIUI you don't want to do a full
> > device teardown and reconnect like you would with a proper suspend
> > because of the time that takes which prevents you from doing continuous
> > rolling checkpoints at granularity which people want to implement
> > various disaster recovery schemes.
> >
> > Hopefully one of the Xen checkpointing folks will chime in and explain
> > why this is not possible to achieve at the individual driver level (or,
> > even better, with a patch which does it that way ;-)).
>
> As Ian says, Xen has suspend_cancel because while the normal full
> suspend/resume path works, it is much slower, and the work done during
> resume is redundant. I don't remember the numbers offhand, but when we
> added suspend_cancel I think we could do full suspend/resume in under
> 100us, which was maybe a couple of orders of magnitude faster than
> full resume (largely due to slow xenstore handshaking on resume,
> IIRC). It made a big difference for our Remus HA project, which was
> checkpointing tens of times per second.
>
> I'd like to keep the fast resume option, and expect that it can be
> contained entirely in Xen-specific code. I'll try to get someone to
> look into it here.
>
> I think fast resume is somewhat orthogonal to the problem of hanging
> on resume, which just sounds like a xen-specific bug in the slow
> path.

In fact there already is a "fast suspend & resume" path in the PM core.
It's the freeze/thaw procedure used when starting to hibernate. The
documentation specifically says that drivers' freeze methods are
supposed to quiesce their devices but not change power levels. In
addition, the thaw method is invoked as part of recovery from a failed
hibernation attempt, so it already has the "cancel" semantics that xen
seems to want.

Alan Stern

2011-02-10 11:31:35

by Ian Campbell

[permalink] [raw]

Subject: Re: [Xen-devel] Re: [linux-pm] [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Wed, 2011-02-09 at 23:16 +0000, Brendan Cully wrote:
> I'd like to keep the fast resume option, and expect that it can be
> contained entirely in Xen-specific code. I'll try to get someone to
> look into it here.

Thanks, please put them in contact with Kazuhiro who has already been
looking into this.

> I think fast resume is somewhat orthogonal to the problem of hanging
> on resume, which just sounds like a xen-specific bug in the slow
> path.

The bug on the Xen side is taking the slow path when the suspend was
actually cancelled, which is what the original patch tries to fix.

Or should/could the slow path be able cope either way?

Ian.

--
Ian Campbell
Current Noise: Neil Young - Love And War

For every complex problem, there is a solution that is simple, neat, and wrong.
-- H. L. Mencken

2011-02-10 11:40:19

by Ian Campbell

[permalink] [raw]

Subject: Re: [Xen-devel] Re: [linux-pm] [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Wed, 2011-02-09 at 23:42 +0000, Alan Stern wrote:
> In fact there already is a "fast suspend & resume" path in the PM core.
> It's the freeze/thaw procedure used when starting to hibernate. The
> documentation specifically says that drivers' freeze methods are
> supposed to quiesce their devices but not change power levels. In
> addition, the thaw method is invoked as part of recovery from a failed
> hibernation attempt, so it already has the "cancel" semantics that xen
> seems to want.

Sounds like that would work and I would much prefer to simply make
correct use of the core functionality.

So PMSG_FREEZE is balanced by either PMSG_RECOVER or PMSG_THAW depending
on whether the suspend was cancelled or not? So the sequence of events
is something like:
dpm_suspend_start(PMSG_FREEZE);

dpm_suspend_noirq(PMSG_FREEZE);

sysdev_suspend(PMSG_QUIESCE);
cancelled = suspend_hypercall()
sysdev_resume();

dpm_resume_noirq(cancelled ? PMSG_RECOVER : PMSG_THAW);

dpm_resume_end(cancelled ? PMSG_RECOVER : PMSG_THAW);
?

(For comparison we currently have:
> > > dpm_suspend_start(PMSG_SUSPEND);
> > >
> > > dpm_suspend_noirq(PMSG_SUSPEND);
> > >
> > > sysdev_suspend(PMSG_SUSPEND);
> > > /* suspend hypercall */
> > > sysdev_resume();
> > >
> > > dpm_resume_noirq(PMSG_RESUME);
> > >
> > > dpm_resume_end(PMSG_RESUME);
)

Ian.
--
Ian Campbell
Current Noise: Neil Young - Angry World

If only one could get that wonderful feeling of accomplishment without
having to accomplish anything.

2011-02-10 16:00:56

by Alan Stern

[permalink] [raw]

Subject: Re: [Xen-devel] Re: [linux-pm] [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Thu, 10 Feb 2011, Ian Campbell wrote:

> On Wed, 2011-02-09 at 23:42 +0000, Alan Stern wrote:
> > In fact there already is a "fast suspend & resume" path in the PM core.
> > It's the freeze/thaw procedure used when starting to hibernate. The
> > documentation specifically says that drivers' freeze methods are
> > supposed to quiesce their devices but not change power levels. In
> > addition, the thaw method is invoked as part of recovery from a failed
> > hibernation attempt, so it already has the "cancel" semantics that xen
> > seems to want.
>
> Sounds like that would work and I would much prefer to simply make
> correct use of the core functionality.

It seems like a reasonable approach. Whether it will actually _work_
is a harder question... :-)

> So PMSG_FREEZE is balanced by either PMSG_RECOVER or PMSG_THAW depending
> on whether the suspend was cancelled or not?

Basically yes. It is also "balanced" by PMSG_RESTORE, which is used
after a memory image has been restored (although this isn't relevant to
your snapshotting). See the comments in include/linux/pm.h.

> So the sequence of events
> is something like:
> dpm_suspend_start(PMSG_FREEZE);
>
> dpm_suspend_noirq(PMSG_FREEZE);
>
> sysdev_suspend(PMSG_QUIESCE);

This should say sysdev_suspend(PMSG_FREEZE).

> cancelled = suspend_hypercall()

At this point swsusp_arch_suspend() is called. If that translates to
suspend_hypercall() in your setting, then yes.

> sysdev_resume();
>
> dpm_resume_noirq(cancelled ? PMSG_RECOVER : PMSG_THAW);
>
> dpm_resume_end(cancelled ? PMSG_RECOVER : PMSG_THAW);
> ?

Yes.

> (For comparison we currently have:
> > > > dpm_suspend_start(PMSG_SUSPEND);
> > > >
> > > > dpm_suspend_noirq(PMSG_SUSPEND);
> > > >
> > > > sysdev_suspend(PMSG_SUSPEND);
> > > > /* suspend hypercall */
> > > > sysdev_resume();
> > > >
> > > > dpm_resume_noirq(PMSG_RESUME);
> > > >
> > > > dpm_resume_end(PMSG_RESUME);
> )

Right. The sequence of calls is the same, but the PMSG_ argument is
different so drivers are expected to act differently in response.

Alan Stern

2011-02-10 16:26:51

by Rafael J. Wysocki

[permalink] [raw]

Subject: Re: [Xen-devel] Re: [linux-pm] [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Thursday, February 10, 2011, Alan Stern wrote:
> On Thu, 10 Feb 2011, Ian Campbell wrote:
>
> > On Wed, 2011-02-09 at 23:42 +0000, Alan Stern wrote:
> > > In fact there already is a "fast suspend & resume" path in the PM core.
> > > It's the freeze/thaw procedure used when starting to hibernate. The
> > > documentation specifically says that drivers' freeze methods are
> > > supposed to quiesce their devices but not change power levels. In
> > > addition, the thaw method is invoked as part of recovery from a failed
> > > hibernation attempt, so it already has the "cancel" semantics that xen
> > > seems to want.
> >
> > Sounds like that would work and I would much prefer to simply make
> > correct use of the core functionality.
>
> It seems like a reasonable approach. Whether it will actually _work_
> is a harder question... :-)
>
> > So PMSG_FREEZE is balanced by either PMSG_RECOVER or PMSG_THAW depending
> > on whether the suspend was cancelled or not?

That's correct, but from drivers' point of view PMSG_THAW is equivalent to
PMSG_RECOVER, because the both of them cause ->thaw() callbacks to be executed.

> Basically yes. It is also "balanced" by PMSG_RESTORE, which is used
> after a memory image has been restored (although this isn't relevant to
> your snapshotting). See the comments in include/linux/pm.h.

Yup.

> > So the sequence of events
> > is something like:
> > dpm_suspend_start(PMSG_FREEZE);
> >
> > dpm_suspend_noirq(PMSG_FREEZE);
> >
> > sysdev_suspend(PMSG_QUIESCE);
>
> This should say sysdev_suspend(PMSG_FREEZE).

Yes, PMSG_QUIESCE is restore-specific.

> > cancelled = suspend_hypercall()
>
> At this point swsusp_arch_suspend() is called. If that translates to
> suspend_hypercall() in your setting, then yes.
>
> > sysdev_resume();
> >
> > dpm_resume_noirq(cancelled ? PMSG_RECOVER : PMSG_THAW);
> >
> > dpm_resume_end(cancelled ? PMSG_RECOVER : PMSG_THAW);
> > ?
>
> Yes.

Actually, I think PMSG_THAW can be used in both cases. The resume-side
routines only use the 'state' argument for diagnostics.

> > (For comparison we currently have:
> > > > > dpm_suspend_start(PMSG_SUSPEND);
> > > > >
> > > > > dpm_suspend_noirq(PMSG_SUSPEND);
> > > > >
> > > > > sysdev_suspend(PMSG_SUSPEND);
> > > > > /* suspend hypercall */
> > > > > sysdev_resume();
> > > > >
> > > > > dpm_resume_noirq(PMSG_RESUME);
> > > > >
> > > > > dpm_resume_end(PMSG_RESUME);
> > )
>
> Right. The sequence of calls is the same, but the PMSG_ argument is
> different so drivers are expected to act differently in response.

That's correct.

Thanks,
Rafael

2011-02-10 16:35:14

by Ian Campbell

[permalink] [raw]

Subject: Re: [Xen-devel] Re: [linux-pm] [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Thu, 2011-02-10 at 11:00 -0500, Alan Stern wrote:
> On Thu, 10 Feb 2011, Ian Campbell wrote:
>
> > On Wed, 2011-02-09 at 23:42 +0000, Alan Stern wrote:
> > > In fact there already is a "fast suspend & resume" path in the PM core.
> > > It's the freeze/thaw procedure used when starting to hibernate. The
> > > documentation specifically says that drivers' freeze methods are
> > > supposed to quiesce their devices but not change power levels. In
> > > addition, the thaw method is invoked as part of recovery from a failed
> > > hibernation attempt, so it already has the "cancel" semantics that xen
> > > seems to want.
> >
> > Sounds like that would work and I would much prefer to simply make
> > correct use of the core functionality.
>
> It seems like a reasonable approach. Whether it will actually _work_
> is a harder question... :-)

Heh.

> > So PMSG_FREEZE is balanced by either PMSG_RECOVER or PMSG_THAW depending
> > on whether the suspend was cancelled or not?
>
> Basically yes. It is also "balanced" by PMSG_RESTORE, which is used
> after a memory image has been restored (although this isn't relevant to
> your snapshotting). See the comments in include/linux/pm.h.

The documentation of the individual events in pm.h is good. Is there a
reference for the sequence of events for the different types of
suspend/hibernate/etc?

> > So the sequence of events
> > is something like:
> > dpm_suspend_start(PMSG_FREEZE);
> >
> > dpm_suspend_noirq(PMSG_FREEZE);
> >
> > sysdev_suspend(PMSG_QUIESCE);
>
> This should say sysdev_suspend(PMSG_FREEZE).
>
> > cancelled = suspend_hypercall()
>
> At this point swsusp_arch_suspend() is called. If that translates to
> suspend_hypercall() in your setting, then yes.
>
> > sysdev_resume();
> >
> > dpm_resume_noirq(cancelled ? PMSG_RECOVER : PMSG_THAW);
> >
> > dpm_resume_end(cancelled ? PMSG_RECOVER : PMSG_THAW);
> > ?
>
> Yes.

Both of those call ->thaw ->complete. Did I mean "cancelled ?
PMSG_THAW : PMSG_RESTORE"? (or s/THAW/RECOVER?)

If the suspend was cancelled then we want the devices to simply pickup
where they were before the freeze, wereas if we really did suspend (or
migrate or whatever) then they need to do a more complete reset and
reconnect operation so we want some sort of indication to the driver
which happened.

> > (For comparison we currently have:
> > > > > dpm_suspend_start(PMSG_SUSPEND);
> > > > >
> > > > > dpm_suspend_noirq(PMSG_SUSPEND);
> > > > >
> > > > > sysdev_suspend(PMSG_SUSPEND);
> > > > > /* suspend hypercall */
> > > > > sysdev_resume();
> > > > >
> > > > > dpm_resume_noirq(PMSG_RESUME);
> > > > >
> > > > > dpm_resume_end(PMSG_RESUME);
> > )
>
> Right. The sequence of calls is the same, but the PMSG_ argument is
> different so drivers are expected to act differently in response.

The drivers don't actually see the PMSG_* though right? They only see a
differing sequence of hooks from dev_pm_ops called.

Thanks,
Ian.

>
> Alan Stern
>
>

--
Ian Campbell
Current Noise: Sworn Amongst - Useless

Accident, n.:
A condition in which presence of mind is good, but absence of
body is better.
-- Foolish Dictionary

2011-02-10 17:01:39

by Rafael J. Wysocki

[permalink] [raw]

Subject: Re: [Xen-devel] Re: [linux-pm] [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Thursday, February 10, 2011, Ian Campbell wrote:
> On Thu, 2011-02-10 at 11:00 -0500, Alan Stern wrote:
> > On Thu, 10 Feb 2011, Ian Campbell wrote:
> >
> > > On Wed, 2011-02-09 at 23:42 +0000, Alan Stern wrote:
> > > > In fact there already is a "fast suspend & resume" path in the PM core.
> > > > It's the freeze/thaw procedure used when starting to hibernate. The
> > > > documentation specifically says that drivers' freeze methods are
> > > > supposed to quiesce their devices but not change power levels. In
> > > > addition, the thaw method is invoked as part of recovery from a failed
> > > > hibernation attempt, so it already has the "cancel" semantics that xen
> > > > seems to want.
> > >
> > > Sounds like that would work and I would much prefer to simply make
> > > correct use of the core functionality.
> >
> > It seems like a reasonable approach. Whether it will actually _work_
> > is a harder question... :-)
>
> Heh.
>
> > > So PMSG_FREEZE is balanced by either PMSG_RECOVER or PMSG_THAW depending
> > > on whether the suspend was cancelled or not?
> >
> > Basically yes. It is also "balanced" by PMSG_RESTORE, which is used
> > after a memory image has been restored (although this isn't relevant to
> > your snapshotting). See the comments in include/linux/pm.h.
>
> The documentation of the individual events in pm.h is good. Is there a
> reference for the sequence of events for the different types of
> suspend/hibernate/etc?
>
> > > So the sequence of events
> > > is something like:
> > > dpm_suspend_start(PMSG_FREEZE);
> > >
> > > dpm_suspend_noirq(PMSG_FREEZE);
> > >
> > > sysdev_suspend(PMSG_QUIESCE);
> >
> > This should say sysdev_suspend(PMSG_FREEZE).
> >
> > > cancelled = suspend_hypercall()
> >
> > At this point swsusp_arch_suspend() is called. If that translates to
> > suspend_hypercall() in your setting, then yes.
> >
> > > sysdev_resume();
> > >
> > > dpm_resume_noirq(cancelled ? PMSG_RECOVER : PMSG_THAW);
> > >
> > > dpm_resume_end(cancelled ? PMSG_RECOVER : PMSG_THAW);
> > > ?
> >
> > Yes.
>
> Both of those call ->thaw ->complete. Did I mean "cancelled ?
> PMSG_THAW : PMSG_RESTORE"? (or s/THAW/RECOVER?)
>
> If the suspend was cancelled then we want the devices to simply pickup
> where they were before the freeze, wereas if we really did suspend (or
> migrate or whatever) then they need to do a more complete reset and
> reconnect operation so we want some sort of indication to the driver
> which happened.

In that case you should probably use PMSG_THAW (or PMSG_RECOVER) for the
"cancel" case and PMSG_RESTORE for the "success" case (pretty much what
hibernation does).

And please don't forget to update the comments in pm.h to cover your usage
case. :-)

> > > (For comparison we currently have:
> > > > > > dpm_suspend_start(PMSG_SUSPEND);
> > > > > >
> > > > > > dpm_suspend_noirq(PMSG_SUSPEND);
> > > > > >
> > > > > > sysdev_suspend(PMSG_SUSPEND);
> > > > > > /* suspend hypercall */
> > > > > > sysdev_resume();
> > > > > >
> > > > > > dpm_resume_noirq(PMSG_RESUME);
> > > > > >
> > > > > > dpm_resume_end(PMSG_RESUME);
> > > )
> >
> > Right. The sequence of calls is the same, but the PMSG_ argument is
> > different so drivers are expected to act differently in response.
>
> The drivers don't actually see the PMSG_* though right? They only see a
> differing sequence of hooks from dev_pm_ops called.

That's correct.

Thanks,
Rafael

2011-02-10 17:53:47

by Brendan Cully

[permalink] [raw]

Subject: Re: [Xen-devel] Re: [linux-pm] [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Thursday, 10 February 2011 at 11:31, Ian Campbell wrote:
> On Wed, 2011-02-09 at 23:16 +0000, Brendan Cully wrote:
> > I'd like to keep the fast resume option, and expect that it can be
> > contained entirely in Xen-specific code. I'll try to get someone to
> > look into it here.
>
> Thanks, please put them in contact with Kazuhiro who has already been
> looking into this.
>
> > I think fast resume is somewhat orthogonal to the problem of hanging
> > on resume, which just sounds like a xen-specific bug in the slow
> > path.
>
> The bug on the Xen side is taking the slow path when the suspend was
> actually cancelled, which is what the original patch tries to fix.
>
> Or should/could the slow path be able cope either way?

I haven't looked at what's actually happening yet, but what should be
happening is that the pvops kernel should not advertise SUSPEND_CANCEL
since it doesn't yet support it. So xm save -c should call
xc_domain_resume with the fast argument set to 0, to do slow path
resumption, and that slow path should be able to handle resuming on
the same host where it suspended. Slow path resumption from checkpoint
was supported in the pre-pvops kernel.

2011-02-10 18:56:33

by Alan Stern

[permalink] [raw]

Subject: Re: [Xen-devel] Re: [linux-pm] [PATCH 0/2] Fix hangup after creating checkpoint on Xen.

On Thu, 10 Feb 2011, Ian Campbell wrote:

> The documentation of the individual events in pm.h is good. Is there a
> reference for the sequence of events for the different types of
> suspend/hibernate/etc?

There's Documentation/power/devices.txt.

Alan Stern