2015-04-28 22:48:36

by Boris Ostrovsky

Subject: [PATCH v3] xen: Suspend ticks on all CPUs during suspend

Commit 77e32c89a711 ("clockevents: Manage device's state separately for
the core") decouples a clockevent device's modes from its states. With this
change, a resuming Xen guest no longer calls its set_mode op, which must
run on each VCPU to make the hypervisor aware that we are in oneshot
mode.

This happens because clockevents_tick_resume() (which is an intermediate
step of resuming ticks on a processor) doesn't call clockevents_set_state()
anymore and because during suspend clockevent devices on all VCPUs (except
for the one doing the suspend) are left in ONESHOT state. As a result, during
resume the clockevents state machine will assume that the device is already
where it should be and doesn't need to be updated.

To avoid this problem we should suspend ticks on all VCPUs during
suspend.

Signed-off-by: Boris Ostrovsky <[email protected]>
---
v3: Call xen_arch_resume() before out_resume label.

v2: Call xen_arch_resume() regardless of si.cancelled value. We can't call
xen_arch_suspend() after stop_machine() since we need to have all VCPUs
still alive and so we stop the tick even though subsequent xen_suspend()
might fail later and set si.cancelled. Thus we call xen_arch_resume()
(which really only resumes the tick) unconditionally.
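
For illustration only, the failure mode described in the changelog can be
modeled with a toy state machine. This is a sketch, not kernel code: every
name below (toy_clockevent, toy_set_state, toy_cycle, hv_programmed) is
hypothetical, and the only assumption carried over from the changelog is
that the state setter skips the device callback when the requested state
equals the tracked one.

```c
/* Toy model of the problem; all names here are hypothetical. */
enum toy_state { TOY_SHUTDOWN, TOY_ONESHOT };

struct toy_clockevent {
	enum toy_state state;	/* state as tracked by the core */
	int hv_programmed;	/* times the "hypervisor" was told: oneshot */
};

/* Assumed behavior: no callback when the tracked state already matches. */
static void toy_set_state(struct toy_clockevent *dev, enum toy_state state)
{
	if (dev->state == state)
		return;		/* core believes the device is already there */
	dev->state = state;
	if (state == TOY_ONESHOT)
		dev->hv_programmed++;
}

static void toy_tick_suspend(struct toy_clockevent *dev)
{
	toy_set_state(dev, TOY_SHUTDOWN);
}

static void toy_tick_resume(struct toy_clockevent *dev)
{
	toy_set_state(dev, TOY_ONESHOT);
}

/*
 * Run one suspend/resume cycle on a device that starts in ONESHOT and
 * return how many times the device was reprogrammed on resume.
 */
int toy_cycle(int suspend_first)
{
	struct toy_clockevent dev = { TOY_ONESHOT, 0 };

	if (suspend_first)
		toy_tick_suspend(&dev);
	toy_tick_resume(&dev);
	return dev.hv_programmed;
}
```

In this model toy_cycle(0) reprograms nothing, because the device never left
ONESHOT, while toy_cycle(1) reprograms once. That is the reason for
suspending the tick on every VCPU and not only on the one doing the suspend.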

arch/x86/xen/suspend.c | 10 ++++++++++
drivers/xen/manage.c | 9 ++++++---
include/xen/xen-ops.h | 1 +
3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
index d949769..53b4c08 100644
--- a/arch/x86/xen/suspend.c
+++ b/arch/x86/xen/suspend.c
@@ -88,7 +88,17 @@ static void xen_vcpu_notify_restore(void *data)
 	tick_resume_local();
 }

+static void xen_vcpu_notify_suspend(void *data)
+{
+	tick_suspend_local();
+}
+
 void xen_arch_resume(void)
 {
 	on_each_cpu(xen_vcpu_notify_restore, NULL, 1);
 }
+
+void xen_arch_suspend(void)
+{
+	on_each_cpu(xen_vcpu_notify_suspend, NULL, 1);
+}
diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index bf19407..9e6a851 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -131,6 +131,8 @@ static void do_suspend(void)
 		goto out_resume;
 	}

+	xen_arch_suspend();
+
 	si.cancelled = 1;

 	err = stop_machine(xen_suspend, &si, cpumask_of(0));
@@ -148,11 +150,12 @@ static void do_suspend(void)
 		si.cancelled = 1;
 	}

+	xen_arch_resume();
+
 out_resume:
-	if (!si.cancelled) {
-		xen_arch_resume();
+	if (!si.cancelled)
 		xs_resume();
-	} else
+	else
 		xs_suspend_cancel();

 	dpm_resume_end(si.cancelled ? PMSG_THAW : PMSG_RESTORE);
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index c643e6a..0ce4f32 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -13,6 +13,7 @@ void xen_arch_post_suspend(int suspend_cancelled);

 void xen_timer_resume(void);
 void xen_arch_resume(void);
+void xen_arch_suspend(void);

 void xen_resume_notifier_register(struct notifier_block *nb);
 void xen_resume_notifier_unregister(struct notifier_block *nb);
--
1.7.1


2015-04-29 16:10:39

by David Vrabel

Subject: Re: [Xen-devel] [PATCH v3] xen: Suspend ticks on all CPUs during suspend

On 28/04/15 23:46, Boris Ostrovsky wrote:
> Commit 77e32c89a711 ("clockevents: Manage device's state separately for
> the core") decouples clockevent device's modes from states. With this
> change when a Xen guest tries to resume, it won't be calling its
> set_mode op which needs to be done on each VCPU in order to make the
> hypervisor aware that we are in oneshot mode.
>
> This happens because clockevents_tick_resume() (which is an intermediate
> step of resuming ticks on a processor) doesn't call clockevents_set_state()
> anymore and because during suspend clockevent devices on all VCPUs (except
> for the one doing the suspend) are left in ONESHOT state. As a result, during
> resume the clockevents state machine will assume that the device is already
> where it should be and doesn't need to be updated.
>
> To avoid this problem we should suspend ticks on all VCPUs during
> suspend.

Applied to for-linus-4.1b, thanks.

David