Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
sched_clock() interface") broke Xen guest time handling across
migration:
[ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 187.251137] OOM killer disabled.
[ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[ 187.252299] suspending xenstore...
[ 187.266987] xen:grant_table: Grant tables using version 1 layout
[18446743811.706476] OOM killer enabled.
[18446743811.706478] Restarting tasks ... done.
[18446743811.720505] Setting capacity to 16777216
Fix that by setting xen_sched_clock_offset at resume time to ensure a
monotonic clock value.
Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface")
Cc: <[email protected]> # 4.11
Reported-by: Hans van Kranenburg <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>
---
arch/x86/xen/time.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index 72bf446c3fee..6e29794573b7 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -361,8 +361,6 @@ void xen_timer_resume(void)
{
int cpu;
- pvclock_resume();
-
if (xen_clockevent != &xen_vcpuop_clockevent)
return;
@@ -379,12 +377,15 @@ static const struct pv_time_ops xen_time_ops __initconst = {
};
static struct pvclock_vsyscall_time_info *xen_clock __read_mostly;
+static u64 xen_clock_value_saved;
void xen_save_time_memory_area(void)
{
struct vcpu_register_time_memory_area t;
int ret;
+ xen_clock_value_saved = xen_clocksource_read() - xen_sched_clock_offset;
+
if (!xen_clock)
return;
@@ -404,7 +405,7 @@ void xen_restore_time_memory_area(void)
int ret;
if (!xen_clock)
- return;
+ goto out;
t.addr.v = &xen_clock->pvti;
@@ -421,6 +422,11 @@ void xen_restore_time_memory_area(void)
if (ret != 0)
pr_notice("Cannot restore secondary vcpu_time_info (err %d)",
ret);
+
+out:
+ /* Need pvclock_resume() before using xen_clocksource_read(). */
+ pvclock_resume();
+ xen_sched_clock_offset = xen_clocksource_read() - xen_clock_value_saved;
}
static void xen_setup_vsyscall_time_info(void)
--
2.16.4
On 1/14/19 7:44 AM, Juergen Gross wrote:
> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
> sched_clock() interface") broke Xen guest time handling across
> migration:
>
> [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
> [ 187.251137] OOM killer disabled.
> [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> [ 187.252299] suspending xenstore...
> [ 187.266987] xen:grant_table: Grant tables using version 1 layout
> [18446743811.706476] OOM killer enabled.
> [18446743811.706478] Restarting tasks ... done.
> [18446743811.720505] Setting capacity to 16777216
>
> Fix that by setting xen_sched_clock_offset at resume time to ensure a
> monotonic clock value.
>
> Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface")
> Cc: <[email protected]> # 4.11
> Reported-by: Hans van Kranenburg <[email protected]>
> Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
On 15/01/2019 11:43, Thomas Gleixner wrote:
> On Mon, 14 Jan 2019, Juergen Gross wrote:
>
>> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
>> sched_clock() interface") broke Xen guest time handling across
>> migration:
>>
>> [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
>> [ 187.251137] OOM killer disabled.
>> [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>> [ 187.252299] suspending xenstore...
>> [ 187.266987] xen:grant_table: Grant tables using version 1 layout
>> [18446743811.706476] OOM killer enabled.
>> [18446743811.706478] Restarting tasks ... done.
>> [18446743811.720505] Setting capacity to 16777216
>
> I see that it's broken, but the changelog could do with an explanation WHY
> it broke.
This seems to be rather complex.
I believe the mentioned commit just ignored Xen guests resulting in a
"stable" clock where it shouldn't, but maybe I have missed another
aspect of this commit which is to blame. I tried to fix that by
replacing using_native_sched_clock() with a hypervisor specific pvops
function.
Unfortunately this didn't work, maybe due to other uses of
using_native_sched_clock() added by later patches. In the end it was
quite clear that updating the Xen clock offset was the right thing to
do, so I ended up with this patch.
Juergen
On Mon, 14 Jan 2019, Juergen Gross wrote:
> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
> sched_clock() interface") broke Xen guest time handling across
> migration:
>
> [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
> [ 187.251137] OOM killer disabled.
> [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> [ 187.252299] suspending xenstore...
> [ 187.266987] xen:grant_table: Grant tables using version 1 layout
> [18446743811.706476] OOM killer enabled.
> [18446743811.706478] Restarting tasks ... done.
> [18446743811.720505] Setting capacity to 16777216
I see that it's broken, but the changelog could do with an explanation WHY
it broke.
Other than that this looks about right.
Thanks,
tglx
Hi,
On 1/14/19 1:44 PM, Juergen Gross wrote:
> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
> sched_clock() interface") broke Xen guest time handling across
> migration:
>
> [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
> [ 187.251137] OOM killer disabled.
> [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> [ 187.252299] suspending xenstore...
> [ 187.266987] xen:grant_table: Grant tables using version 1 layout
> [18446743811.706476] OOM killer enabled.
> [18446743811.706478] Restarting tasks ... done.
> [18446743811.720505] Setting capacity to 16777216
>
> Fix that by setting xen_sched_clock_offset at resume time to ensure a
> monotonic clock value.
>
> [...]
With v3 of the patch, I see the time jump in one log line happen, but
only when using PVH.
[ 49.486453] Freezing user space processes ... (elapsed 0.002 seconds)
done.
[ 49.488743] OOM killer disabled.
[ 49.488764] Freezing remaining freezable tasks ... (elapsed 0.001
seconds) done.
[ 49.491117] suspending xenstore...
[2000731.388722] xen:events: Xen HVM callback vector for event delivery
is enabled
[ 49.491750] xen:grant_table: Grant tables using version 1 layout
[ 49.810722] OOM killer enabled.
[ 49.810744] Restarting tasks ... done.
[ 49.856263] Setting capacity to 6291456
[ 50.006002] Setting capacity to 10485760
If I start as PV, it never seems to happen.
Up to you to decide how important this is. :)
FYI this is with v3 on top of the Debian stretch-backports 4.19 kernel,
which I'm starting to use now to reboot things with.
-# uname -a
Linux appnode-kylie 4.19.0-0.bpo.1-cloud-amd64 #1 SMP Debian
4.19.12-1~bpo9+1+mendix1 (2019-01-15) x86_64 GNU/Linux
https://salsa.debian.org/knorrie-guest/linux/commits/mendix/stretch-backports
Hans
On 16/01/2019 01:24, Hans van Kranenburg wrote:
> Hi,
>
> On 1/14/19 1:44 PM, Juergen Gross wrote:
>> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
>> sched_clock() interface") broke Xen guest time handling across
>> migration:
>>
>> [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
>> [ 187.251137] OOM killer disabled.
>> [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>> [ 187.252299] suspending xenstore...
>> [ 187.266987] xen:grant_table: Grant tables using version 1 layout
>> [18446743811.706476] OOM killer enabled.
>> [18446743811.706478] Restarting tasks ... done.
>> [18446743811.720505] Setting capacity to 16777216
>>
>> Fix that by setting xen_sched_clock_offset at resume time to ensure a
>> monotonic clock value.
>>
>> [...]
>
> With v3 of the patch, I see the time jump in one log line happen, but
> only when using PVH.
>
> [ 49.486453] Freezing user space processes ... (elapsed 0.002 seconds)
> done.
> [ 49.488743] OOM killer disabled.
> [ 49.488764] Freezing remaining freezable tasks ... (elapsed 0.001
> seconds) done.
> [ 49.491117] suspending xenstore...
> [2000731.388722] xen:events: Xen HVM callback vector for event delivery
> is enabled
> [ 49.491750] xen:grant_table: Grant tables using version 1 layout
> [ 49.810722] OOM killer enabled.
> [ 49.810744] Restarting tasks ... done.
> [ 49.856263] Setting capacity to 6291456
> [ 50.006002] Setting capacity to 10485760
>
> If I start as PV, it never seems to happen.
>
> Up to you to decide how important this is. :)
We could do something like below. Boris?
Juergen
---
diff --git a/arch/x86/xen/suspend_hvm.c b/arch/x86/xen/suspend_hvm.c
index e666b614cf6d..088f3a6b4be9 100644
--- a/arch/x86/xen/suspend_hvm.c
+++ b/arch/x86/xen/suspend_hvm.c
@@ -13,6 +13,6 @@ void xen_hvm_post_suspend(int suspend_cancelled)
xen_hvm_init_shared_info();
xen_vcpu_restore();
}
- xen_callback_vector();
+ xen_callback_vector(true);
xen_unplug_emulated_devices();
}
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index 0e60bd918695..ba293fda3265 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -55,7 +55,7 @@ void xen_enable_sysenter(void);
void xen_enable_syscall(void);
void xen_vcpu_restore(void);
-void xen_callback_vector(void);
+void xen_callback_vector(bool silent);
void xen_hvm_init_shared_info(void);
void xen_unplug_emulated_devices(void);
diff --git a/drivers/xen/events/events_base.c
b/drivers/xen/events/events_base.c
index 93194f3e7540..8d8d50bea215 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1637,7 +1637,7 @@ EXPORT_SYMBOL_GPL(xen_set_callback_via);
/* Vector callbacks are better than PCI interrupts to receive event
* channel notifications because we can receive vector callbacks on any
* vcpu and we don't need PCI support or APIC interactions. */
-void xen_callback_vector(void)
+void xen_callback_vector(bool silent)
{
int rc;
uint64_t callback_via;
@@ -1650,13 +1650,14 @@ void xen_callback_vector(void)
xen_have_vector_callback = 0;
return;
}
- pr_info("Xen HVM callback vector for event delivery is
enabled\n");
+ if (!silent)
+ pr_info("Xen HVM callback vector for event
delivery is enabled\n");
alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR,
xen_hvm_callback_vector);
}
}
#else
-void xen_callback_vector(void) {}
+void xen_callback_vector(bool silent) {}
#endif
#undef MODULE_PARAM_PREFIX
@@ -1692,7 +1693,7 @@ void __init xen_init_IRQ(void)
pci_xen_initial_domain();
}
if (xen_feature(XENFEAT_hvm_callback_vector))
- xen_callback_vector();
+ xen_callback_vector(false);
if (xen_hvm_domain()) {
native_init_IRQ();
Hi Boris,
On 1/14/19 2:54 PM, Boris Ostrovsky wrote:
> On 1/14/19 7:44 AM, Juergen Gross wrote:
>> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
>> sched_clock() interface") broke Xen guest time handling across
>> migration:
>>
>> [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
>> [ 187.251137] OOM killer disabled.
>> [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>> [ 187.252299] suspending xenstore...
>> [ 187.266987] xen:grant_table: Grant tables using version 1 layout
>> [18446743811.706476] OOM killer enabled.
>> [18446743811.706478] Restarting tasks ... done.
>> [18446743811.720505] Setting capacity to 16777216
>>
>> Fix that by setting xen_sched_clock_offset at resume time to ensure a
>> monotonic clock value.
>>
>> Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface")
>> Cc: <[email protected]> # 4.11
>> Reported-by: Hans van Kranenburg <[email protected]>
>> Signed-off-by: Juergen Gross <[email protected]>
>
> Reviewed-by: Boris Ostrovsky <[email protected]>
Can you please change the address to my work email?
Reported-by: Hans van Kranenburg <[email protected]>
Also (see other email):
Tested-by: Hans van Kranenburg <[email protected]>
Thanks,
Hans
On 16/01/2019 14:17, Boris Ostrovsky wrote:
> On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:
>
>> @@ -1650,13 +1650,14 @@ void xen_callback_vector(void)
>> xen_have_vector_callback = 0;
>> return;
>> }
>> - pr_info("Xen HVM callback vector for event delivery is
>> enabled\n");
>> + if (!silent)
>> + pr_info("Xen HVM callback vector for event
>> delivery is enabled\n");
>
> How about replacing pr_info() with pr_info_once()?
What a nice and simple idea!
Extra patch or V4?
Juergen
On 1/16/19 10:29 AM, Juergen Gross wrote:
> On 16/01/2019 16:07, Boris Ostrovsky wrote:
>> On 1/16/19 9:33 AM, Juergen Gross wrote:
>>> On 16/01/2019 14:17, Boris Ostrovsky wrote:
>>>> On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:
>>>>
>>>>> @@ -1650,13 +1650,14 @@ void xen_callback_vector(void)
>>>>> xen_have_vector_callback = 0;
>>>>> return;
>>>>> }
>>>>> - pr_info("Xen HVM callback vector for event delivery is
>>>>> enabled\n");
>>>>> + if (!silent)
>>>>> + pr_info("Xen HVM callback vector for event
>>>>> delivery is enabled\n");
>>>> How about replacing pr_info() with pr_info_once()?
>>> What a nice and simple idea!
>>>
>>> Extra patch or V4?
>>>
>>
>> I can add this while committing, I don't think it's worth a whole new patch.
>>
>> One outstanding question I have is whether anything needs to be added to
>> the commit message (Thomas had some questions)
> He didn't react to my explanation. I'm interpreting that as him being
> fine with my explanation, which I believe is not suitable to be added
> to the commit message.
OK. Applied to (now properly named) for-linus-5.0
-boris
On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:
> @@ -1650,13 +1650,14 @@ void xen_callback_vector(void)
> xen_have_vector_callback = 0;
> return;
> }
> - pr_info("Xen HVM callback vector for event delivery is
> enabled\n");
> + if (!silent)
> + pr_info("Xen HVM callback vector for event
> delivery is enabled\n");
How about replacing pr_info() with pr_info_once()?
-boris
On 1/16/19 9:33 AM, Juergen Gross wrote:
> On 16/01/2019 14:17, Boris Ostrovsky wrote:
>> On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:
>>
>>> @@ -1650,13 +1650,14 @@ void xen_callback_vector(void)
>>> xen_have_vector_callback = 0;
>>> return;
>>> }
>>> - pr_info("Xen HVM callback vector for event delivery is
>>> enabled\n");
>>> + if (!silent)
>>> + pr_info("Xen HVM callback vector for event
>>> delivery is enabled\n");
>> How about replacing pr_info() with pr_info_once()?
> What a nice and simple idea!
>
> Extra patch or V4?
>
I can add this while committing, I don't think it's worth a whole new patch.
One outstanding question I have is whether anything needs to be added to
the commit message (Thomas had some questions)
-boris
On 16/01/2019 16:07, Boris Ostrovsky wrote:
> On 1/16/19 9:33 AM, Juergen Gross wrote:
>> On 16/01/2019 14:17, Boris Ostrovsky wrote:
>>> On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:
>>>
>>>> @@ -1650,13 +1650,14 @@ void xen_callback_vector(void)
>>>> xen_have_vector_callback = 0;
>>>> return;
>>>> }
>>>> - pr_info("Xen HVM callback vector for event delivery is
>>>> enabled\n");
>>>> + if (!silent)
>>>> + pr_info("Xen HVM callback vector for event
>>>> delivery is enabled\n");
>>> How about replacing pr_info() with pr_info_once()?
>> What a nice and simple idea!
>>
>> Extra patch or V4?
>>
>
>
> I can add this while committing, I don't think it's worth a whole new patch.
>
> One outstanding question I have is whether anything needs to be added to
> the commit message (Thomas had some questions)
He didn't react to my explanation. I'm interpreting that as him being
fine with my explanation, which I believe is not suitable to be added
to the commit message.
Juergen