2010-08-20 08:08:34

by Zachary Amsden

Subject: [KVM timekeeping 12/35] Robust TSC compensation

Make the match of TSC find TSC writes that are close to each other
instead of perfectly identical; this allows the compensator to also
work in migration / suspend scenarios.

Signed-off-by: Zachary Amsden <[email protected]>
---
arch/x86/kvm/x86.c | 14 ++++++++++----
1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 52680f6..0f3e5fb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -928,21 +928,27 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, u64 data)
struct kvm *kvm = vcpu->kvm;
u64 offset, ns, elapsed;
unsigned long flags;
+ s64 sdiff;

spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
offset = data - native_read_tsc();
ns = get_kernel_ns();
elapsed = ns - kvm->arch.last_tsc_nsec;
+ sdiff = data - kvm->arch.last_tsc_write;
+ if (sdiff < 0)
+ sdiff = -sdiff;

/*
- * Special case: identical write to TSC within 5 seconds of
+ * Special case: close write to TSC within 5 seconds of
* another CPU is interpreted as an attempt to synchronize
- * (the 5 seconds is to accomodate host load / swapping).
+ * The 5 seconds is to accomodate host load / swapping as
+ * well as any reset of TSC during the boot process.
*
* In that case, for a reliable TSC, we can match TSC offsets,
- * or make a best guest using kernel_ns value.
+ * or make a best guest using elapsed value.
*/
- if (data == kvm->arch.last_tsc_write && elapsed < 5ULL * NSEC_PER_SEC) {
+ if (sdiff < nsec_to_cycles(5ULL * NSEC_PER_SEC) &&
+ elapsed < 5ULL * NSEC_PER_SEC) {
if (!check_tsc_unstable()) {
offset = kvm->arch.last_tsc_offset;
pr_debug("kvm: matched tsc offset for %llu\n", data);
--
1.7.1
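
Editorial sketch (not part of the patch): the matching rule the patch introduces can be tried outside the kernel. The program fragment below is a minimal userspace approximation, under stated assumptions. `nsec_to_cycles_at()` is a hypothetical stand-in for the kernel's `nsec_to_cycles()` that takes the TSC frequency in kHz as an explicit parameter, and `tsc_write_matches()` is an illustrative name for the condition in the `if` statement; neither name exists in the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define USEC_PER_SEC 1000000ULL

/*
 * Hypothetical stand-in for the kernel's nsec_to_cycles(): the TSC
 * frequency is passed in explicitly (in kHz) rather than read from
 * per-CPU state. cycles = ns * kHz / 10^6.
 */
static uint64_t nsec_to_cycles_at(uint64_t nsec, uint64_t tsc_khz)
{
	return nsec * tsc_khz / USEC_PER_SEC;
}

/*
 * Illustrative version of the patched condition: a TSC write counts as
 * a synchronization attempt when it is within 5 seconds' worth of
 * cycles of the previous write AND arrives within 5 seconds of it.
 */
static bool tsc_write_matches(uint64_t data, uint64_t last_write,
			      uint64_t elapsed_ns, uint64_t tsc_khz)
{
	/* Absolute difference, computed in unsigned arithmetic so that
	 * the negation is well defined for every input. */
	uint64_t diff = data - last_write;

	if ((int64_t)diff < 0)
		diff = -diff;

	return diff < nsec_to_cycles_at(5ULL * NSEC_PER_SEC, tsc_khz) &&
	       elapsed_ns < 5ULL * NSEC_PER_SEC;
}
```

With an assumed 2 GHz TSC (`tsc_khz = 2000000`), the cycle window is 10,000,000,000 cycles, so a guest write of 0 shortly after a previous write of 5,000,000,000 would still be treated as a match, whereas the old `data == last_tsc_write` test would reject it.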


2010-08-20 17:40:57

by Glauber Costa

Subject: Re: [KVM timekeeping 12/35] Robust TSC compensation

On Thu, Aug 19, 2010 at 10:07:26PM -1000, Zachary Amsden wrote:
> Make the match of TSC find TSC writes that are close to each other
> instead of perfectly identical; this allows the compensator to also
> work in migration / suspend scenarios.
>
> Signed-off-by: Zachary Amsden <[email protected]>
> ---
> arch/x86/kvm/x86.c | 14 ++++++++++----
> 1 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 52680f6..0f3e5fb 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -928,21 +928,27 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, u64 data)
> struct kvm *kvm = vcpu->kvm;
> u64 offset, ns, elapsed;
> unsigned long flags;
> + s64 sdiff;
>
> spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
> offset = data - native_read_tsc();
> ns = get_kernel_ns();
> elapsed = ns - kvm->arch.last_tsc_nsec;
> + sdiff = data - kvm->arch.last_tsc_write;
> + if (sdiff < 0)
> + sdiff = -sdiff;
>
> /*
> - * Special case: identical write to TSC within 5 seconds of
> + * Special case: close write to TSC within 5 seconds of
> * another CPU is interpreted as an attempt to synchronize
> - * (the 5 seconds is to accomodate host load / swapping).
> + * The 5 seconds is to accomodate host load / swapping as
> + * well as any reset of TSC during the boot process.
> *
> * In that case, for a reliable TSC, we can match TSC offsets,
> - * or make a best guest using kernel_ns value.
> + * or make a best guest using elapsed value.
> */
> - if (data == kvm->arch.last_tsc_write && elapsed < 5ULL * NSEC_PER_SEC) {
> + if (sdiff < nsec_to_cycles(5ULL * NSEC_PER_SEC) &&
> + elapsed < 5ULL * NSEC_PER_SEC) {
> if (!check_tsc_unstable()) {

Isn't 5 way too long for this case?

2010-08-24 01:01:42

by Zachary Amsden

Subject: Re: [KVM timekeeping 12/35] Robust TSC compensation

On 08/20/2010 07:40 AM, Glauber Costa wrote:
> On Thu, Aug 19, 2010 at 10:07:26PM -1000, Zachary Amsden wrote:
>
>> Make the match of TSC find TSC writes that are close to each other
>> instead of perfectly identical; this allows the compensator to also
>> work in migration / suspend scenarios.
>>
>> Signed-off-by: Zachary Amsden <[email protected]>
>> ---
>> arch/x86/kvm/x86.c | 14 ++++++++++----
>> 1 files changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 52680f6..0f3e5fb 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -928,21 +928,27 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, u64 data)
>> struct kvm *kvm = vcpu->kvm;
>> u64 offset, ns, elapsed;
>> unsigned long flags;
>> + s64 sdiff;
>>
>> spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
>> offset = data - native_read_tsc();
>> ns = get_kernel_ns();
>> elapsed = ns - kvm->arch.last_tsc_nsec;
>> + sdiff = data - kvm->arch.last_tsc_write;
>> + if (sdiff < 0)
>> + sdiff = -sdiff;
>>
>> /*
>> - * Special case: identical write to TSC within 5 seconds of
>> + * Special case: close write to TSC within 5 seconds of
>> * another CPU is interpreted as an attempt to synchronize
>> - * (the 5 seconds is to accomodate host load / swapping).
>> + * The 5 seconds is to accomodate host load / swapping as
>> + * well as any reset of TSC during the boot process.
>> *
>> * In that case, for a reliable TSC, we can match TSC offsets,
>> - * or make a best guest using kernel_ns value.
>> + * or make a best guest using elapsed value.
>> */
>> - if (data == kvm->arch.last_tsc_write && elapsed < 5ULL * NSEC_PER_SEC) {
>> + if (sdiff < nsec_to_cycles(5ULL * NSEC_PER_SEC) &&
>> + elapsed < 5ULL * NSEC_PER_SEC) {
>> if (!check_tsc_unstable()) {
>>
> Isn't 5 way too long for this case?
>
>
>

It was actually too short for a while, and I didn't realize why until I
discovered that, on SVM, the APs were getting their TSC reset after the
startup IPI.

In any case, the value is certainly up for debate. I chose a large
number because there is no telling how far off things can get in the
case of host overcommit / swapping.

Zach

2010-08-24 21:33:20

by Daniel Verkamp

Subject: Re: [KVM timekeeping 12/35] Robust TSC compensation

On Fri, Aug 20, 2010 at 1:07 AM, Zachary Amsden <[email protected]> wrote:
[...]
> +        * or make a best guest using elapsed value.

Perhaps s/guest/guess/ while the line is changing anyway.