2012-02-13 15:46:37

by Igor Mammedov

Subject: [PATCH RFC] pvclock: Make pv_clock more robust and fixup it if overflow happens

Instead of hunting mysterious stalls/hangs all over the kernel when
overflow occurs at pvclock.c:pvclock_get_nsec_offset

    u64 delta = native_read_tsc() - shadow->tsc_timestamp;

and introducing hooks wherever places of unexpected access are found,
pv_clock should be initialized for the calling cpu if an overflow
condition is detected.

Signed-off-by: Igor Mammedov <[email protected]>
---
arch/x86/kernel/pvclock.c | 18 +++++++++++++++---
1 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/pvclock.c b/arch/x86/kernel/pvclock.c
index 42eb330..b486756 100644
--- a/arch/x86/kernel/pvclock.c
+++ b/arch/x86/kernel/pvclock.c
@@ -41,9 +41,14 @@ void pvclock_set_flags(u8 flags)
 	valid_flags = flags;
 }

-static u64 pvclock_get_nsec_offset(struct pvclock_shadow_time *shadow)
+static u64 pvclock_get_nsec_offset(struct pvclock_shadow_time *shadow,
+				   bool *overflow)
 {
-	u64 delta = native_read_tsc() - shadow->tsc_timestamp;
+	u64 delta;
+	u64 tsc = native_read_tsc();
+	u64 shadow_timestamp = shadow->tsc_timestamp;
+	*overflow = tsc < shadow_timestamp;
+	delta = tsc - shadow_timestamp;
 	return pvclock_scale_delta(delta, shadow->tsc_to_nsec_mul,
 				   shadow->tsc_shift);
 }
@@ -94,12 +99,19 @@ cycle_t pvclock_clocksource_read(struct pvclock_vcpu_time_info *src)
 	unsigned version;
 	cycle_t ret, offset;
 	u64 last;
+	bool overflow;

 	do {
 		version = pvclock_get_time_values(&shadow, src);
 		barrier();
-		offset = pvclock_get_nsec_offset(&shadow);
+		offset = pvclock_get_nsec_offset(&shadow, &overflow);
 		ret = shadow.system_timestamp + offset;
+		if (unlikely(overflow)) {
+			memset(src, 0, sizeof(*src));
+			barrier();
+			x86_cpuinit.early_percpu_clock_init();
+			continue;
+		}
 		barrier();
 	} while (version != src->version);

--
1.7.7.6
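
For readers unfamiliar with the failure mode, the sketch below models the delta computation in userspace. It only illustrates why a TSC reading smaller than the shadow timestamp yields a huge offset rather than a negative one. The helper names tsc_delta() and tsc_delta_overflowed() are hypothetical and do not exist in the kernel; the second one mirrors the condition the patch stores in *overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model of the delta computation in pvclock_get_nsec_offset().
 * tsc_now and tsc_timestamp stand in for native_read_tsc() and
 * shadow->tsc_timestamp. Both helpers are hypothetical names.
 */
static uint64_t tsc_delta(uint64_t tsc_now, uint64_t tsc_timestamp)
{
	/* Unsigned subtraction wraps modulo 2^64 instead of going negative. */
	return tsc_now - tsc_timestamp;
}

/* Same condition the patch records in *overflow. */
static bool tsc_delta_overflowed(uint64_t tsc_now, uint64_t tsc_timestamp)
{
	return tsc_now < tsc_timestamp;
}
```

With a timestamp only 100 cycles ahead of the current reading, tsc_delta() returns a value just below 2^64, which is then scaled into nanoseconds and added to the system timestamp. That is consistent with the stalls described in the changelog.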


2012-02-13 17:51:23

by Marcelo Tosatti

Subject: Re: [PATCH RFC] pvclock: Make pv_clock more robust and fixup it if overflow happens

On Mon, Feb 13, 2012 at 04:45:59PM +0100, Igor Mammedov wrote:
> Instead of hunting mysterious stalls/hangs all over the kernel when
> overflow occurs at pvclock.c:pvclock_get_nsec_offset
>
>     u64 delta = native_read_tsc() - shadow->tsc_timestamp;
>
> and introducing hooks wherever places of unexpected access are found,
> pv_clock should be initialized for the calling cpu if an overflow
> condition is detected.
>
> Signed-off-by: Igor Mammedov <[email protected]>

Igor,

I disagree. This fixes the symptom, not the root cause. Additionally,
Xen also uses pvclock_clocksource_read.

How about adding a BUG_ON to detect the overflow? That way, hunting for
the problem is not necessary.


2012-02-13 18:15:29

by Igor Mammedov

Subject: Re: [PATCH RFC] pvclock: Make pv_clock more robust and fixup it if overflow happens

On 02/13/2012 06:48 PM, Marcelo Tosatti wrote:
> On Mon, Feb 13, 2012 at 04:45:59PM +0100, Igor Mammedov wrote:
>> Instead of hunting mysterious stalls/hangs all over the kernel when
>> overflow occurs at pvclock.c:pvclock_get_nsec_offset
>>
>>     u64 delta = native_read_tsc() - shadow->tsc_timestamp;
>>
>> and introducing hooks wherever places of unexpected access are found,
>> pv_clock should be initialized for the calling cpu if an overflow
>> condition is detected.
>>
>> Signed-off-by: Igor Mammedov <[email protected]>
>
> Igor,
>
> I disagree. This fixes the symptom, not the root cause. Additionally,
> Xen also uses pvclock_clocksource_read.
>
> How about adding a BUG_ON to detect the overflow? That way, hunting for
> the problem is not necessary.
>
OK, I'll repost a BUG_ON version.
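
The reposted patch is not part of this thread, so the following is only a rough userspace sketch of the idea Marcelo suggests: fail loudly at the point where the overflow is detected instead of silently reinitializing the clock. MODEL_BUG_ON() is a hypothetical stand-in for the kernel's BUG_ON(), and checked_tsc_delta() is an illustrative helper, not a kernel function.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Userspace stand-in for the kernel's BUG_ON(): report the failed
 * condition and abort. This macro is an assumption made for the sketch.
 */
#define MODEL_BUG_ON(cond) \
	do { \
		if (cond) { \
			fprintf(stderr, "BUG: %s at %s:%d\n", \
				#cond, __FILE__, __LINE__); \
			abort(); \
		} \
	} while (0)

/* Hypothetical helper: refuse to return a wrapped delta. */
static uint64_t checked_tsc_delta(uint64_t tsc_now, uint64_t tsc_timestamp)
{
	MODEL_BUG_ON(tsc_now < tsc_timestamp);
	return tsc_now - tsc_timestamp;
}
```

The trade-off compared with the RFC is that the caller that touched the clock before it was initialized is reported immediately, which points at the root cause, at the price of stopping execution rather than recovering.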