2008-12-16 08:32:33

by Ken Chen

Subject: [patch] x86: convert rdtscll() to use __native_read_tsc

Is there any reason why x86 rdtscll() has to use the out-of-line
function instead of the inline __native_read_tsc()? native_read_tsc()
and __native_read_tsc() are essentially the same function.

This patch makes x86 rdtscll() use the inline version of read_tsc.

Signed-off-by: Ken Chen <[email protected]>


diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index c2a812e..42f639b 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -181,10 +181,10 @@ static inline int rdmsrl_amd_safe
}

#define rdtscl(low) \
- ((low) = (u32)native_read_tsc())
+ ((low) = (u32)__native_read_tsc())

#define rdtscll(val) \
- ((val) = native_read_tsc())
+ ((val) = __native_read_tsc())

#define rdpmc(counter, low, high) \
do { \


2008-12-16 09:15:47

by Ingo Molnar

Subject: Re: [patch] x86: convert rdtscll() to use __native_read_tsc


* Ken Chen <[email protected]> wrote:

> Is there any reason why x86 rdtscll() has to use the out-of-line function
> instead of the inline __native_read_tsc()? native_read_tsc() and
> __native_read_tsc() are essentially the same function.

Your patch is correct.

The reason for the __native_read_tsc() / native_read_tsc() distinction is
an obscure problem with paravirt function pointers. Such constructs:

./xen/enlighten.c: .read_tsc = native_read_tsc,

do not always work fine with all versions of gcc, if native_read_tsc() is
a simple static inline (as it should be) - the build would fail with
certain gcc flags. (and i remember runtime problems too) The C semantics
of taking the address of an inline function seem pretty clear: the inlined
function should be instantiated in that .o and a pointer should be
generated out of that local instantiation.
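
A minimal sketch of the construct in question, assuming nothing about the
real kernel sources: demo_read_counter(), struct demo_ops and
demo_read_via_ops() are hypothetical names, and the counter returns a fixed
value so the example builds on any architecture. It only illustrates that
storing the address of a static inline function in an ops table should give
a callable local copy:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a TSC read; returns a constant so the
 * example does not depend on x86. */
static inline uint64_t demo_read_counter(void)
{
	return 42;
}

/* Hypothetical ops table, loosely shaped like a paravirt-style struct. */
struct demo_ops {
	uint64_t (*read_counter)(void);
};

/* Taking the address of the static inline forces the compiler to emit
 * an out-of-line copy of it in this translation unit. */
static const struct demo_ops demo_ops_table = {
	.read_counter = demo_read_counter,
};

/* Indirect call through the table; direct callers of
 * demo_read_counter() can still be inlined. */
static uint64_t demo_read_via_ops(void)
{
	assert(demo_ops_table.read_counter != NULL);
	return demo_ops_table.read_counter();
}
```

Direct calls and calls through the table should return the same value; only
the indirect path needs the out-of-line instance.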

Perhaps the real fix is to do this rename as well:

native_read_tsc => native_read_tsc_paravirt
__native_read_tsc => native_read_tsc

as this makes the native_read_tsc_paravirt() a pure technical variant, to
be used in paravirt_ops function pointer assignments. People would thus
just use the obvious native_read_tsc() inline function most of the time
and could forget about native_read_tsc_paravirt().

Jeremy?

Ingo

2008-12-16 09:26:54

by Jeremy Fitzhardinge

Subject: Re: [patch] x86: convert rdtscll() to use __native_read_tsc

Ingo Molnar wrote:
> The reason for the __native_read_tsc() / native_read_tsc() distinction is
> an obscure problem with paravirt function pointers. Such constructs:
>
> ./xen/enlighten.c: .read_tsc = native_read_tsc,
>
> do not always work fine with all versions of gcc, if native_read_tsc() is
> a simple static inline (as it should be) - the build would fail with
> certain gcc flags.

I don't think that's true. We rely on taking function pointers of
static inlines pretty extensively; native_read_tsc is hardly unique in
this respect. I don't remember seeing any problems of the sort you
describe. (I can well believe this may have been a problem at some
point, but not during the pv-ops development timeframe.)

> Perhaps the real fix is to do this rename as well:
>
> native_read_tsc => native_read_tsc_paravirt
> __native_read_tsc => native_read_tsc
>
> as this makes the native_read_tsc_paravirt() a pure technical variant, to
> be used in paravirt_ops function pointer assignments. People would thus
> just use the obvious native_read_tsc() inline function most of the time
> and could forget about native_read_tsc_paravirt().
>
> Jeremy?
>

I'm trying to remember the real reason for
__native_read_tsc/native_read_tsc. At least part of it is that
__native_read_tsc is used in a vdso, and so *must* be inlined to avoid a
bogus call from user to kernel space. But I don't know why you wouldn't
want to inline native_read_tsc everywhere. I have a feeling it may be a
relic from unification - possibly because x86-64 was late to the
clocksource party - but I don't remember anything specific.

I think we can probably make do with a single native_read_tsc, so long
as it's always inlined.

J

2008-12-16 10:16:18

by Ingo Molnar

Subject: Re: [patch] x86: convert rdtscll() to use __native_read_tsc


* Jeremy Fitzhardinge <[email protected]> wrote:

> Ingo Molnar wrote:
>> The reason for the __native_read_tsc() / native_read_tsc() distinction
>> is an obscure problem with paravirt function pointers. Such
>> constructs:
>>
>> ./xen/enlighten.c: .read_tsc = native_read_tsc,
>>
>> do not always work fine with all versions of gcc, if native_read_tsc()
>> is a simple static inline (as it should be) - the build would fail with
>> certain gcc flags.
>
> I don't think that's true. We rely on taking function pointers of
> static inlines pretty extensively; native_read_tsc is hardly unique in
> this respect. I don't remember seeing any problems of the sort you
> describe. (I can well believe this may have been a problem at some
> point, but not during the pv-ops development timeframe.)

i do remember build and boot failures there - with weird combos of gcc
options. It's a clear GCC bug. Anyway, we can clean this up and we'll see
how relevant the failure modes are.

>> Perhaps the real fix is to do this rename as well:
>>
>> native_read_tsc => native_read_tsc_paravirt
>> __native_read_tsc => native_read_tsc
>>
>> as this makes the native_read_tsc_paravirt() a pure technical variant,
>> to be used in paravirt_ops function pointer assignments. People would
>> thus just use the obvious native_read_tsc() inline function most of the
>> time and could forget about native_read_tsc_paravirt().
>>
>> Jeremy?
>
> I'm trying to remember the real reason for
> __native_read_tsc/native_read_tsc. At least part of it is that
> __native_read_tsc is used in a vdso, and so *must* be inlined to avoid a
> bogus call from user to kernel space. But I don't know why you wouldn't
> want to inline native_read_tsc everywhere. I have a feeling it may be a
> relic from unification - possibly because x86-64 was late to the
> clocksource party - but I don't remember anything specific.
>
> I think we can probably make do with a single native_read_tsc, so long
> as it's always inlined.

agreed mostly, with this twist: vdso inlining dependencies should be
expressed explicitly, via:

native_vread_tsc()

but we can also make native_read_tsc() __always_inline [it's a single
instruction with basically no preparatory halo around that instruction]
and document the vdso detail there.
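
A rough sketch of what a forced-inline reader could look like, under stated
assumptions: demo_combine_tsc(), demo_read_tsc_inline(), demo_rdtscll() and
demo_rdtscl() are hypothetical names, not the kernel's definitions, and the
non-x86 branch is a fake counter added only so the example compiles
elsewhere. The attribute spelling is the GCC one that the kernel's
__always_inline is assumed to wrap:

```c
#include <stdint.h>

/* RDTSC returns the counter split across EDX (high) and EAX (low);
 * this helper joins the two halves into one 64-bit value. */
static inline uint64_t demo_combine_tsc(uint32_t low, uint32_t high)
{
	return ((uint64_t)high << 32) | low;
}

/* Hypothetical always-inline reader: no out-of-line call is emitted at
 * the call site, which is what vdso code would need. */
static inline __attribute__((always_inline)) uint64_t demo_read_tsc_inline(void)
{
#if defined(__i386__) || defined(__x86_64__)
	uint32_t low, high;

	__asm__ __volatile__("rdtsc" : "=a" (low), "=d" (high));
	return demo_combine_tsc(low, high);
#else
	/* Assumption: fallback counter for non-x86 builds of this sketch. */
	static uint64_t fake_counter;

	return ++fake_counter;
#endif
}

/* Macro wrappers in the style of rdtscll()/rdtscl() from the patch. */
#define demo_rdtscll(val)	((val) = demo_read_tsc_inline())
#define demo_rdtscl(low)	((low) = (uint32_t)demo_read_tsc_inline())
```

With a single forced-inline function, the macros and any vdso user could
share one definition, and the vdso requirement could be documented in a
comment next to it.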

Ingo

2008-12-17 08:07:30

by Ken Chen

Subject: Re: [patch] x86: convert rdtscll() to use __native_read_tsc

On Tue, Dec 16, 2008 at 2:15 AM, Ingo Molnar <[email protected]> wrote:
> agreed mostly, with this twist: vdso inlining dependencies should be
> expressed explicitly, via:
>
> native_vread_tsc()
>
> but we can also make native_read_tsc() __always_inline [it's a single
> instruction with basically no preparatory halo around that instruction]
> and document the vdso detail there.

Given that the vdso already uses the inline version, __native_read_tsc(),
we don't really have to churn all that code around. I also take your
analysis to mean that there is no issue in converting rdtscll() to
__native_read_tsc(). Would it be possible for you to merge the patch
then?

- Ken

2008-12-18 13:31:57

by Ingo Molnar

Subject: Re: [patch] x86: convert rdtscll() to use __native_read_tsc


* Ken Chen <[email protected]> wrote:

> On Tue, Dec 16, 2008 at 2:15 AM, Ingo Molnar <[email protected]> wrote:
> > agreed mostly, with this twist: vdso inlining dependencies should be
> > expressed explicitly, via:
> >
> > native_vread_tsc()
> >
> > but we can also make native_read_tsc() __always_inline [it's a single
> > instruction with basically no preparatory halo around that instruction]
> > and document the vdso detail there.
>
> Given that the vdso already uses the inline version, __native_read_tsc(),
> we don't really have to churn all that code around. I also take your
> analysis to mean that there is no issue in converting rdtscll() to
> __native_read_tsc(). Would it be possible for you to merge the patch then?

yep, i already did that when i replied to your mail, it's in tip/x86/time.

Ingo