From: Wu Zhangjin <[email protected]>
(This v2 revision adds the missing CONFIG_HR_SCHED_CLOCK around
sched_clock().)
This patch adds a cnt32_to_63() and MIPS c0 count based sched_clock(),
which can provide high resolution. and also, two new kernel options are
added. the HR_SCHED_CLOCK is used to enable/disable this sched_clock(),
and the HT_SCHED_CLOCK_UPDATE is used to allow whether update the
sched_clock() automatically.
Without it, the Ftrace for MIPS will give useless timestamp information.
Because cnt32_to_63() needs to be called at least once per half period
to work properly, Differ from the old version, this v1 revision set up a
kernel timer to ensure the requirement of some MIPSs which have short c0
count period.
Signed-off-by: Wu Zhangjin <[email protected]>
---
arch/mips/Kconfig | 30 +++++++++++++++++++++
arch/mips/kernel/csrc-r4k.c | 60 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 90 insertions(+), 0 deletions(-)
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index b342197..6264f97 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -1952,6 +1952,36 @@ config NR_CPUS
source "kernel/time/Kconfig"
#
+# High Resolution sched_clock() Configuration
+#
+
+config HR_SCHED_CLOCK
+ bool "High Resolution sched_clock()"
+ depends on CSRC_R4K
+ default n
+ help
+ This option enables the MIPS c0 count based high resolution
+ sched_clock().
+
+ If you need a ns precision timestamp, You are recommended to enable
+ this option. For example, If you are using the Ftrace subsystem to do
+ real time tracing, this option is needed.
+
+ If unsure, disable it.
+
+config HR_SCHED_CLOCK_UPDATE
+ bool "Update sched_clock() automatically"
+ depends on HR_SCHED_CLOCK
+ default y
+ help
+ Because Some of the MIPS c0 count period is quite short and because
+ cnt32_to_63() needs to be called at least once per half period to
+ work properly, a kernel timer is needed to set up to ensure this
+ requirement is always met.
+
+ If unusre, enable it.
+
+#
# Timer Interrupt Frequency Configuration
#
diff --git a/arch/mips/kernel/csrc-r4k.c b/arch/mips/kernel/csrc-r4k.c
index e95a3cd..4e52cca 100644
--- a/arch/mips/kernel/csrc-r4k.c
+++ b/arch/mips/kernel/csrc-r4k.c
@@ -6,10 +6,64 @@
* Copyright (C) 2007 by Ralf Baechle
*/
#include <linux/clocksource.h>
+#include <linux/cnt32_to_63.h>
+#include <linux/timer.h>
#include <linux/init.h>
#include <asm/time.h>
+/*
+ * MIPS' sched_clock implementation.
+ *
+ * because cnt32_to_63() needs to be called at least once per half period to
+ * work properly, and some of the MIPS' frequency is very low, perhaps a kernel
+ * timer is needed to be set up to ensure this requirement is always met.
+ * please refer to arch/arm/plat-orion/time.c and include/linux/cnt32_to_63.h
+ */
+static unsigned long __maybe_unused tclk2ns_scale;
+static unsigned long __maybe_unused tclk2ns_scale_factor;
+
+#ifdef CONFIG_HR_SCHED_CLOCK
+unsigned long long notrace sched_clock(void)
+{
+ unsigned long long v = cnt32_to_63(read_c0_count());
+ return (v * tclk2ns_scale) >> tclk2ns_scale_factor;
+}
+#endif
+
+static void __init __maybe_unused setup_sched_clock(struct clocksource *cs)
+{
+ unsigned long long v;
+
+ v = cs->mult;
+ /*
+ * We want an even value to automatically clear the top bit
+ * returned by cnt32_to_63() without an additional run time
+ * instruction. So if the LSB is 1 then round it up.
+ */
+ if (v & 1)
+ v++;
+ tclk2ns_scale = v;
+ tclk2ns_scale_factor = cs->shift;
+}
+
+static struct timer_list __maybe_unused cnt32_to_63_keepwarm_timer;
+
+static void __maybe_unused cnt32_to_63_keepwarm(unsigned long data)
+{
+ mod_timer(&cnt32_to_63_keepwarm_timer, round_jiffies(jiffies + data));
+ (void) sched_clock();
+}
+
+static void __maybe_unused setup_sched_clock_update(unsigned long tclk)
+{
+ unsigned long data;
+
+ data = (0xffffffffUL / tclk / 2 - 2) * HZ;
+ setup_timer(&cnt32_to_63_keepwarm_timer, cnt32_to_63_keepwarm, data);
+ mod_timer(&cnt32_to_63_keepwarm_timer, round_jiffies(jiffies + data));
+}
+
static cycle_t c0_hpt_read(struct clocksource *cs)
{
return read_c0_count();
@@ -32,7 +86,13 @@ int __init init_r4k_clocksource(void)
clocksource_set_clock(&clocksource_mips, mips_hpt_frequency);
+#ifdef CONFIG_HR_SCHED_CLOCK
+ setup_sched_clock(&clocksource_mips);
+#endif
clocksource_register(&clocksource_mips);
+#ifdef CONFIG_HR_SCHED_CLOCK_UPDATE
+ setup_sched_clock_update(mips_hpt_frequency);
+#endif
return 0;
}
--
1.6.2.1
* Wu Zhangjin <[email protected]> wrote:
> +config HR_SCHED_CLOCK
> + bool "High Resolution sched_clock()"
> + depends on CSRC_R4K
> + default n
> + help
> + This option enables the MIPS c0 count based high resolution
> + sched_clock().
> +
> + If you need a ns precision timestamp, You are recommended to enable
> + this option. For example, If you are using the Ftrace subsystem to do
> + real time tracing, this option is needed.
s/You/you
> +
> + If unsure, disable it.
> +
> +config HR_SCHED_CLOCK_UPDATE
> + bool "Update sched_clock() automatically"
> + depends on HR_SCHED_CLOCK
> + default y
> + help
> + Because Some of the MIPS c0 count period is quite short and because
> + cnt32_to_63() needs to be called at least once per half period to
> + work properly, a kernel timer is needed to set up to ensure this
> + requirement is always met.
s/Some/some
Why is this a config option? I suspect that hardware that _needs_ this
keep-warm periodic tick we should enable it unconditionally and
automatically - otherwise the user can misconfigure the kernel with a
bad clock.
> +
> + If unusre, enable it.
s/unusre/unsure
> +
> +#
> # Timer Interrupt Frequency Configuration
> #
>
> diff --git a/arch/mips/kernel/csrc-r4k.c b/arch/mips/kernel/csrc-r4k.c
> index e95a3cd..4e52cca 100644
> --- a/arch/mips/kernel/csrc-r4k.c
> +++ b/arch/mips/kernel/csrc-r4k.c
> @@ -6,10 +6,64 @@
> * Copyright (C) 2007 by Ralf Baechle
> */
> #include <linux/clocksource.h>
> +#include <linux/cnt32_to_63.h>
> +#include <linux/timer.h>
> #include <linux/init.h>
>
> #include <asm/time.h>
>
> +/*
> + * MIPS' sched_clock implementation.
s/MIPS'/MIPS's
or
s/MIPS'/MIPS
> + *
> + * because cnt32_to_63() needs to be called at least once per half period to
> + * work properly, and some of the MIPS' frequency is very low, perhaps a kernel
> + * timer is needed to be set up to ensure this requirement is always met.
> + * please refer to arch/arm/plat-orion/time.c and include/linux/cnt32_to_63.h
s/please/Please
> + */
> +static unsigned long __maybe_unused tclk2ns_scale;
> +static unsigned long __maybe_unused tclk2ns_scale_factor;
__read_mostly?
> +
> +#ifdef CONFIG_HR_SCHED_CLOCK
> +unsigned long long notrace sched_clock(void)
> +{
> + unsigned long long v = cnt32_to_63(read_c0_count());
> + return (v * tclk2ns_scale) >> tclk2ns_scale_factor;
> +}
> +#endif
> +
> +static void __init __maybe_unused setup_sched_clock(struct clocksource *cs)
> +{
> + unsigned long long v;
> +
> + v = cs->mult;
> + /*
> + * We want an even value to automatically clear the top bit
> + * returned by cnt32_to_63() without an additional run time
> + * instruction. So if the LSB is 1 then round it up.
> + */
> + if (v & 1)
> + v++;
>
> + tclk2ns_scale = v;
> + tclk2ns_scale_factor = cs->shift;
> +}
> +
> +static struct timer_list __maybe_unused cnt32_to_63_keepwarm_timer;
> +
> +static void __maybe_unused cnt32_to_63_keepwarm(unsigned long data)
> +{
> + mod_timer(&cnt32_to_63_keepwarm_timer, round_jiffies(jiffies + data));
> + (void) sched_clock();
>
> +}
> +
> +static void __maybe_unused setup_sched_clock_update(unsigned long tclk)
> +{
> + unsigned long data;
> +
> + data = (0xffffffffUL / tclk / 2 - 2) * HZ;
> + setup_timer(&cnt32_to_63_keepwarm_timer, cnt32_to_63_keepwarm, data);
> + mod_timer(&cnt32_to_63_keepwarm_timer, round_jiffies(jiffies + data));
> +}
> +
> static cycle_t c0_hpt_read(struct clocksource *cs)
> {
> return read_c0_count();
> @@ -32,7 +86,13 @@ int __init init_r4k_clocksource(void)
>
> clocksource_set_clock(&clocksource_mips, mips_hpt_frequency);
>
> +#ifdef CONFIG_HR_SCHED_CLOCK
> + setup_sched_clock(&clocksource_mips);
> +#endif
> clocksource_register(&clocksource_mips);
>
> +#ifdef CONFIG_HR_SCHED_CLOCK_UPDATE
> + setup_sched_clock_update(mips_hpt_frequency);
> +#endif
> return 0;
You could make the functions inline and move the #ifdef into those
functions.
That wold also remove those __maybe_unused tags.
Thanks,
Ingo
[...]
> > +
> > + If unsure, disable it.
> > +
> > +config HR_SCHED_CLOCK_UPDATE
> > + bool "Update sched_clock() automatically"
> > + depends on HR_SCHED_CLOCK
> > + default y
> > + help
> > + Because Some of the MIPS c0 count period is quite short and because
> > + cnt32_to_63() needs to be called at least once per half period to
> > + work properly, a kernel timer is needed to set up to ensure this
> > + requirement is always met.
>
> s/Some/some
>
Will apply all of the above feedbacks and the later relative ones,
thanks!
> Why is this a config option? I suspect that hardware that _needs_ this
> keep-warm periodic tick we should enable it unconditionally and
> automatically - otherwise the user can misconfigure the kernel with a
> bad clock.
>
It will be really hard to let the users make decision, so I tell them to
enable it if not sure. perhaps it's better to remove this option.
but one issue I have found here, if enabled this keep-warm by default, I
can not always get the result of function graph tracer, not sure what
the exact reason is. and even if get the result, lots of them are the
results about timer(that mod_timer() and sched_clock() is called
frequently?).
and as we can see from arch/arm/mach-pxa/time.c, there is also no
keep_warm there, which means that if the frequency of that MIPS is very
low, it will need enough seconds to make that 32bit long c0 count
overrall. so, they will not need this keep-warm. perhaps we can use that
mips_hpt_frequency variable to determine setup that keep-warm or not.
but it's hard to design the number of frequency.
[...]
> > + */
> > +static unsigned long __maybe_unused tclk2ns_scale;
> > +static unsigned long __maybe_unused tclk2ns_scale_factor;
>
>
> __read_mostly?
>
Yes.
[...]
> > +static void __maybe_unused setup_sched_clock_update(unsigned long tclk)
> > +{
> > + unsigned long data;
> > +
> > + data = (0xffffffffUL / tclk / 2 - 2) * HZ;
Because the MIPS c0 count's frequency is half of the cpu frequency(Hi,
Ralf, does every MIPS c0 count meet this feature?), so, the above line
should be:
data = (0xffffffffUL / tclk - 2) * HZ;
> > + setup_timer(&cnt32_to_63_keepwarm_timer, cnt32_to_63_keepwarm, data);
> > + mod_timer(&cnt32_to_63_keepwarm_timer, round_jiffies(jiffies + data));
> > +}
[...]
> >
> > +#ifdef CONFIG_HR_SCHED_CLOCK
> > + setup_sched_clock(&clocksource_mips);
> > +#endif
> > clocksource_register(&clocksource_mips);
> >
> > +#ifdef CONFIG_HR_SCHED_CLOCK_UPDATE
> > + setup_sched_clock_update(mips_hpt_frequency);
> > +#endif
> > return 0;
>
> You could make the functions inline and move the #ifdef into those
> functions.
>
> That wold also remove those __maybe_unused tags.
>
will apply it, thanks ;)
Regards,
Wu Zhangjin
Hello.
Ingo Molnar wrote:
>> +config HR_SCHED_CLOCK
>> + bool "High Resolution sched_clock()"
>> + depends on CSRC_R4K
>> + default n
>> + help
>> + This option enables the MIPS c0 count based high resolution
>> + sched_clock().
>> +
>> + If you need a ns precision timestamp, You are recommended to enable
>> + this option. For example, If you are using the Ftrace subsystem to do
>>
s/If/if/
>> diff --git a/arch/mips/kernel/csrc-r4k.c b/arch/mips/kernel/csrc-r4k.c
>> index e95a3cd..4e52cca 100644
>> --- a/arch/mips/kernel/csrc-r4k.c
>> +++ b/arch/mips/kernel/csrc-r4k.c
>> @@ -6,10 +6,64 @@
>> * Copyright (C) 2007 by Ralf Baechle
>> */
>> #include <linux/clocksource.h>
>> +#include <linux/cnt32_to_63.h>
>> +#include <linux/timer.h>
>> #include <linux/init.h>
>>
>> #include <asm/time.h>
>>
>> +/*
>> + * MIPS' sched_clock implementation.
>>
>
> s/MIPS'/MIPS's
>
MIPS's is not really a proper English. :-)
WBR, Sergei
Hello.
Wu Zhangjin wrote:
>>> +static void __maybe_unused setup_sched_clock_update(unsigned long tclk)
>>> +{
>>> + unsigned long data;
>>> +
>>> + data = (0xffffffffUL / tclk / 2 - 2) * HZ;
>>>
>
> Because the MIPS c0 count's frequency is half of the cpu frequency(Hi, Ralf, does every MIPS c0 count meet this feature?),
No, e.g. Alchemy's C0 Count ticks at full CPU speed.
WBR, Sergei
On Sun, Nov 22, 2009 at 07:08:05PM +0800, Wu Zhangjin wrote:
> > > + data = (0xffffffffUL / tclk / 2 - 2) * HZ;
>
> Because the MIPS c0 count's frequency is half of the cpu frequency(Hi,
> Ralf, does every MIPS c0 count meet this feature?), so, the above line
> should be:
There are processors which have no cp0 counter at all; these are mostly
very old pre-R4000 era 32-bit MIPS I and MIPS II cores.
Of those which have a cp0 counter most will clock it at "half the maximum
instruction issue rate" and a few at the full rate. Finally for a few
such as the RM52xx either half or the full count the rate is selectable by
the reset initialization bitstream fed into the processor. Too make this
feature suck nicely there is no way for software to find out which rate
was selected so software must know that or calibrate against a timer of
known frequency.
Platform-specific code does this by setting mips_hpt_frequency to the
count rate before calling init_r4k_clocksource; it's also the value being
passed into setup_sched_clock_update() so you don't need to count for the
half / full clock rate thing there.
I don't see why you need the -2 in your formula so the whole thing can
be simplified to:
data = 0x80000000 / tclk * HZ;
Ralf
On Sun, 2009-11-22 at 12:35 +0000, Ralf Baechle wrote:
> On Sun, Nov 22, 2009 at 07:08:05PM +0800, Wu Zhangjin wrote:
>
> > > > + data = (0xffffffffUL / tclk / 2 - 2) * HZ;
> >
> > Because the MIPS c0 count's frequency is half of the cpu frequency(Hi,
> > Ralf, does every MIPS c0 count meet this feature?), so, the above line
> > should be:
>
> There are processors which have no cp0 counter at all; these are mostly
> very old pre-R4000 era 32-bit MIPS I and MIPS II cores.
>
> Of those which have a cp0 counter most will clock it at "half the maximum
> instruction issue rate" and a few at the full rate. Finally for a few
> such as the RM52xx either half or the full count the rate is selectable by
> the reset initialization bitstream fed into the processor. Too make this
> feature suck nicely there is no way for software to find out which rate
> was selected so software must know that or calibrate against a timer of
> known frequency.
>
> Platform-specific code does this by setting mips_hpt_frequency to the
> count rate before calling init_r4k_clocksource; it's also the value being
> passed into setup_sched_clock_update() so you don't need to count for the
> half / full clock rate thing there.
>
> I don't see why you need the -2 in your formula so the whole thing can
> be simplified to:
>
> data = 0x80000000 / tclk * HZ;
>
Sorry, I have mixed the mips_hpt_frequency with the cpu frequency,
mips_hpt_frequency is exactly the frequency of the timer. so, there is
no need to consider the relation between it and the cpu frequency here.
therefore, my old formula should be okay, that -2 is used to ensure data
is smaller than half of the period of the timer.
Best Regards,
Wu Zhangjin
* Sergei Shtylyov <[email protected]> wrote:
> Hello.
>
> Ingo Molnar wrote:
>
> >>+config HR_SCHED_CLOCK
> >>+ bool "High Resolution sched_clock()"
> >>+ depends on CSRC_R4K
> >>+ default n
> >>+ help
> >>+ This option enables the MIPS c0 count based high resolution
> >>+ sched_clock().
> >>+
> >>+ If you need a ns precision timestamp, You are recommended to enable
> >>+ this option. For example, If you are using the Ftrace subsystem to do
>
> s/If/if/
>
> >>diff --git a/arch/mips/kernel/csrc-r4k.c b/arch/mips/kernel/csrc-r4k.c
> >>index e95a3cd..4e52cca 100644
> >>--- a/arch/mips/kernel/csrc-r4k.c
> >>+++ b/arch/mips/kernel/csrc-r4k.c
> >>@@ -6,10 +6,64 @@
> >> * Copyright (C) 2007 by Ralf Baechle
> >> */
> >> #include <linux/clocksource.h>
> >>+#include <linux/cnt32_to_63.h>
> >>+#include <linux/timer.h>
> >> #include <linux/init.h>
> >> #include <asm/time.h>
> >>+/*
> >>+ * MIPS' sched_clock implementation.
> >
> >s/MIPS'/MIPS's
>
> MIPS's is not really a proper English. :-)
AFAIK 'MIPS' is not the plural of 'MIP' (but an acronym ending with
'S'), hence the possessive form would be MIPS's.
It doesnt matter much i guess ;-)
Ingo
On Sun, Nov 22, 2009 at 07:06:16PM +0100, Ingo Molnar wrote:
> > MIPS's is not really a proper English. :-)
>
> AFAIK 'MIPS' is not the plural of 'MIP' (but an acronym ending with
> 'S'), hence the possessive form would be MIPS's.
MIPS Technologies' IP lawyers insist that MIPS is a proper name and not an
acronym - this position has certain advantages in trademark law.
Historically MIPS stood for Microprocessor without Interlocked Pipeline
Stages but technically may only have been true for the original Stanford
RISC processor but certainly not for the first commercial MIPS processor,
the R2000, which was released in 1985 which at least had some interlocks.
> It doesnt matter much i guess ;-)
Not to us because we're not trademark lawyers ;-)
Ralf
* Ralf Baechle <[email protected]> wrote:
> On Sun, Nov 22, 2009 at 07:06:16PM +0100, Ingo Molnar wrote:
>
> > > MIPS's is not really a proper English. :-)
> >
> > AFAIK 'MIPS' is not the plural of 'MIP' (but an acronym ending with
> > 'S'), hence the possessive form would be MIPS's.
>
> MIPS Technologies' IP lawyers insist that MIPS is a proper name and
> not an acronym - this position has certain advantages in trademark
> law.
That too seems to support my point that "MIPS's" is the right spelling.
Ingo
On Mon, 23 Nov 2009, Ingo Molnar wrote:
> > > > MIPS's is not really a proper English. :-)
> > >
> > > AFAIK 'MIPS' is not the plural of 'MIP' (but an acronym ending with
> > > 'S'), hence the possessive form would be MIPS's.
> >
> > MIPS Technologies' IP lawyers insist that MIPS is a proper name and
> > not an acronym - this position has certain advantages in trademark
> > law.
>
> That too seems to support my point that "MIPS's" is the right spelling.
My understanding is that "MIPS" is actually meant (as far as IP law is
concerned) to be an adjective, so you can't really make a possessive form
out of it as it is irrelevant. The correct form IMHO is thus simply:
"MIPS sched_clock implementation" which happens to sound the best to me
too.
Maciej