2010-07-14 00:56:34

by john stultz

Subject: [PATCH 00/11] -tip Timekeeping changes for 2.6.36

Hey Thomas,

I just wanted to send you my pending queue of timekeeping
changes for 2.6.36. It would be nice to get these into the -tip
tree for testing prior to the merge window.

thanks
-john

CC: Thomas Gleixner <[email protected]>

John Stultz (11):
x86: Fix vtime/file timestamp inconsistencies
Implement timespec_add
time: Kill off CONFIG_GENERIC_TIME
powerpc: Simplify update_vsyscall
powerpc: Cleanup xtime usage
Fix update_vsyscall to provide wall_to_monotonic offset
Convert um to use read_persistent_clock
Cleanup hrtimer.c's direct access to wall_to_monotonic
Make xtime and wall_to_monotonic static
Convert common x86 clocksources to use clocksource_register_hz/khz
Add __clocksource_updatefreq_hz/khz methods

Documentation/feature-removal-schedule.txt | 10 ----
Documentation/kernel-parameters.txt | 3 +-
arch/alpha/Kconfig | 4 --
arch/arm/Kconfig | 4 --
arch/avr32/Kconfig | 3 -
arch/blackfin/Kconfig | 3 -
arch/cris/Kconfig | 3 -
arch/frv/Kconfig | 4 --
arch/h8300/Kconfig | 4 --
arch/ia64/Kconfig | 4 --
arch/ia64/kernel/time.c | 7 ++-
arch/m32r/Kconfig | 3 -
arch/m68k/Kconfig | 3 -
arch/m68knommu/Kconfig | 4 --
arch/microblaze/Kconfig | 3 -
arch/mips/Kconfig | 4 --
arch/mn10300/Kconfig | 3 -
arch/parisc/Kconfig | 4 --
arch/powerpc/Kconfig | 3 -
arch/powerpc/kernel/time.c | 61 ++++++++++------------
arch/s390/Kconfig | 3 -
arch/s390/kernel/time.c | 8 ++--
arch/score/Kconfig | 3 -
arch/sh/Kconfig | 3 -
arch/sparc/Kconfig | 3 -
arch/um/Kconfig.common | 4 --
arch/um/kernel/time.c | 13 +++--
arch/x86/Kconfig | 5 +--
arch/x86/kernel/hpet.c | 13 +++--
arch/x86/kernel/tsc.c | 6 +--
arch/x86/kernel/vsyscall_64.c | 17 ++++--
arch/xtensa/Kconfig | 3 -
drivers/Makefile | 4 +-
drivers/acpi/acpi_pad.c | 2 +-
drivers/acpi/processor_idle.c | 2 +-
drivers/clocksource/acpi_pm.c | 9 +---
drivers/misc/Kconfig | 4 +-
include/linux/clocksource.h | 17 +++++-
include/linux/time.h | 21 ++++++-
kernel/hrtimer.c | 9 ++--
kernel/time.c | 16 ------
kernel/time/Kconfig | 4 +-
kernel/time/clocksource.c | 33 +++++++++---
kernel/time/timekeeping.c | 79 +++++++---------------------
kernel/trace/Kconfig | 4 +-
45 files changed, 163 insertions(+), 259 deletions(-)


2010-07-14 00:56:37

by john stultz

Subject: [PATCH 09/11] Make xtime and wall_to_monotonic static

This patch makes xtime and wall_to_monotonic static, as planned in
Documentation/feature-removal-schedule.txt. This will allow for
further cleanups to the timekeeping core.

Signed-off-by: John Stultz <[email protected]>
CC: Thomas Gleixner <[email protected]>

---
Documentation/feature-removal-schedule.txt | 10 ----------
include/linux/time.h | 2 --
kernel/time/timekeeping.c | 4 ++--
3 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index c268783..0d91c6b 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -549,16 +549,6 @@ Who: Avi Kivity <[email protected]>

----------------------------

-What: xtime, wall_to_monotonic
-When: 2.6.36+
-Files: kernel/time/timekeeping.c include/linux/time.h
-Why: Cleaning up timekeeping internal values. Please use
- existing timekeeping accessor functions to access
- the equivalent functionality.
-Who: John Stultz <[email protected]>
-
-----------------------------
-
What: KVM kernel-allocated memory slots
When: July 2010
Why: Since 2.6.25, kvm supports user-allocated memory slots, which are
diff --git a/include/linux/time.h b/include/linux/time.h
index 3a9c0bf..50b3cb0 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -113,8 +113,6 @@ static inline struct timespec timespec_sub(struct timespec lhs,
#define timespec_valid(ts) \
(((ts)->tv_sec >= 0) && (((unsigned long) (ts)->tv_nsec) < NSEC_PER_SEC))

-extern struct timespec xtime;
-extern struct timespec wall_to_monotonic;
extern seqlock_t xtime_lock;

extern void read_persistent_clock(struct timespec *ts);
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index fb61c2e..e14c839 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -153,8 +153,8 @@ __cacheline_aligned_in_smp DEFINE_SEQLOCK(xtime_lock);
* - wall_to_monotonic is no longer the boot time, getboottime must be
* used instead.
*/
-struct timespec xtime __attribute__ ((aligned (16)));
-struct timespec wall_to_monotonic __attribute__ ((aligned (16)));
+static struct timespec xtime __attribute__ ((aligned (16)));
+static struct timespec wall_to_monotonic __attribute__ ((aligned (16)));
static struct timespec total_sleep_time;

/*
--
1.6.0.4

2010-07-14 00:56:36

by john stultz

Subject: [PATCH 04/11] powerpc: Simplify update_vsyscall

Currently powerpc's update_vsyscall calls an inline update_gtod.
However, both are straightforward, and there are no other users,
so this patch merges update_gtod into update_vsyscall.

Compiles, but otherwise untested.

Cc: Anton Blanchard <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: John Stultz <[email protected]>
---
arch/powerpc/kernel/time.c | 55 ++++++++++++++++++++------------------------
1 files changed, 25 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 0441bbd..6fcd648 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -423,30 +423,6 @@ void udelay(unsigned long usecs)
}
EXPORT_SYMBOL(udelay);

-static inline void update_gtod(u64 new_tb_stamp, u64 new_stamp_xsec,
- u64 new_tb_to_xs)
-{
- /*
- * tb_update_count is used to allow the userspace gettimeofday code
- * to assure itself that it sees a consistent view of the tb_to_xs and
- * stamp_xsec variables. It reads the tb_update_count, then reads
- * tb_to_xs and stamp_xsec and then reads tb_update_count again. If
- * the two values of tb_update_count match and are even then the
- * tb_to_xs and stamp_xsec values are consistent. If not, then it
- * loops back and reads them again until this criteria is met.
- * We expect the caller to have done the first increment of
- * vdso_data->tb_update_count already.
- */
- vdso_data->tb_orig_stamp = new_tb_stamp;
- vdso_data->stamp_xsec = new_stamp_xsec;
- vdso_data->tb_to_xs = new_tb_to_xs;
- vdso_data->wtom_clock_sec = wall_to_monotonic.tv_sec;
- vdso_data->wtom_clock_nsec = wall_to_monotonic.tv_nsec;
- vdso_data->stamp_xtime = xtime;
- smp_wmb();
- ++(vdso_data->tb_update_count);
-}
-
#ifdef CONFIG_SMP
unsigned long profile_pc(struct pt_regs *regs)
{
@@ -876,7 +852,7 @@ static cycle_t timebase_read(struct clocksource *cs)
void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
u32 mult)
{
- u64 t2x, stamp_xsec;
+ u64 new_tb_to_xs, new_stamp_xsec;

if (clock != &clocksource_timebase)
return;
@@ -887,11 +863,30 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,

/* XXX this assumes clock->shift == 22 */
/* 4611686018 ~= 2^(20+64-22) / 1e9 */
- t2x = (u64) mult * 4611686018ULL;
- stamp_xsec = (u64) xtime.tv_nsec * XSEC_PER_SEC;
- do_div(stamp_xsec, 1000000000);
- stamp_xsec += (u64) xtime.tv_sec * XSEC_PER_SEC;
- update_gtod(clock->cycle_last, stamp_xsec, t2x);
+ new_tb_to_xs = (u64) mult * 4611686018ULL;
+ new_stamp_xsec = (u64) xtime.tv_nsec * XSEC_PER_SEC;
+ do_div(new_stamp_xsec, 1000000000);
+ new_stamp_xsec += (u64) xtime.tv_sec * XSEC_PER_SEC;
+
+ /*
+ * tb_update_count is used to allow the userspace gettimeofday code
+ * to assure itself that it sees a consistent view of the tb_to_xs and
+ * stamp_xsec variables. It reads the tb_update_count, then reads
+ * tb_to_xs and stamp_xsec and then reads tb_update_count again. If
+ * the two values of tb_update_count match and are even then the
+ * tb_to_xs and stamp_xsec values are consistent. If not, then it
+ * loops back and reads them again until this criteria is met.
+ * We expect the caller to have done the first increment of
+ * vdso_data->tb_update_count already.
+ */
+ vdso_data->tb_orig_stamp = clock->cycle_last;
+ vdso_data->stamp_xsec = new_stamp_xsec;
+ vdso_data->tb_to_xs = new_tb_to_xs;
+ vdso_data->wtom_clock_sec = wall_to_monotonic.tv_sec;
+ vdso_data->wtom_clock_nsec = wall_to_monotonic.tv_nsec;
+ vdso_data->stamp_xtime = xtime;
+ smp_wmb();
+ ++(vdso_data->tb_update_count);
}

void update_vsyscall_tz(void)
--
1.6.0.4
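
The fixed-point arithmetic in update_vsyscall() above can be checked in
userspace. The following is only a sketch: the sketch_* names are
hypothetical, and it assumes XSEC_PER_SEC is 2^20 (which is what the
"2^(20+64-22) / 1e9" comment in the patch suggests).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_XSEC_PER_SEC	(1ULL << 20)	/* assumed: 2^20 xsec per second */
#define SKETCH_NSEC_PER_SEC	1000000000ULL

/*
 * Scale factor multiplied by the clocksource mult to get tb_to_xs:
 * 2^(20 + 64 - shift) / 1e9. The shift must be above 20 so that the
 * power of two still fits in 64 bits.
 */
static uint64_t sketch_tb_to_xs_scale(unsigned int shift)
{
	assert(shift > 20 && shift <= 84);
	return (1ULL << (20 + 64 - shift)) / SKETCH_NSEC_PER_SEC;
}

/* Convert a (sec, nsec) wall time into xsec units, as stamp_xsec does. */
static uint64_t sketch_stamp_xsec(uint64_t sec, uint64_t nsec)
{
	assert(nsec < SKETCH_NSEC_PER_SEC);
	return sec * SKETCH_XSEC_PER_SEC +
	       nsec * SKETCH_XSEC_PER_SEC / SKETCH_NSEC_PER_SEC;
}
```

With shift == 22 the scale comes out as the 4611686018 constant used in
the patch, which is why the XXX comment about clock->shift matters.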

2010-07-14 00:56:55

by john stultz

Subject: [PATCH 07/11] Convert um to use read_persistent_clock

This patch converts the um arch to use read_persistent_clock().
This allows it to avoid accessing xtime and wall_to_monotonic
directly.

This patch is untested, so any help from testers or maintainers would
be greatly appreciated!

Signed-off-by: John Stultz <[email protected]>
CC: Jeff Dike <[email protected]>
CC: Thomas Gleixner <[email protected]>
---
arch/um/kernel/time.c | 13 +++++++------
1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/um/kernel/time.c b/arch/um/kernel/time.c
index c8b9c46..2b8b262 100644
--- a/arch/um/kernel/time.c
+++ b/arch/um/kernel/time.c
@@ -102,16 +102,17 @@ static void __init setup_itimer(void)
clockevents_register_device(&itimer_clockevent);
}

+void read_persistent_clock(struct timespec *ts)
+{
+ long long nsecs = os_nsecs();
+ set_normalized_timespec(ts, nsecs / NSEC_PER_SEC,
+ nsecs % NSEC_PER_SEC);
+}
+
void __init time_init(void)
{
long long nsecs;

timer_init();
-
- nsecs = os_nsecs();
- set_normalized_timespec(&wall_to_monotonic, -nsecs / NSEC_PER_SEC,
- -nsecs % NSEC_PER_SEC);
- set_normalized_timespec(&xtime, nsecs / NSEC_PER_SEC,
- nsecs % NSEC_PER_SEC);
late_time_init = setup_itimer;
}
--
1.6.0.4
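
To illustrate what the removed time_init() code was computing, here is a
userspace sketch. All sketch_* names are hypothetical stand-ins, not
kernel functions. It shows that the old code set xtime to the host time
and wall_to_monotonic to its normalized negation, so the two sum to
zero. With read_persistent_clock() the arch only reports the wall time,
and deriving the offset is presumably left to the timekeeping core.

```c
#include <assert.h>

#define SKETCH_NSEC_PER_SEC	1000000000LL

/* Hypothetical userspace stand-in for struct timespec. */
struct ts_sketch {
	long long tv_sec;
	long long tv_nsec;
};

/* Fold an arbitrary (sec, nsec) pair so that 0 <= tv_nsec < 1e9. */
static void sketch_normalize(struct ts_sketch *ts, long long sec,
			     long long nsec)
{
	while (nsec >= SKETCH_NSEC_PER_SEC) {
		nsec -= SKETCH_NSEC_PER_SEC;
		++sec;
	}
	while (nsec < 0) {
		nsec += SKETCH_NSEC_PER_SEC;
		--sec;
	}
	ts->tv_sec = sec;
	ts->tv_nsec = nsec;
}

/* What read_persistent_clock() reports in the patch: plain wall time. */
static struct ts_sketch sketch_persistent(long long nsecs)
{
	struct ts_sketch ts;

	assert(nsecs >= 0);
	sketch_normalize(&ts, nsecs / SKETCH_NSEC_PER_SEC,
			 nsecs % SKETCH_NSEC_PER_SEC);
	return ts;
}

/* The offset the removed code computed by hand: the negated wall time. */
static struct ts_sketch sketch_wall_to_monotonic(long long nsecs)
{
	struct ts_sketch ts;

	sketch_normalize(&ts, -nsecs / SKETCH_NSEC_PER_SEC,
			 -nsecs % SKETCH_NSEC_PER_SEC);
	return ts;
}
```

Note that the negated remainder is negative before normalization, so
1.5 seconds of wall time yields an offset of {-2, 500000000} rather
than {-1, -500000000}.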

2010-07-14 00:56:33

by john stultz

Subject: [PATCH 01/11] x86: Fix vtime/file timestamp inconsistencies

Due to vtime calling vgettimeofday(), it's possible that an application
could call time();create("stuff",O_RDWR); only to see that the file's
creation timestamp is earlier than the value returned by time().

A similar way to reproduce the issue is to compare the vsyscall time()
with the syscall time(), and observe ordering issues.

The modified test case from Oleg Nesterov below can illustrate this:

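#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
	time_t sec1, sec2;

	/* Spin until the vsyscall time() is ahead of the syscall time(). */
	do {
		sec1 = time(&sec2);
		sec2 = syscall(__NR_time, NULL);
	} while (sec1 <= sec2);

	printf("vtime: %ld.000000\n", (long)sec1);
	printf("time: %ld.000000\n", (long)sec2);
	return 0;
}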

The proper fix is to make vtime use the same time value as
current_kernel_time() (which is exported via update_vsyscall) instead of
vgettimeofday().

Thanks to Jiri Olsa for bringing up the issue and catching bugs in
earlier versions of this fix.

Signed-off-by: John Stultz <[email protected]>

CC: Jiri Olsa <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Oleg Nesterov <[email protected]>
---
arch/x86/kernel/vsyscall_64.c | 11 ++++++++---
1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
index 1c0c6ab..dce0c3c 100644
--- a/arch/x86/kernel/vsyscall_64.c
+++ b/arch/x86/kernel/vsyscall_64.c
@@ -169,13 +169,18 @@ int __vsyscall(0) vgettimeofday(struct timeval * tv, struct timezone * tz)
* unlikely */
time_t __vsyscall(1) vtime(time_t *t)
{
- struct timeval tv;
+ unsigned seq;
time_t result;
if (unlikely(!__vsyscall_gtod_data.sysctl_enabled))
return time_syscall(t);

- vgettimeofday(&tv, NULL);
- result = tv.tv_sec;
+ do {
+ seq = read_seqbegin(&__vsyscall_gtod_data.lock);
+
+ result = __vsyscall_gtod_data.wall_time_sec;
+
+ } while (read_seqretry(&__vsyscall_gtod_data.lock, seq));
+
if (t)
*t = result;
return result;
--
1.6.0.4
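
The retry loop added to vtime() above follows the usual sequence-counter
pattern. Below is a rough userspace sketch of that pattern using C11
atomics. It is not the kernel's seqlock implementation: struct
gtod_sketch and the sketch_* functions are hypothetical, and a single
writer is assumed.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the data exported by update_vsyscall(). */
struct gtod_sketch {
	atomic_uint seq;		/* even: stable, odd: update running */
	atomic_long wall_time_sec;	/* seconds published by the writer */
};

static void sketch_init(struct gtod_sketch *g)
{
	atomic_init(&g->seq, 0u);
	atomic_init(&g->wall_time_sec, 0L);
}

/* Writer: make the count odd, publish the value, make it even again. */
static void sketch_update(struct gtod_sketch *g, long sec)
{
	unsigned int seq = atomic_load_explicit(&g->seq, memory_order_relaxed);

	assert((seq & 1u) == 0);	/* single writer assumed */
	atomic_store_explicit(&g->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&g->wall_time_sec, sec, memory_order_relaxed);
	atomic_store_explicit(&g->seq, seq + 2, memory_order_release);
}

/* Reader: retry until the count is even and unchanged across the read. */
static long sketch_time(struct gtod_sketch *g)
{
	unsigned int seq;
	long result;

	do {
		do {
			seq = atomic_load_explicit(&g->seq,
						   memory_order_acquire);
		} while (seq & 1u);	/* wait out an in-progress update */

		result = atomic_load_explicit(&g->wall_time_sec,
					      memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
	} while (atomic_load_explicit(&g->seq, memory_order_relaxed) != seq);

	return result;
}
```

Because the reader returns exactly the seconds value the writer
published, it cannot run ahead of a syscall that reads the same
timekeeping state, which is the ordering the test case checks for.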

2010-07-14 00:57:22

by john stultz

Subject: [PATCH 06/11] Fix update_vsyscall to provide wall_to_monotonic offset

update_vsyscall() did not provide the wall_to_monotonic offset,
so arch-specific implementations tend to reference wall_to_monotonic
directly. This limits future cleanups in the timekeeping core, so
this patch fixes the update_vsyscall interface to provide
wall_to_monotonic, allowing wall_to_monotonic to be made static
as planned in Documentation/feature-removal-schedule.txt.

Signed-off-by: John Stultz <[email protected]>
CC: Martin Schwidefsky <[email protected]>
CC: Anton Blanchard <[email protected]>
CC: Paul Mackerras <[email protected]>
CC: Tony Luck <[email protected]>
CC: Thomas Gleixner <[email protected]>
---
arch/ia64/kernel/time.c | 7 ++++---
arch/powerpc/kernel/time.c | 8 ++++----
arch/s390/kernel/time.c | 8 ++++----
arch/x86/kernel/vsyscall_64.c | 6 +++---
include/linux/clocksource.h | 6 ++++--
kernel/time/timekeeping.c | 9 ++++++---
6 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
index 653b3c4..ed6f22e 100644
--- a/arch/ia64/kernel/time.c
+++ b/arch/ia64/kernel/time.c
@@ -471,7 +471,8 @@ void update_vsyscall_tz(void)
{
}

-void update_vsyscall(struct timespec *wall, struct clocksource *c, u32 mult)
+void update_vsyscall(struct timespec *wall, struct timespec *wtm,
+ struct clocksource *c, u32 mult)
{
unsigned long flags;

@@ -487,9 +488,9 @@ void update_vsyscall(struct timespec *wall, struct clocksource *c, u32 mult)
/* copy kernel time structures */
fsyscall_gtod_data.wall_time.tv_sec = wall->tv_sec;
fsyscall_gtod_data.wall_time.tv_nsec = wall->tv_nsec;
- fsyscall_gtod_data.monotonic_time.tv_sec = wall_to_monotonic.tv_sec
+ fsyscall_gtod_data.monotonic_time.tv_sec = wtm->tv_sec
+ wall->tv_sec;
- fsyscall_gtod_data.monotonic_time.tv_nsec = wall_to_monotonic.tv_nsec
+ fsyscall_gtod_data.monotonic_time.tv_nsec = wtm->tv_nsec
+ wall->tv_nsec;

/* normalize */
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 0711d60..e215f76 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -849,8 +849,8 @@ static cycle_t timebase_read(struct clocksource *cs)
return (cycle_t)get_tb();
}

-void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
- u32 mult)
+void update_vsyscall(struct timespec *wall_time, struct timespec *wtm,
+ struct clocksource *clock, u32 mult)
{
u64 new_tb_to_xs, new_stamp_xsec;

@@ -882,8 +882,8 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
vdso_data->tb_orig_stamp = clock->cycle_last;
vdso_data->stamp_xsec = new_stamp_xsec;
vdso_data->tb_to_xs = new_tb_to_xs;
- vdso_data->wtom_clock_sec = wall_to_monotonic.tv_sec;
- vdso_data->wtom_clock_nsec = wall_to_monotonic.tv_nsec;
+ vdso_data->wtom_clock_sec = wtm->tv_sec;
+ vdso_data->wtom_clock_nsec = wtm->tv_nsec;
vdso_data->stamp_xtime = *wall_time;
smp_wmb();
++(vdso_data->tb_update_count);
diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
index a2163c9..aeb30c6 100644
--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -207,8 +207,8 @@ struct clocksource * __init clocksource_default_clock(void)
return &clocksource_tod;
}

-void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
- u32 mult)
+void update_vsyscall(struct timespec *wall_time, struct timespec *wtm,
+ struct clocksource *clock, u32 mult)
{
if (clock != &clocksource_tod)
return;
@@ -219,8 +219,8 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
vdso_data->xtime_tod_stamp = clock->cycle_last;
vdso_data->xtime_clock_sec = wall_time->tv_sec;
vdso_data->xtime_clock_nsec = wall_time->tv_nsec;
- vdso_data->wtom_clock_sec = wall_to_monotonic.tv_sec;
- vdso_data->wtom_clock_nsec = wall_to_monotonic.tv_nsec;
+ vdso_data->wtom_clock_sec = wtm->tv_sec;
+ vdso_data->wtom_clock_nsec = wtm->tv_nsec;
vdso_data->ntp_mult = mult;
smp_wmb();
++vdso_data->tb_update_count;
diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
index dce0c3c..dcbb28c 100644
--- a/arch/x86/kernel/vsyscall_64.c
+++ b/arch/x86/kernel/vsyscall_64.c
@@ -73,8 +73,8 @@ void update_vsyscall_tz(void)
write_sequnlock_irqrestore(&vsyscall_gtod_data.lock, flags);
}

-void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
- u32 mult)
+void update_vsyscall(struct timespec *wall_time, struct timespec *wtm,
+ struct clocksource *clock, u32 mult)
{
unsigned long flags;

@@ -87,7 +87,7 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
vsyscall_gtod_data.clock.shift = clock->shift;
vsyscall_gtod_data.wall_time_sec = wall_time->tv_sec;
vsyscall_gtod_data.wall_time_nsec = wall_time->tv_nsec;
- vsyscall_gtod_data.wall_to_monotonic = wall_to_monotonic;
+ vsyscall_gtod_data.wall_to_monotonic = *wtm;
vsyscall_gtod_data.wall_time_coarse = __current_kernel_time();
write_sequnlock_irqrestore(&vsyscall_gtod_data.lock, flags);
}
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 5ea3c60..21677d9 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -313,11 +313,13 @@ clocksource_calc_mult_shift(struct clocksource *cs, u32 freq, u32 minsec)

#ifdef CONFIG_GENERIC_TIME_VSYSCALL
extern void
-update_vsyscall(struct timespec *ts, struct clocksource *c, u32 mult);
+update_vsyscall(struct timespec *ts, struct timespec *wtm,
+ struct clocksource *c, u32 mult);
extern void update_vsyscall_tz(void);
#else
static inline void
-update_vsyscall(struct timespec *ts, struct clocksource *c, u32 mult)
+update_vsyscall(struct timespec *ts, struct timespec *wtm,
+ struct clocksource *c, u32 mult)
{
}

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 73edd40..b15c3ac 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -170,7 +170,8 @@ void timekeeping_leap_insert(int leapsecond)
{
xtime.tv_sec += leapsecond;
wall_to_monotonic.tv_sec -= leapsecond;
- update_vsyscall(&xtime, timekeeper.clock, timekeeper.mult);
+ update_vsyscall(&xtime, &wall_to_monotonic, timekeeper.clock,
+ timekeeper.mult);
}

/**
@@ -326,7 +327,8 @@ int do_settimeofday(struct timespec *tv)
timekeeper.ntp_error = 0;
ntp_clear();

- update_vsyscall(&xtime, timekeeper.clock, timekeeper.mult);
+ update_vsyscall(&xtime, &wall_to_monotonic, timekeeper.clock,
+ timekeeper.mult);

write_sequnlock_irqrestore(&xtime_lock, flags);

@@ -809,7 +811,8 @@ void update_wall_time(void)
}

/* check to see if there is a new clocksource to use */
- update_vsyscall(&xtime, timekeeper.clock, timekeeper.mult);
+ update_vsyscall(&xtime, &wall_to_monotonic, timekeeper.clock,
+ timekeeper.mult);
}

/**
--
1.6.0.4

2010-07-14 00:57:25

by john stultz

Subject: [PATCH 05/11] powerpc: Cleanup xtime usage

This removes powerpc's direct xtime usage, allowing for further
generic timekeeping cleanups.

Compiled but otherwise untested.

Cc: Paul Mackerras <[email protected]>
Cc: Anton Blanchard <[email protected]>
Cc: Thomas Gleixner <[email protected]>

Signed-off-by: John Stultz <[email protected]>
---
arch/powerpc/kernel/time.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 6fcd648..0711d60 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -864,9 +864,9 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
/* XXX this assumes clock->shift == 22 */
/* 4611686018 ~= 2^(20+64-22) / 1e9 */
new_tb_to_xs = (u64) mult * 4611686018ULL;
- new_stamp_xsec = (u64) xtime.tv_nsec * XSEC_PER_SEC;
+ new_stamp_xsec = (u64) wall_time->tv_nsec * XSEC_PER_SEC;
do_div(new_stamp_xsec, 1000000000);
- new_stamp_xsec += (u64) xtime.tv_sec * XSEC_PER_SEC;
+ new_stamp_xsec += (u64) wall_time->tv_sec * XSEC_PER_SEC;

/*
* tb_update_count is used to allow the userspace gettimeofday code
@@ -884,7 +884,7 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
vdso_data->tb_to_xs = new_tb_to_xs;
vdso_data->wtom_clock_sec = wall_to_monotonic.tv_sec;
vdso_data->wtom_clock_nsec = wall_to_monotonic.tv_nsec;
- vdso_data->stamp_xtime = xtime;
+ vdso_data->stamp_xtime = *wall_time;
smp_wmb();
++(vdso_data->tb_update_count);
}
@@ -1093,7 +1093,7 @@ void __init time_init(void)
vdso_data->tb_orig_stamp = tb_last_jiffy;
vdso_data->tb_update_count = 0;
vdso_data->tb_ticks_per_sec = tb_ticks_per_sec;
- vdso_data->stamp_xsec = (u64) xtime.tv_sec * XSEC_PER_SEC;
+ vdso_data->stamp_xsec = (u64) get_seconds() * XSEC_PER_SEC;
vdso_data->tb_to_xs = tb_to_xs;

write_sequnlock_irqrestore(&xtime_lock, flags);
--
1.6.0.4

2010-07-14 00:57:55

by john stultz

Subject: [PATCH 02/11] Implement timespec_add

After accidentally misusing timespec_add_safe, I wanted to make sure
we don't accidentally trip over that issue again, so I created a simple
timespec_add() function which we can use to replace the instances
of timespec_add_safe() that don't want the overflow detection.

Signed-off-by: John Stultz <[email protected]>
CC: Thomas Gleixner <[email protected]>

---
include/linux/time.h | 16 ++++++++++++++++
kernel/time/timekeeping.c | 6 +++---
2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/include/linux/time.h b/include/linux/time.h
index ea3559f..36b42f5 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -76,9 +76,25 @@ extern unsigned long mktime(const unsigned int year, const unsigned int mon,
const unsigned int min, const unsigned int sec);

extern void set_normalized_timespec(struct timespec *ts, time_t sec, s64 nsec);
+
+/*
+ * timespec_add_safe assumes both values are positive and checks
+ * for overflow. It will return TIME_T_MAX if the return would be
+ * smaller than either of the arguments.
+ */
extern struct timespec timespec_add_safe(const struct timespec lhs,
const struct timespec rhs);

+
+static inline struct timespec timespec_add(struct timespec lhs,
+ struct timespec rhs)
+{
+ struct timespec ts_delta;
+ set_normalized_timespec(&ts_delta, lhs.tv_sec + rhs.tv_sec,
+ lhs.tv_nsec + rhs.tv_nsec);
+ return ts_delta;
+}
+
/*
* sub = lhs - rhs, in normalized form
*/
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index caf8d4d..623fe3d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -579,9 +579,9 @@ static int timekeeping_resume(struct sys_device *dev)

if (timespec_compare(&ts, &timekeeping_suspend_time) > 0) {
ts = timespec_sub(ts, timekeeping_suspend_time);
- xtime = timespec_add_safe(xtime, ts);
+ xtime = timespec_add(xtime, ts);
wall_to_monotonic = timespec_sub(wall_to_monotonic, ts);
- total_sleep_time = timespec_add_safe(total_sleep_time, ts);
+ total_sleep_time = timespec_add(total_sleep_time, ts);
}
/* re-base the last cycle value */
timekeeper.clock->cycle_last = timekeeper.clock->read(timekeeper.clock);
@@ -887,7 +887,7 @@ EXPORT_SYMBOL_GPL(getboottime);
*/
void monotonic_to_bootbased(struct timespec *ts)
{
- *ts = timespec_add_safe(*ts, total_sleep_time);
+ *ts = timespec_add(*ts, total_sleep_time);
}
EXPORT_SYMBOL_GPL(monotonic_to_bootbased);

--
1.6.0.4
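
As a rough illustration of what the new timespec_add() does, here is a
userspace sketch. The struct and the sketch_* helpers are hypothetical
stand-ins for struct timespec and set_normalized_timespec(). The point
is that a plain normalized add also works when one operand is negative,
which is the case timespec_add_safe() does not expect, since its
comment says it assumes both values are positive.

```c
#include <assert.h>

#define SKETCH_NSEC_PER_SEC	1000000000LL

/* Hypothetical userspace stand-in for struct timespec. */
struct ts_sketch {
	long long tv_sec;
	long long tv_nsec;
};

/* Stand-in for set_normalized_timespec(): keep 0 <= tv_nsec < 1e9. */
static void sketch_set_normalized(struct ts_sketch *ts, long long sec,
				  long long nsec)
{
	while (nsec >= SKETCH_NSEC_PER_SEC) {
		nsec -= SKETCH_NSEC_PER_SEC;
		++sec;
	}
	while (nsec < 0) {
		nsec += SKETCH_NSEC_PER_SEC;
		--sec;
	}
	ts->tv_sec = sec;
	ts->tv_nsec = nsec;
	assert(ts->tv_nsec >= 0 && ts->tv_nsec < SKETCH_NSEC_PER_SEC);
}

/* Sum two values and normalize; no overflow or sign checks are made. */
static struct ts_sketch sketch_timespec_add(struct ts_sketch lhs,
					    struct ts_sketch rhs)
{
	struct ts_sketch sum;

	sketch_set_normalized(&sum, lhs.tv_sec + rhs.tv_sec,
			      lhs.tv_nsec + rhs.tv_nsec);
	return sum;
}
```

For example, adding {5, 900000000} and {2, 300000000} carries the
nanosecond overflow into the seconds field and yields {8, 200000000}.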

2010-07-14 00:57:57

by john stultz

Subject: [PATCH 03/11] time: Kill off CONFIG_GENERIC_TIME

Now that all arches have been converted over to use generic time via
clocksources or arch_gettimeoffset(), we can remove the GENERIC_TIME
config option and simplify the generic code.

Signed-off-by: John Stultz <[email protected]>
CC: Thomas Gleixner <[email protected]>
---
Documentation/kernel-parameters.txt | 3 +-
arch/alpha/Kconfig | 4 --
arch/arm/Kconfig | 4 --
arch/avr32/Kconfig | 3 --
arch/blackfin/Kconfig | 3 --
arch/cris/Kconfig | 3 --
arch/frv/Kconfig | 4 --
arch/h8300/Kconfig | 4 --
arch/ia64/Kconfig | 4 --
arch/m32r/Kconfig | 3 --
arch/m68k/Kconfig | 3 --
arch/m68knommu/Kconfig | 4 --
arch/microblaze/Kconfig | 3 --
arch/mips/Kconfig | 4 --
arch/mn10300/Kconfig | 3 --
arch/parisc/Kconfig | 4 --
arch/powerpc/Kconfig | 3 --
arch/s390/Kconfig | 3 --
arch/score/Kconfig | 3 --
arch/sh/Kconfig | 3 --
arch/sparc/Kconfig | 3 --
arch/um/Kconfig.common | 4 --
arch/x86/Kconfig | 5 +--
arch/xtensa/Kconfig | 3 --
drivers/Makefile | 4 ++-
drivers/acpi/acpi_pad.c | 2 +-
drivers/acpi/processor_idle.c | 2 +-
drivers/misc/Kconfig | 4 +-
kernel/time.c | 16 ----------
kernel/time/Kconfig | 4 +-
kernel/time/clocksource.c | 4 +-
kernel/time/timekeeping.c | 55 ++--------------------------------
kernel/trace/Kconfig | 4 +-
33 files changed, 19 insertions(+), 159 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 82d6aeb..1014f91 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -73,7 +73,6 @@ parameter is applicable:
MTD MTD (Memory Technology Device) support is enabled.
NET Appropriate network support is enabled.
NUMA NUMA support is enabled.
- GENERIC_TIME The generic timeofday code is enabled.
NFS Appropriate NFS support is enabled.
OSS OSS sound support is enabled.
PV_OPS A paravirtualized kernel is enabled.
@@ -468,7 +467,7 @@ and is between 256 and 4096 characters. It is defined in the file
clocksource is not available, it defaults to PIT.
Format: { pit | tsc | cyclone | pmtmr }

- clocksource= [GENERIC_TIME] Override the default clocksource
+ clocksource= Override the default clocksource
Format: <string>
Override the default clocksource and use the clocksource
with the name specified.
diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
index 3e2e540..b9647bb 100644
--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -47,10 +47,6 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config GENERIC_CMOS_UPDATE
def_bool y

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 98922f7..655b4ae 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -41,10 +41,6 @@ config SYS_SUPPORTS_APM_EMULATION
config GENERIC_GPIO
bool

-config GENERIC_TIME
- bool
- default y
-
config ARCH_USES_GETTIMEOFFSET
bool
default n
diff --git a/arch/avr32/Kconfig b/arch/avr32/Kconfig
index f2b3193..f515727 100644
--- a/arch/avr32/Kconfig
+++ b/arch/avr32/Kconfig
@@ -45,9 +45,6 @@ config GENERIC_IRQ_PROBE
config RWSEM_GENERIC_SPINLOCK
def_bool y

-config GENERIC_TIME
- def_bool y
-
config GENERIC_CLOCKEVENTS
def_bool y

diff --git a/arch/blackfin/Kconfig b/arch/blackfin/Kconfig
index f66294b..c88fd35 100644
--- a/arch/blackfin/Kconfig
+++ b/arch/blackfin/Kconfig
@@ -614,9 +614,6 @@ comment "Kernel Timer/Scheduler"

source kernel/Kconfig.hz

-config GENERIC_TIME
- def_bool y
-
config GENERIC_CLOCKEVENTS
bool "Generic clock events"
default y
diff --git a/arch/cris/Kconfig b/arch/cris/Kconfig
index e25bf44..887ef85 100644
--- a/arch/cris/Kconfig
+++ b/arch/cris/Kconfig
@@ -20,9 +20,6 @@ config RWSEM_GENERIC_SPINLOCK
config RWSEM_XCHGADD_ALGORITHM
bool

-config GENERIC_TIME
- def_bool y
-
config GENERIC_CMOS_UPDATE
def_bool y

diff --git a/arch/frv/Kconfig b/arch/frv/Kconfig
index 4b5830b..16399bd 100644
--- a/arch/frv/Kconfig
+++ b/arch/frv/Kconfig
@@ -40,10 +40,6 @@ config GENERIC_HARDIRQS_NO__DO_IRQ
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config TIME_LOW_RES
bool
default y
diff --git a/arch/h8300/Kconfig b/arch/h8300/Kconfig
index 53cc669..988b6ff 100644
--- a/arch/h8300/Kconfig
+++ b/arch/h8300/Kconfig
@@ -62,10 +62,6 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config GENERIC_BUG
bool
depends on BUG
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 9561082..8711d13 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -82,10 +82,6 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config GENERIC_TIME_VSYSCALL
bool
default y
diff --git a/arch/m32r/Kconfig b/arch/m32r/Kconfig
index 3a9319f..836abbb 100644
--- a/arch/m32r/Kconfig
+++ b/arch/m32r/Kconfig
@@ -44,9 +44,6 @@ config HZ
int
default 100

-config GENERIC_TIME
- def_bool y
-
config ARCH_USES_GETTIMEOFFSET
def_bool y

diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 2e3737b..8030e24 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -59,9 +59,6 @@ config HZ
int
default 100

-config GENERIC_TIME
- def_bool y
-
config ARCH_USES_GETTIMEOFFSET
def_bool y

diff --git a/arch/m68knommu/Kconfig b/arch/m68knommu/Kconfig
index efeb603..2609c39 100644
--- a/arch/m68knommu/Kconfig
+++ b/arch/m68knommu/Kconfig
@@ -63,10 +63,6 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config GENERIC_CMOS_UPDATE
bool
default y
diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig
index 76818f9..c96dab8 100644
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -48,9 +48,6 @@ config GENERIC_IRQ_PROBE
config GENERIC_CALIBRATE_DELAY
def_bool y

-config GENERIC_TIME
- def_bool y
-
config GENERIC_TIME_VSYSCALL
def_bool n

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index cdaae94..01c44cb 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -733,10 +733,6 @@ config GENERIC_CLOCKEVENTS
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config GENERIC_CMOS_UPDATE
bool
default y
diff --git a/arch/mn10300/Kconfig b/arch/mn10300/Kconfig
index 1c4565a..444b9f9 100644
--- a/arch/mn10300/Kconfig
+++ b/arch/mn10300/Kconfig
@@ -46,9 +46,6 @@ config GENERIC_FIND_NEXT_BIT
config GENERIC_HWEIGHT
def_bool y

-config GENERIC_TIME
- def_bool y
-
config GENERIC_BUG
def_bool y

diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 05a366a..907417d 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -66,10 +66,6 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config TIME_LOW_RES
bool
depends on SMP
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 6506bf4..dd12626 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -29,9 +29,6 @@ config MMU
config GENERIC_CMOS_UPDATE
def_bool y

-config GENERIC_TIME
- def_bool y
-
config GENERIC_TIME_VSYSCALL
def_bool y

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index bee1c0f..f0777a4 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -40,9 +40,6 @@ config ARCH_HAS_ILOG2_U64
config GENERIC_HWEIGHT
def_bool y

-config GENERIC_TIME
- def_bool y
-
config GENERIC_TIME_VSYSCALL
def_bool y

diff --git a/arch/score/Kconfig b/arch/score/Kconfig
index 55d413e..be4a155 100644
--- a/arch/score/Kconfig
+++ b/arch/score/Kconfig
@@ -55,9 +55,6 @@ config GENERIC_CALIBRATE_DELAY
config GENERIC_CLOCKEVENTS
def_bool y

-config GENERIC_TIME
- def_bool y
-
config SCHED_NO_NO_OMIT_FRAME_POINTER
def_bool y

diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index 573fca1..1d0a711 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -98,9 +98,6 @@ config GENERIC_CALIBRATE_DELAY
config GENERIC_IOMAP
bool

-config GENERIC_TIME
- def_bool y
-
config GENERIC_CLOCKEVENTS
def_bool y

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 6f1470b..0011052 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -66,9 +66,6 @@ config BITS
default 32 if SPARC32
default 64 if SPARC64

-config GENERIC_TIME
- def_bool y
-
config ARCH_USES_GETTIMEOFFSET
bool
default y if SPARC32
diff --git a/arch/um/Kconfig.common b/arch/um/Kconfig.common
index 0d207e7..7c8e277 100644
--- a/arch/um/Kconfig.common
+++ b/arch/um/Kconfig.common
@@ -55,10 +55,6 @@ config GENERIC_BUG
default y
depends on BUG

-config GENERIC_TIME
- bool
- default y
-
config GENERIC_CLOCKEVENTS
bool
default y
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index dcb0593..546b610 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -72,9 +72,6 @@ config ARCH_DEFCONFIG
default "arch/x86/configs/i386_defconfig" if X86_32
default "arch/x86/configs/x86_64_defconfig" if X86_64

-config GENERIC_TIME
- def_bool y
-
config GENERIC_CMOS_UPDATE
def_bool y

@@ -2046,7 +2043,7 @@ config SCx200

config SCx200HR_TIMER
tristate "NatSemi SCx200 27MHz High-Resolution Timer Support"
- depends on SCx200 && GENERIC_TIME
+ depends on SCx200
default y
---help---
This driver provides a clocksource built upon the on-chip
diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig
index ebe228d..0859bfd 100644
--- a/arch/xtensa/Kconfig
+++ b/arch/xtensa/Kconfig
@@ -48,9 +48,6 @@ config HZ
int
default 100

-config GENERIC_TIME
- def_bool y
-
source "init/Kconfig"
source "kernel/Kconfig.freezer"

diff --git a/drivers/Makefile b/drivers/Makefile
index 91874e0..ae47344 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -101,7 +101,9 @@ obj-y += firmware/
obj-$(CONFIG_CRYPTO) += crypto/
obj-$(CONFIG_SUPERH) += sh/
obj-$(CONFIG_ARCH_SHMOBILE) += sh/
-obj-$(CONFIG_GENERIC_TIME) += clocksource/
+ifndef CONFIG_ARCH_USES_GETTIMEOFFSET
+obj-y += clocksource/
+endif
obj-$(CONFIG_DMA_ENGINE) += dma/
obj-$(CONFIG_DCA) += dca/
obj-$(CONFIG_HID) += hid/
diff --git a/drivers/acpi/acpi_pad.c b/drivers/acpi/acpi_pad.c
index 446aced..b76848c 100644
--- a/drivers/acpi/acpi_pad.c
+++ b/drivers/acpi/acpi_pad.c
@@ -77,7 +77,7 @@ static void power_saving_mwait_init(void)
power_saving_mwait_eax = (highest_cstate << MWAIT_SUBSTATE_SIZE) |
(highest_subcstate - 1);

-#if defined(CONFIG_GENERIC_TIME) && defined(CONFIG_X86)
+#if defined(CONFIG_X86)
switch (boot_cpu_data.x86_vendor) {
case X86_VENDOR_AMD:
case X86_VENDOR_INTEL:
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index b1b3856..0e562d0 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -259,7 +259,7 @@ int acpi_processor_resume(struct acpi_device * device)
return 0;
}

-#if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86)
+#if defined(CONFIG_X86)
static void tsc_check_state(int state)
{
switch (boot_cpu_data.x86_vendor) {
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 26386a9..5b9ba48 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -72,7 +72,7 @@ config ATMEL_TCLIB

config ATMEL_TCB_CLKSRC
bool "TC Block Clocksource"
- depends on ATMEL_TCLIB && GENERIC_TIME
+ depends on ATMEL_TCLIB
default y
help
Select this to get a high precision clocksource based on a
@@ -240,7 +240,7 @@ config CS5535_MFGPT_DEFAULT_IRQ

config CS5535_CLOCK_EVENT_SRC
tristate "CS5535/CS5536 high-res timer (MFGPT) events"
- depends on GENERIC_TIME && GENERIC_CLOCKEVENTS && CS5535_MFGPT
+ depends on GENERIC_CLOCKEVENTS && CS5535_MFGPT
help
This driver provides a clock event source based on the MFGPT
timer(s) in the CS5535 and CS5536 companion chips.
diff --git a/kernel/time.c b/kernel/time.c
index 848b1c2..ba9b338 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -300,22 +300,6 @@ struct timespec timespec_trunc(struct timespec t, unsigned gran)
}
EXPORT_SYMBOL(timespec_trunc);

-#ifndef CONFIG_GENERIC_TIME
-/*
- * Simulate gettimeofday using do_gettimeofday which only allows a timeval
- * and therefore only yields usec accuracy
- */
-void getnstimeofday(struct timespec *tv)
-{
- struct timeval x;
-
- do_gettimeofday(&x);
- tv->tv_sec = x.tv_sec;
- tv->tv_nsec = x.tv_usec * NSEC_PER_USEC;
-}
-EXPORT_SYMBOL_GPL(getnstimeofday);
-#endif
-
/* Converts Gregorian date to seconds since 1970-01-01 00:00:00.
* Assumes input in normal date format, i.e. 1980-12-31 23:59:59
* => year=1980, mon=12, day=31, hour=23, min=59, sec=59.
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 95ed429..f06a8a3 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -6,7 +6,7 @@ config TICK_ONESHOT

config NO_HZ
bool "Tickless System (Dynamic Ticks)"
- depends on GENERIC_TIME && GENERIC_CLOCKEVENTS
+ depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
select TICK_ONESHOT
help
This option enables a tickless system: timer interrupts will
@@ -15,7 +15,7 @@ config NO_HZ

config HIGH_RES_TIMERS
bool "High Resolution Timer Support"
- depends on GENERIC_TIME && GENERIC_CLOCKEVENTS
+ depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
select TICK_ONESHOT
help
This option enables high resolution timer support. If your
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index f08e99c..c543d21 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -531,7 +531,7 @@ static u64 clocksource_max_deferment(struct clocksource *cs)
return max_nsecs - (max_nsecs >> 5);
}

-#ifdef CONFIG_GENERIC_TIME
+#ifndef CONFIG_ARCH_USES_GETTIMEOFFSET

/**
* clocksource_select - Select the best clocksource available
@@ -577,7 +577,7 @@ static void clocksource_select(void)
}
}

-#else /* CONFIG_GENERIC_TIME */
+#else /* !CONFIG_ARCH_USES_GETTIMEOFFSET */

static inline void clocksource_select(void) { }

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 623fe3d..73edd40 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -173,8 +173,6 @@ void timekeeping_leap_insert(int leapsecond)
update_vsyscall(&xtime, timekeeper.clock, timekeeper.mult);
}

-#ifdef CONFIG_GENERIC_TIME
-
/**
* timekeeping_forward_now - update clock to the current time
*
@@ -376,52 +374,6 @@ void timekeeping_notify(struct clocksource *clock)
tick_clock_notify();
}

-#else /* GENERIC_TIME */
-
-static inline void timekeeping_forward_now(void) { }
-
-/**
- * ktime_get - get the monotonic time in ktime_t format
- *
- * returns the time in ktime_t format
- */
-ktime_t ktime_get(void)
-{
- struct timespec now;
-
- ktime_get_ts(&now);
-
- return timespec_to_ktime(now);
-}
-EXPORT_SYMBOL_GPL(ktime_get);
-
-/**
- * ktime_get_ts - get the monotonic clock in timespec format
- * @ts: pointer to timespec variable
- *
- * The function calculates the monotonic clock from the realtime
- * clock and the wall_to_monotonic offset and stores the result
- * in normalized timespec format in the variable pointed to by @ts.
- */
-void ktime_get_ts(struct timespec *ts)
-{
- struct timespec tomono;
- unsigned long seq;
-
- do {
- seq = read_seqbegin(&xtime_lock);
- getnstimeofday(ts);
- tomono = wall_to_monotonic;
-
- } while (read_seqretry(&xtime_lock, seq));
-
- set_normalized_timespec(ts, ts->tv_sec + tomono.tv_sec,
- ts->tv_nsec + tomono.tv_nsec);
-}
-EXPORT_SYMBOL_GPL(ktime_get_ts);
-
-#endif /* !GENERIC_TIME */
-
/**
* ktime_get_real - get the real (wall-) time in ktime_t format
*
@@ -784,10 +736,11 @@ void update_wall_time(void)
return;

clock = timekeeper.clock;
-#ifdef CONFIG_GENERIC_TIME
- offset = (clock->read(clock) - clock->cycle_last) & clock->mask;
-#else
+
+#ifdef CONFIG_ARCH_USES_GETTIMEOFFSET
offset = timekeeper.cycle_interval;
+#else
+ offset = (clock->read(clock) - clock->cycle_last) & clock->mask;
#endif
timekeeper.xtime_nsec = (s64)xtime.tv_nsec << timekeeper.shift;

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 8b1797c..7531dda 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -153,7 +153,7 @@ config IRQSOFF_TRACER
bool "Interrupts-off Latency Tracer"
default n
depends on TRACE_IRQFLAGS_SUPPORT
- depends on GENERIC_TIME
+ depends on !ARCH_USES_GETTIMEOFFSET
select TRACE_IRQFLAGS
select GENERIC_TRACER
select TRACER_MAX_TRACE
@@ -175,7 +175,7 @@ config IRQSOFF_TRACER
config PREEMPT_TRACER
bool "Preemption-off Latency Tracer"
default n
- depends on GENERIC_TIME
+ depends on !ARCH_USES_GETTIMEOFFSET
depends on PREEMPT
select GENERIC_TRACER
select TRACER_MAX_TRACE
--
1.6.0.4

2010-07-14 00:57:53

by john stultz

Subject: [PATCH 08/11] Cleanup hrtimer.c's direct access to wall_to_monotonic

Provide an accessor function to replace hrtimer.c's
direct access to wall_to_monotonic.

This will allow wall_to_monotonic to be made static, as
planned in Documentation/feature-removal-schedule.txt.

Signed-off-by: John Stultz <[email protected]>
CC: Thomas Gleixner <[email protected]>
---
include/linux/time.h | 3 ++-
kernel/hrtimer.c | 9 ++++-----
kernel/time/timekeeping.c | 5 +++++
3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/linux/time.h b/include/linux/time.h
index 36b42f5..3a9c0bf 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -126,7 +126,8 @@ extern int timekeeping_suspended;

unsigned long get_seconds(void);
struct timespec current_kernel_time(void);
-struct timespec __current_kernel_time(void); /* does not hold xtime_lock */
+struct timespec __current_kernel_time(void); /* does not take xtime_lock */
+struct timespec __get_wall_to_monotonic(void); /* does not take xtime_lock */
struct timespec get_monotonic_coarse(void);

#define CURRENT_TIME (current_kernel_time())
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 5c69e99..809f48c 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -90,7 +90,7 @@ static void hrtimer_get_softirq_time(struct hrtimer_cpu_base *base)
do {
seq = read_seqbegin(&xtime_lock);
xts = __current_kernel_time();
- tom = wall_to_monotonic;
+ tom = __get_wall_to_monotonic();
} while (read_seqretry(&xtime_lock, seq));

xtim = timespec_to_ktime(xts);
@@ -612,7 +612,7 @@ static int hrtimer_reprogram(struct hrtimer *timer,
static void retrigger_next_event(void *arg)
{
struct hrtimer_cpu_base *base;
- struct timespec realtime_offset;
+ struct timespec realtime_offset, wtm;
unsigned long seq;

if (!hrtimer_hres_active())
@@ -620,10 +620,9 @@ static void retrigger_next_event(void *arg)

do {
seq = read_seqbegin(&xtime_lock);
- set_normalized_timespec(&realtime_offset,
- -wall_to_monotonic.tv_sec,
- -wall_to_monotonic.tv_nsec);
+ wtm = __get_wall_to_monotonic();
} while (read_seqretry(&xtime_lock, seq));
+ set_normalized_timespec(&realtime_offset, -wtm.tv_sec, -wtm.tv_nsec);

base = &__get_cpu_var(hrtimer_bases);

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index b15c3ac..fb61c2e 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -858,6 +858,11 @@ struct timespec __current_kernel_time(void)
return xtime;
}

+struct timespec __get_wall_to_monotonic(void)
+{
+ return wall_to_monotonic;
+}
+
struct timespec current_kernel_time(void)
{
struct timespec now;
--
1.6.0.4

2010-07-14 00:57:50

by john stultz

Subject: [PATCH 11/11] Add __clocksource_updatefreq_hz/khz methods

To properly handle clocksources that change frequencies
at the clocksource->enable() point, this patch adds
a method that will update the clocksource's mult/shift and
max_idle_ns values.

Signed-off-by: John Stultz <[email protected]>
CC: Thomas Gleixner <[email protected]>

---
include/linux/clocksource.h | 11 +++++++++++
kernel/time/clocksource.c | 29 ++++++++++++++++++++++++-----
2 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 21677d9..edd20e6 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -292,6 +292,8 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 minsec);
*/
extern int
__clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq);
+extern void
+__clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq);

static inline int clocksource_register_hz(struct clocksource *cs, u32 hz)
{
@@ -303,6 +305,15 @@ static inline int clocksource_register_khz(struct clocksource *cs, u32 khz)
return __clocksource_register_scale(cs, 1000, khz);
}

+static inline void __clocksource_updatefreq_hz(struct clocksource *cs, u32 hz)
+{
+ __clocksource_updatefreq_scale(cs, 1, hz);
+}
+
+static inline void __clocksource_updatefreq_khz(struct clocksource *cs, u32 khz)
+{
+ __clocksource_updatefreq_scale(cs, 1000, khz);
+}

static inline void
clocksource_calc_mult_shift(struct clocksource *cs, u32 freq, u32 minsec)
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index c543d21..c18d7ef 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -639,19 +639,18 @@ static void clocksource_enqueue(struct clocksource *cs)
#define MAX_UPDATE_LENGTH 5 /* Seconds */

/**
- * __clocksource_register_scale - Used to install new clocksources
+ * __clocksource_updatefreq_scale - Used to update clocksource with new freq
* @t: clocksource to be registered
* @scale: Scale factor multiplied against freq to get clocksource hz
* @freq: clocksource frequency (cycles per second) divided by scale
*
- * Returns -EBUSY if registration fails, zero otherwise.
+ * This should only be called from the clocksource->enable() method.
*
* This *SHOULD NOT* be called directly! Please use the
- * clocksource_register_hz() or clocksource_register_khz helper functions.
+ * __clocksource_updatefreq_hz() or __clocksource_updatefreq_khz() helper functions.
*/
-int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
+void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
{
-
/*
* Ideally we want to use some of the limits used in
* clocksource_max_deferment, to provide a more informed
@@ -662,7 +661,27 @@ int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
NSEC_PER_SEC/scale,
MAX_UPDATE_LENGTH*scale);
cs->max_idle_ns = clocksource_max_deferment(cs);
+}
+EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);
+
+/**
+ * __clocksource_register_scale - Used to install new clocksources
+ * @t: clocksource to be registered
+ * @scale: Scale factor multiplied against freq to get clocksource hz
+ * @freq: clocksource frequency (cycles per second) divided by scale
+ *
+ * Returns -EBUSY if registration fails, zero otherwise.
+ *
+ * This *SHOULD NOT* be called directly! Please use the
+ * clocksource_register_hz() or clocksource_register_khz helper functions.
+ */
+int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
+{
+
+ /* Initialize mult/shift and max_idle_ns */
+ __clocksource_updatefreq_scale(cs, scale, freq);

+ /* Add clocksource to the clocksource list */
mutex_lock(&clocksource_mutex);
clocksource_enqueue(cs);
clocksource_select();
--
1.6.0.4

2010-07-14 00:57:48

by john stultz

Subject: [PATCH 10/11] Convert common x86 clocksources to use clocksource_register_hz/khz

This converts the most common of the x86 clocksources over to use
clocksource_register_hz/khz.

CC: Thomas Gleixner <[email protected]>
Signed-off-by: John Stultz <[email protected]>
---
arch/x86/kernel/hpet.c | 13 +++++++++----
arch/x86/kernel/tsc.c | 6 ++----
drivers/clocksource/acpi_pm.c | 9 ++-------
3 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index a198b7c..54dac04 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -16,7 +16,6 @@
#include <asm/hpet.h>

#define HPET_MASK CLOCKSOURCE_MASK(32)
-#define HPET_SHIFT 22

/* FSEC = 10^-15
NSEC = 10^-9 */
@@ -787,7 +786,6 @@ static struct clocksource clocksource_hpet = {
.rating = 250,
.read = read_hpet,
.mask = HPET_MASK,
- .shift = HPET_SHIFT,
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
.resume = hpet_resume_counter,
#ifdef CONFIG_X86_64
@@ -798,6 +796,7 @@ static struct clocksource clocksource_hpet = {
static int hpet_clocksource_register(void)
{
u64 start, now;
+ u64 hpet_freq;
cycle_t t1;

/* Start the counter */
@@ -832,9 +831,15 @@ static int hpet_clocksource_register(void)
* mult = (hpet_period * 2^shift)/10^6
* mult = (hpet_period << shift)/FSEC_PER_NSEC
*/
- clocksource_hpet.mult = div_sc(hpet_period, FSEC_PER_NSEC, HPET_SHIFT);

- clocksource_register(&clocksource_hpet);
+ /* Need to convert hpet_period (fsec/cyc) to cyc/sec:
+ *
+ * cyc/sec = FSEC_PER_SEC/hpet_period(fsec/cyc)
+ * cyc/sec = (FSEC_PER_NSEC * NSEC_PER_SEC)/hpet_period
+ */
+ hpet_freq = FSEC_PER_NSEC * NSEC_PER_SEC;
+ do_div(hpet_freq, hpet_period);
+ clocksource_register_hz(&clocksource_hpet, (u32)hpet_freq);

return 0;
}
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 9faf91a..5ca6370 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -751,7 +751,6 @@ static struct clocksource clocksource_tsc = {
.read = read_tsc,
.resume = resume_tsc,
.mask = CLOCKSOURCE_MASK(64),
- .shift = 22,
.flags = CLOCK_SOURCE_IS_CONTINUOUS |
CLOCK_SOURCE_MUST_VERIFY,
#ifdef CONFIG_X86_64
@@ -845,8 +844,6 @@ __cpuinit int unsynchronized_tsc(void)

static void __init init_tsc_clocksource(void)
{
- clocksource_tsc.mult = clocksource_khz2mult(tsc_khz,
- clocksource_tsc.shift);
if (tsc_clocksource_reliable)
clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
/* lower the rating if we already know its unstable: */
@@ -854,7 +851,8 @@ static void __init init_tsc_clocksource(void)
clocksource_tsc.rating = 0;
clocksource_tsc.flags &= ~CLOCK_SOURCE_IS_CONTINUOUS;
}
- clocksource_register(&clocksource_tsc);
+
+ clocksource_register_khz(&clocksource_tsc, tsc_khz);
}

#ifdef CONFIG_X86_64
diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c
index 72a633a..cfb0f52 100644
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -68,10 +68,7 @@ static struct clocksource clocksource_acpi_pm = {
.rating = 200,
.read = acpi_pm_read,
.mask = (cycle_t)ACPI_PM_MASK,
- .mult = 0, /*to be calculated*/
- .shift = 22,
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
-
};


@@ -190,9 +187,6 @@ static int __init init_acpi_pm_clocksource(void)
if (!pmtmr_ioport)
return -ENODEV;

- clocksource_acpi_pm.mult = clocksource_hz2mult(PMTMR_TICKS_PER_SEC,
- clocksource_acpi_pm.shift);
-
/* "verify" this timing source: */
for (j = 0; j < ACPI_PM_MONOTONICITY_CHECKS; j++) {
udelay(100 * j);
@@ -220,7 +214,8 @@ static int __init init_acpi_pm_clocksource(void)
if (verify_pmtmr_rate() != 0)
return -ENODEV;

- return clocksource_register(&clocksource_acpi_pm);
+ return clocksource_register_hz(&clocksource_acpi_pm,
+ PMTMR_TICKS_PER_SEC);
}

/* We use fs_initcall because we want the PCI fixups to have run
--
1.6.0.4
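
The unit conversion in the hpet.c hunk above (a period in femtoseconds per cycle turned into cycles per second) can be sketched in userspace as follows. This is a minimal illustration, not the kernel code: the helper name hpet_period_to_hz() is an assumption, and plain 64-bit division stands in for the kernel's do_div().

```c
#include <assert.h>
#include <stdint.h>

#define FSEC_PER_NSEC 1000000ULL
#define NSEC_PER_SEC  1000000000ULL

/*
 * Hypothetical helper (not a kernel function): convert a counter
 * period given in femtoseconds per cycle into a frequency in Hz.
 *
 *   cyc/sec = (FSEC_PER_NSEC * NSEC_PER_SEC) / period_fs
 *
 * The result is truncated to 32 bits, which is only safe while the
 * frequency stays below 2^32 Hz.
 */
static uint32_t hpet_period_to_hz(uint32_t period_fs)
{
	uint64_t freq = FSEC_PER_NSEC * NSEC_PER_SEC;	/* fsec per second */

	assert(period_fs != 0);
	freq /= period_fs;
	return (uint32_t)freq;
}
```

For example, a period of 100000000 fs (100 ns per cycle) gives 10000000 Hz. The numerator is 10^15, which does not fit in 32 bits, which is why the intermediate value has to be 64 bits wide.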

2010-07-14 02:40:44

by KOSAKI Motohiro

Subject: Re: [PATCH 01/11] x86: Fix vtime/file timestamp inconsistencies

Hi

> Due to vtime calling vgettimeofday(), its possible that an application
> could call time();create("stuff",O_RDRW); only to see the file's
> creation timestamp to be before the value returned by time.

Just a dumb question.

Almost all applications use gettimeofday() instead of time(). That means
your fix doesn't help most applications.

So why can't we fix the vgettimeofday() vs. create() inconsistency?
This is just a question; I don't intend to disagree with you.


>
> A similar way to reproduce the issue is to compare the vsyscall time()
> with the syscall time(), and observe ordering issues.
>
> The modified test case from Oleg Nesterov below can illustrate this:
>
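> #include <stdio.h>
> #include <time.h>
> #include <unistd.h>
> #include <sys/syscall.h>
>
> int main(void)
> {
> 	time_t sec1, sec2;
>
> 	/* Loop until the vsyscall time() runs ahead of the syscall time() */
> 	do {
> 		sec1 = time(&sec2);
> 		sec2 = syscall(__NR_time, NULL);
> 	} while (sec1 <= sec2);
>
> 	printf("vtime: %ld.000000\n", (long)sec1);
> 	printf("time: %ld.000000\n", (long)sec2);
> 	return 0;
> }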
>
> The proper fix is to make vtime use the same time value as
> current_kernel_time() (which is exported via update_vsyscall) instead of
> vgettime().
>
> Thanks to Jiri Olsa for bringing up the issue and catching bugs in
> earlier versions of this fix.
>
> Signed-off-by: John Stultz <[email protected]>
>
> CC: Jiri Olsa <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: Oleg Nesterov <[email protected]>
> ---
> arch/x86/kernel/vsyscall_64.c | 11 ++++++++---
> 1 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
> index 1c0c6ab..dce0c3c 100644
> --- a/arch/x86/kernel/vsyscall_64.c
> +++ b/arch/x86/kernel/vsyscall_64.c
> @@ -169,13 +169,18 @@ int __vsyscall(0) vgettimeofday(struct timeval * tv, struct timezone * tz)
> * unlikely */
> time_t __vsyscall(1) vtime(time_t *t)
> {
> - struct timeval tv;
> + unsigned seq;
> time_t result;
> if (unlikely(!__vsyscall_gtod_data.sysctl_enabled))
> return time_syscall(t);
>
> - vgettimeofday(&tv, NULL);
> - result = tv.tv_sec;
> + do {
> + seq = read_seqbegin(&__vsyscall_gtod_data.lock);
> +
> + result = __vsyscall_gtod_data.wall_time_sec;
> +
> + } while (read_seqretry(&__vsyscall_gtod_data.lock, seq));
> +
> if (t)
> *t = result;
> return result;
> --
> 1.6.0.4
>


2010-07-14 16:19:36

by john stultz

Subject: Re: [PATCH 01/11] x86: Fix vtime/file timestamp inconsistencies

On Wed, 2010-07-14 at 11:40 +0900, KOSAKI Motohiro wrote:
> Hi
>
> > Due to vtime calling vgettimeofday(), its possible that an application
> > could call time();create("stuff",O_RDRW); only to see the file's
> > creation timestamp to be before the value returned by time.
>
> Just dumb question.
>
> Almost application are using gettimeofday() instead time(). It mean
> your fix don't solve almost application.

Correct, filesystem timestamps and gettimeofday can still seem
inconsistently ordered. But that is expected.

Because of granularity differences (one interface is only tick
resolution, the other is clocksource resolution), we can't interleave
the two interfaces (time and gettimeofday, respectively) and expect to
get ordered results.

This is why the fix I'm proposing is important: Filesystem timestamps
have always been tick granular, so when vtime() was made clocksource
granular (by using vgettime internally) we broke the historic
expectation that the time() interface could be interleaved with
filesystem operations.

Side note: For full nanosecond resolution of the tick-granular
timestamps, check out the clock_gettime(CLOCK_REALTIME_COARSE, ...)
interface.


> So, Why can't we fix vgettimeofday() vs create() inconsistency?
> This is just question, I don't intend to disagree you.

The only way to make gettimeofday and create consistent is to use
gettimeofday clocksource resolution timestamps for files. This however
would potentially cause a large performance hit, since every file
timestamp would require a possibly expensive read of the clocksource.

thanks
-john
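
To make the granularity difference above concrete, here is a minimal userspace sketch that reads the tick-granular clock and then the clocksource-granular clock and reports how far the coarse value lags. The helper names ts_to_ns() and coarse_lag_ns() are assumptions for illustration; only clock_gettime() and the two clock ids are real interfaces, and CLOCK_REALTIME_COARSE is Linux-specific.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Flatten a timespec into a single nanosecond count. */
static long long ts_to_ns(const struct timespec *ts)
{
	return (long long)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Hypothetical helper: read the coarse (tick-granular) clock first,
 * then the fine (clocksource-granular) clock, and store
 * fine - coarse in *lag_ns.
 *
 * Returns 0 on success, -1 if either clock_gettime() call fails.
 */
static int coarse_lag_ns(long long *lag_ns)
{
	struct timespec coarse, fine;

	assert(lag_ns != NULL);
	if (clock_gettime(CLOCK_REALTIME_COARSE, &coarse) != 0)
		return -1;
	if (clock_gettime(CLOCK_REALTIME, &fine) != 0)
		return -1;

	*lag_ns = ts_to_ns(&fine) - ts_to_ns(&coarse);
	return 0;
}
```

Unless the clock is stepped between the two reads, the lag should be non-negative and smaller than roughly one tick. Reading the two clocks in the opposite order can produce an apparently backwards result, which is the interleaving problem described above.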

2010-07-15 01:51:17

by KOSAKI Motohiro

Subject: Re: [PATCH 01/11] x86: Fix vtime/file timestamp inconsistencies

> On Wed, 2010-07-14 at 11:40 +0900, KOSAKI Motohiro wrote:
> > Hi
> >
> > > Due to vtime calling vgettimeofday(), its possible that an application
> > > could call time();create("stuff",O_RDRW); only to see the file's
> > > creation timestamp to be before the value returned by time.
> >
> > Just dumb question.
> >
> > Almost application are using gettimeofday() instead time(). It mean
> > your fix don't solve almost application.
>
> Correct, filesystem timestamps and gettimeofday can still seem
> inconsistently ordered. But that is expected.
>
> Because of granularity differences (one interface is only tick
> resolution, the other is clocksource resolution), we can't interleave
> the two interfaces (time and gettimeofday, respectively) and expect to
> get ordered results.

Hmm...
Yes, time() vs. gettimeofday() makes no sense; nobody wants this. But
I don't understand why we can ignore gettimeofday() vs. file timestamps.


> This is why the fix I'm proposing is important: Filesystem timestamps
> have always been tick granular, so when vtime() was made clocksource
> granular (by using vgettime internally) we broke the historic
> expectation that the time() interface could be interleaved with
> filesystem operations.
>
> Side note: For full nanosecond resolution of the tick-granular
> timestamps, check out the clock_gettime(CLOCK_REALTIME_COARSE, ...)
> interface.
>
>
> > So, Why can't we fix vgettimeofday() vs create() inconsistency?
> > This is just question, I don't intend to disagree you.
>
> The only way to make gettimeofday and create consistent is to use
> gettimeofday clocksource resolution timestamps for files. This however
> would potentially cause a large performance hit, since each every file
> timestamp would require a possibly expensive read of the clocksource.

Why is reading the clocksource so slow? Here is the current
implementation of the tsc clocksource ->read method:


static cycle_t read_tsc(struct clocksource *cs)
{
	cycle_t ret = (cycle_t)get_cycles();

	return ret >= clocksource_tsc.cycle_last ?
		ret : clocksource_tsc.cycle_last;
}

That means the difference is little more than one rdtsc.
And now that we have relatime, the problem of overly frequent atime
updates has been solved.

Can you please elaborate on your concern? I don't think I have
understood which case worries you.

Thanks.

2010-07-15 02:47:00

by john stultz

Subject: Re: [PATCH 01/11] x86: Fix vtime/file timestamp inconsistencies

On Thu, 2010-07-15 at 10:51 +0900, KOSAKI Motohiro wrote:
> > On Wed, 2010-07-14 at 11:40 +0900, KOSAKI Motohiro wrote:
> > > Hi
> > >
> > > > Due to vtime calling vgettimeofday(), its possible that an application
> > > > could call time();create("stuff",O_RDRW); only to see the file's
> > > > creation timestamp to be before the value returned by time.
> > >
> > > Just dumb question.
> > >
> > > Almost application are using gettimeofday() instead time(). It mean
> > > your fix don't solve almost application.
> >
> > Correct, filesystem timestamps and gettimeofday can still seem
> > inconsistently ordered. But that is expected.
> >
> > Because of granularity differences (one interface is only tick
> > resolution, the other is clocksource resolution), we can't interleave
> > the two interfaces (time and gettimeofday, respectively) and expect to
> > get ordered results.
>
> hmmm...
> Yes, times() vs gettimeofday() mekes no sense. nobody want this. but
> I don't understand why we can ignore gettimeofday() vs file-tiemstamp.


So, just to be clear, this discussion is really around the question of
"Why don't filesystems use clocksource-granular (i.e. getnstimeofday())
timestamps instead of tick-granular (i.e. current_kernel_time())
timestamps?"

However, this is *not* what the patch that started this thread was
about. In the patch I'm simply fixing an inconsistency in the vtime
interface, where it does not align with what the syscall-time interface
provides.

The issue was noticed via inconsistencies with filesystem timestamps,
but the patch does not change anything to do with filesystem timestamp
behavior.


> > This is why the fix I'm proposing is important: Filesystem timestamps
> > have always been tick granular, so when vtime() was made clocksource
> > granular (by using vgettime internally) we broke the historic
> > expectation that the time() interface could be interleaved with
> > filesystem operations.
> >
> > Side note: For full nanosecond resolution of the tick-granular
> > timestamps, check out the clock_gettime(CLOCK_REALTIME_COARSE, ...)
> > interface.
> >
> >
> > > So, Why can't we fix vgettimeofday() vs create() inconsistency?
> > > This is just question, I don't intend to disagree you.
> >
> > The only way to make gettimeofday and create consistent is to use
> > gettimeofday clocksource resolution timestamps for files. This however
> > would potentially cause a large performance hit, since each every file
> > timestamp would require a possibly expensive read of the clocksource.
>
> Why clocksource() reading is so slow? the implementation of current
> tsc clocksource ->read method is here.
>
>
> static cycle_t read_tsc(struct clocksource *cs)
> {
> cycle_t ret = (cycle_t)get_cycles();
>
> return ret >= clocksource_tsc.cycle_last ?
> ret : clocksource_tsc.cycle_last;
> }
>
> It mean, the difference is almost only one rdtsc.

Sure, for hardware that can use the TSC clocksource, it is fairly cheap.
However, there are numerous systems that cannot use the TSC (or
architectures that don't have a fast TSC-like counter), and in those
cases a read can take more than a microsecond.

Even with the TSC, the multiplication required to convert to nanoseconds
adds extra overhead that isn't seen when using the pre-calculated
tick-granular current_kernel_time() value.

It may not seem like much, but with filesystems each small delay adds
up.

I'm not a filesystems guy, and maybe there are some filesystems that
really want very fine-grained timestamps. If so, they can consider
switching from using current_kernel_time() to getnstimeofday(). But due
to the likely performance impact, it's not something I'd suggest doing.

thanks
-john
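
The multiplication mentioned above is the mult/shift scaling that every clocksource read goes through. A minimal userspace sketch follows; it is modeled on the rounding used by the kernel's hz-to-mult helpers but is not the kernel code, and the names hz_to_mult() and cycles_to_ns() are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: pick mult so that
 *   ns = (cycles * mult) >> shift
 * for a counter running at hz cycles per second.
 * hz/2 is added before dividing so the result is rounded, not truncated.
 */
static uint32_t hz_to_mult(uint32_t hz, uint32_t shift)
{
	uint64_t tmp;

	assert(hz != 0 && shift < 32);
	tmp = NSEC_PER_SEC << shift;
	tmp += hz / 2;
	return (uint32_t)(tmp / hz);
}

/*
 * Convert a cycle delta to nanoseconds. The product can overflow
 * 64 bits for large deltas, which is one reason the time between
 * updates has to be bounded.
 */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}
```

For a 1 MHz counter and a shift of 10, hz_to_mult() yields 1024000, so a delta of 5 cycles converts to 5000 ns. The tick-granular current_kernel_time() path skips both the hardware read and this scaling step, since it returns a value already computed at the last tick.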

2010-07-15 02:51:43

by john stultz

Subject: Re: [PATCH 01/11] x86: Fix vtime/file timestamp inconsistencies

On Thu, 2010-07-15 at 10:51 +0900, KOSAKI Motohiro wrote:
> > On Wed, 2010-07-14 at 11:40 +0900, KOSAKI Motohiro wrote:
> > > Hi
> > >
> > > > Due to vtime calling vgettimeofday(), its possible that an application
> > > > could call time();create("stuff",O_RDRW); only to see the file's
> > > > creation timestamp to be before the value returned by time.
> > >
> > > Just dumb question.
> > >
> > > Almost application are using gettimeofday() instead time(). It mean
> > > your fix don't solve almost application.
> >
> > Correct, filesystem timestamps and gettimeofday can still seem
> > inconsistently ordered. But that is expected.
> >
> > Because of granularity differences (one interface is only tick
> > resolution, the other is clocksource resolution), we can't interleave
> > the two interfaces (time and gettimeofday, respectively) and expect to
> > get ordered results.
>
> hmmm...
> Yes, times() vs gettimeofday() mekes no sense. nobody want this. but
> I don't understand why we can ignore gettimeofday() vs file-tiemstamp.

Oh, and another bit worth mentioning again:
clock_gettime(CLOCK_REALTIME_COARSE, ...) provides tick-granular output
that should be able to be correctly interleaved with filesystem
timestamps.

So if there's an application that is using gettimeofday() for logging
and having problems trying to map the log timestamps to filesystem
timestamps, it can use clock_gettime(CLOCK_REALTIME_COARSE, ...) to do
so correctly.

thanks
-john
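
An application could check that ordering with something like the sketch below: take a coarse timestamp, create a file, and compare the timestamp against the file's mtime. The helper names, the /tmp path template, and the assumption that the filesystem stores nanosecond timestamps are all illustrative; on a filesystem that truncates timestamps to whole seconds the nanosecond comparison can fail even though the seconds agree.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Returns 1 if a is not later than b, 0 otherwise. */
static int ts_le(const struct timespec *a, const struct timespec *b)
{
	if (a->tv_sec != b->tv_sec)
		return a->tv_sec < b->tv_sec;
	return a->tv_nsec <= b->tv_nsec;
}

/*
 * Hypothetical check: read CLOCK_REALTIME_COARSE, then create a
 * temporary file and compare the earlier reading with its mtime.
 *
 * Returns 1 if the coarse reading is not later than the mtime,
 * 0 if the file appears older than the reading, -1 on any error.
 */
static int coarse_then_create_is_ordered(void)
{
	char path[] = "/tmp/coarse-demo-XXXXXX";	/* assumed writable */
	struct timespec before;
	struct stat st;
	int fd, ret;

	if (clock_gettime(CLOCK_REALTIME_COARSE, &before) != 0)
		return -1;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;

	ret = fstat(fd, &st);
	close(fd);
	unlink(path);
	if (ret != 0)
		return -1;

	return ts_le(&before, &st.st_mtim);
}
```

Swapping CLOCK_REALTIME_COARSE for CLOCK_REALTIME in this sketch is the gettimeofday()-style case discussed in this thread, where the file can appear to be older than the timestamp taken just before it was created.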

2010-07-15 04:42:14

by KOSAKI Motohiro

Subject: Re: [PATCH 01/11] x86: Fix vtime/file timestamp inconsistencies

> On Thu, 2010-07-15 at 10:51 +0900, KOSAKI Motohiro wrote:
> > > On Wed, 2010-07-14 at 11:40 +0900, KOSAKI Motohiro wrote:
> > > > Hi
> > > >
> > > > > Due to vtime calling vgettimeofday(), its possible that an application
> > > > > could call time();create("stuff",O_RDRW); only to see the file's
> > > > > creation timestamp to be before the value returned by time.
> > > >
> > > > Just dumb question.
> > > >
> > > > Almost application are using gettimeofday() instead time(). It mean
> > > > your fix don't solve almost application.
> > >
> > > Correct, filesystem timestamps and gettimeofday can still seem
> > > inconsistently ordered. But that is expected.
> > >
> > > Because of granularity differences (one interface is only tick
> > > resolution, the other is clocksource resolution), we can't interleave
> > > the two interfaces (time and gettimeofday, respectively) and expect to
> > > get ordered results.
> >
> > hmmm...
> > Yes, times() vs gettimeofday() mekes no sense. nobody want this. but
> > I don't understand why we can ignore gettimeofday() vs file-tiemstamp.
>
>
> So, just to be clear, this discussion is really around the question of
> "Why don't filesystems use a clocksource-granular (ie: getnstimeofday())
> timestamps instead of tick-granular (ie current_kernel_time())
> timestamps."
>
> However, this is *not* what the patch that started this thread was
> about. In the patch I'm simply fixing an inconsistency in the vtime
> interface, where it does not align with what the syscall-time interface
> provides.
>
> The issue was noticed via inconsistencies with filesystem timestamps,
> but the patch does not change anything to do with filesystem timestamp
> behavior.

Ah, I see. This patch is unrelated to filesystem timestamps. It fixes an
inconsistency between the vsyscall and the syscall.

I agree that it should be fixed. So yes, the other parts of my mail are a bit off-topic.


> > > This is why the fix I'm proposing is important: Filesystem timestamps
> > > have always been tick granular, so when vtime() was made clocksource
> > > granular (by using vgettime internally) we broke the historic
> > > expectation that the time() interface could be interleaved with
> > > filesystem operations.
> > >
> > > Side note: For full nanosecond resolution of the tick-granular
> > > timestamps, check out the clock_gettime(CLOCK_REALTIME_COARSE, ...)
> > > interface.
> > >
> > >
> > > > So, Why can't we fix vgettimeofday() vs create() inconsistency?
> > > > This is just question, I don't intend to disagree you.
> > >
> > > The only way to make gettimeofday and create consistent is to use
> > > gettimeofday clocksource resolution timestamps for files. This however
> > > would potentially cause a large performance hit, since each every file
> > > timestamp would require a possibly expensive read of the clocksource.
> >
> > Why clocksource() reading is so slow? the implementation of current
> > tsc clocksource ->read method is here.
> >
> >
> > static cycle_t read_tsc(struct clocksource *cs)
> > {
> > cycle_t ret = (cycle_t)get_cycles();
> >
> > return ret >= clocksource_tsc.cycle_last ?
> > ret : clocksource_tsc.cycle_last;
> > }
> >
> > It mean, the difference is almost only one rdtsc.
>
> Sure, for hardware that can use the TSC clocksource, it is fairly cheap,
> however there are numerous systems that cannot use the TSC (or
> architectures that don't have a fast TSC like counter) and in those
> cases a read can take more then a microsecond.

I'm not a timekeeping expert, but my first impression is that if clocksource->read
needs more than a microsecond, it's really problematic. The ->read of such a clocksource
should always return 0 instead of honestly reading the h/w counter.

>
> Even with the TSC, the multiplication required to convert to nanoseconds
> adds extra overhead that isn't seen when using the pre-calculated
> tick-granular current_kernel_time() value.
>
> It may not seem like much, but with filesystems each small delay adds
> up.
>
> I'm not a filesystems guy, and maybe there are some filesystems that
> really want very fine-grained timestamps. If so they can consider
> switching from using current_kernel_time() to getnstimeofday(). But due
> to the likely performance impact, its not something I'd suggest doing.

Again, I'm not against you. I would only like to hear what you propose, because
I'm not sure a coarse-granularity time() vsyscall really makes userland happy:
(again) as far as I know, most applications don't use time().

So I worry that a bigger issue remains.



2010-07-15 04:42:06

by KOSAKI Motohiro

Subject: Re: [PATCH 01/11] x86: Fix vtime/file timestamp inconsistencies

> On Thu, 2010-07-15 at 10:51 +0900, KOSAKI Motohiro wrote:
> > > On Wed, 2010-07-14 at 11:40 +0900, KOSAKI Motohiro wrote:
> > > > Hi
> > > >
> > > > > Due to vtime calling vgettimeofday(), its possible that an application
> > > > > could call time();create("stuff",O_RDRW); only to see the file's
> > > > > creation timestamp to be before the value returned by time.
> > > >
> > > > Just dumb question.
> > > >
> > > > Almost application are using gettimeofday() instead time(). It mean
> > > > your fix don't solve almost application.
> > >
> > > Correct, filesystem timestamps and gettimeofday can still seem
> > > inconsistently ordered. But that is expected.
> > >
> > > Because of granularity differences (one interface is only tick
> > > resolution, the other is clocksource resolution), we can't interleave
> > > the two interfaces (time and gettimeofday, respectively) and expect to
> > > get ordered results.
> >
> > hmmm...
> > Yes, times() vs gettimeofday() mekes no sense. nobody want this. but
> > I don't understand why we can ignore gettimeofday() vs file-tiemstamp.
>
> Oh.. and another bit worth mentioning again:
> clock_gettime(CLOCK_REALTIME_COARSE, ...) provides tick-granular output
> that should be able to be correctly interleaved with filesystem
> timestmaps.
>
> So if there's an application that is using gettimeofday() for logging
> and having problems trying to map the log timestmaps with filesystem
> timestamps, they can use clock_gettime(CLOCK_REALTIME_COARSE,...) to do
> so correctly.

Correct, but I disagree a little bit. 1) Applications naturally assume that time does
not appear out of order when interfaces are interleaved, so almost no application takes
such care. 2) The tick-granular fs timestamp is only the current implementation; perhaps
we will change it later. So applications do not want to assume that the fs timestamp
granularity is equal to CLOCK_REALTIME_COARSE.
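
For anyone who wants to try the interface discussed above, here is a minimal
userspace sketch (not from the thread) that reads CLOCK_REALTIME_COARSE and
then CLOCK_REALTIME. The helper names ts_cmp() and read_coarse_then_fine() are
made up for illustration, and the sketch assumes a Linux system whose libc
exposes CLOCK_REALTIME_COARSE.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* assumption: needed for CLOCK_REALTIME_COARSE under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: <0, 0 or >0 as a is before, equal to or after b. */
static int ts_cmp(const struct timespec *a, const struct timespec *b)
{
	if (a->tv_sec != b->tv_sec)
		return a->tv_sec < b->tv_sec ? -1 : 1;
	if (a->tv_nsec != b->tv_nsec)
		return a->tv_nsec < b->tv_nsec ? -1 : 1;
	return 0;
}

/*
 * Hypothetical helper: read the tick-granular clock first, then the
 * clocksource-granular one. Returns 0 on success, -1 on failure.
 */
static int read_coarse_then_fine(struct timespec *coarse, struct timespec *fine)
{
	assert(coarse != NULL && fine != NULL);

	if (clock_gettime(CLOCK_REALTIME_COARSE, coarse) != 0)
		return -1;
	if (clock_gettime(CLOCK_REALTIME, fine) != 0)
		return -1;
	return 0;
}
```

A logger that needs to line its entries up with file timestamps could take the
coarse value and compare it with st_mtime using something like ts_cmp(). As
noted above, that only holds as long as filesystems keep using tick-granular
timestamps.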


2010-07-15 19:30:38

by john stultz

Subject: Re: [PATCH 01/11] x86: Fix vtime/file timestamp inconsistencies

On Thu, 2010-07-15 at 13:41 +0900, KOSAKI Motohiro wrote:
> > On Thu, 2010-07-15 at 10:51 +0900, KOSAKI Motohiro wrote:
> > > > On Wed, 2010-07-14 at 11:40 +0900, KOSAKI Motohiro wrote:
> > > > > Hi
> > > > >
> > > > > > Due to vtime calling vgettimeofday(), its possible that an application
> > > > > > could call time();create("stuff",O_RDRW); only to see the file's
> > > > > > creation timestamp to be before the value returned by time.
> > > > >
> > > > > Just dumb question.
> > > > >
> > > > > Almost application are using gettimeofday() instead time(). It mean
> > > > > your fix don't solve almost application.
> > > >
> > > > Correct, filesystem timestamps and gettimeofday can still seem
> > > > inconsistently ordered. But that is expected.
> > > >
> > > > Because of granularity differences (one interface is only tick
> > > > resolution, the other is clocksource resolution), we can't interleave
> > > > the two interfaces (time and gettimeofday, respectively) and expect to
> > > > get ordered results.
> > >
> > > hmmm...
> > > Yes, times() vs gettimeofday() mekes no sense. nobody want this. but
> > > I don't understand why we can ignore gettimeofday() vs file-tiemstamp.
> >
> >
> > So, just to be clear, this discussion is really around the question of
> > "Why don't filesystems use a clocksource-granular (ie: getnstimeofday())
> > timestamps instead of tick-granular (ie current_kernel_time())
> > timestamps."
> >
> > However, this is *not* what the patch that started this thread was
> > about. In the patch I'm simply fixing an inconsistency in the vtime
> > interface, where it does not align with what the syscall-time interface
> > provides.
> >
> > The issue was noticed via inconsistencies with filesystem timestamps,
> > but the patch does not change anything to do with filesystem timestamp
> > behavior.
>
> Ah, I see. This patch is unrelated to filesystem timestamp. It fix inconsistency
> vsyscall with syscall.
>
> I agree that it should be fixed. So yes, other parts of my mail is a bit offtopic.
>
>
> > > > This is why the fix I'm proposing is important: Filesystem timestamps
> > > > have always been tick granular, so when vtime() was made clocksource
> > > > granular (by using vgettime internally) we broke the historic
> > > > expectation that the time() interface could be interleaved with
> > > > filesystem operations.
> > > >
> > > > Side note: For full nanosecond resolution of the tick-granular
> > > > timestamps, check out the clock_gettime(CLOCK_REALTIME_COARSE, ...)
> > > > interface.
> > > >
> > > >
> > > > > So, Why can't we fix vgettimeofday() vs create() inconsistency?
> > > > > This is just question, I don't intend to disagree you.
> > > >
> > > > The only way to make gettimeofday and create consistent is to use
> > > > gettimeofday clocksource resolution timestamps for files. This however
> > > > would potentially cause a large performance hit, since each every file
> > > > timestamp would require a possibly expensive read of the clocksource.
> > >
> > > Why clocksource() reading is so slow? the implementation of current
> > > tsc clocksource ->read method is here.
> > >
> > >
> > > static cycle_t read_tsc(struct clocksource *cs)
> > > {
> > > cycle_t ret = (cycle_t)get_cycles();
> > >
> > > return ret >= clocksource_tsc.cycle_last ?
> > > ret : clocksource_tsc.cycle_last;
> > > }
> > >
> > > It mean, the difference is almost only one rdtsc.
> >
> > Sure, for hardware that can use the TSC clocksource, it is fairly cheap,
> > however there are numerous systems that cannot use the TSC (or
> > architectures that don't have a fast TSC like counter) and in those
> > cases a read can take more then a microsecond.
>
> I'm not timekeeping expert. but my first impression is, if clocksource->read
> need more than a microsecond, it's really problematic. ->read of such clocksource
> should always return 0 instead honestly reading h/w counter.

Sadly there is quite a lot of x86 hardware that cannot use the TSC. So
the only alternatives are the HPET (~0.8us) or ACPI PM (~1.2us).

If the clocksource->read() function returned 0 on those systems, then
gettimeofday would return only tick-granular time (again, which is what
CLOCK_REALTIME_COARSE already provides).

That said, Ingo had an optimization patch to do something quite similar,
giving up resolution for speed. And now that the CLOCK_REALTIME_COARSE code
is there, it might be even easier to implement, but it's not something we
can enable by default, as inter-tick resolution is needed in many cases.

And yes, ideally every system would have a fast TSC like counter that
was accurate and reliable, and this would be less of an issue, but we
have to work with the hardware that is out there.


> > Even with the TSC, the multiplication required to convert to nanoseconds
> > adds extra overhead that isn't seen when using the pre-calculated
> > tick-granular current_kernel_time() value.
> >
> > It may not seem like much, but with filesystems each small delay adds
> > up.
> >
> > I'm not a filesystems guy, and maybe there are some filesystems that
> > really want very fine-grained timestamps. If so they can consider
> > switching from using current_kernel_time() to getnstimeofday(). But due
> > to the likely performance impact, its not something I'd suggest doing.
>
> Again, I'm not against you. I only would like to hear what you propose. because
> I'm not sure rough granularity time() vsyscall really makes userland happy.
> because (again) as far as iknow, alomsot applications don't use time().

I assume the developers who implemented the filesystems have considered
this trade-off and made a choice, so I honestly don't have much to
propose here. :)

I think if you feel strongly that filesystems should use
clocksource-granular instead of tick-granular timestamps, you might try
to bring it up on the ext4 devel list or even generate a patch and try it
out yourself (I've provided a trivial starting point for you below - but
it's likely a real solution will be a bit more complex).

Good luck!

thanks
-john


diff --git a/kernel/time.c b/kernel/time.c
index 848b1c2..ce10dae 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -227,7 +227,8 @@ SYSCALL_DEFINE1(adjtimex, struct timex __user *, txc_p)
*/
struct timespec current_fs_time(struct super_block *sb)
{
- struct timespec now = current_kernel_time();
+ struct timespec now;
+ getnstimeofday(&now);
return timespec_trunc(now, sb->s_time_gran);
}
EXPORT_SYMBOL(current_fs_time);
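
Whichever clock feeds current_fs_time(), the result is still truncated to the
filesystem's timestamp granularity. The following userspace sketch (not from
the thread) shows that truncation step. trunc_to_gran() is a hypothetical
stand-in for timespec_trunc(), and gran_ns plays the role of sb->s_time_gran,
assumed here to lie between 1 and NSEC_PER_SEC.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical userspace stand-in for the truncation applied by
 * current_fs_time(): round tv_nsec down to a multiple of gran_ns.
 */
static struct timespec trunc_to_gran(struct timespec t, long gran_ns)
{
	assert(gran_ns >= 1 && gran_ns <= NSEC_PER_SEC);

	/* gran_ns == NSEC_PER_SEC clears tv_nsec; gran_ns == 1 is a no-op. */
	t.tv_nsec -= t.tv_nsec % gran_ns;
	return t;
}
```

So a filesystem with one-second granularity would pay for the clocksource read
under the change above without storing any of the extra resolution.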




2010-07-27 10:46:10

by john stultz

Subject: [tip:timers/clocksource] x86: Fix vtime/file timestamp inconsistencies

Commit-ID: 8c73626ab28527b7eb7f3061c027fbfe530c488c
Gitweb: http://git.kernel.org/tip/8c73626ab28527b7eb7f3061c027fbfe530c488c
Author: John Stultz <[email protected]>
AuthorDate: Tue, 13 Jul 2010 17:56:18 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Tue, 27 Jul 2010 12:40:53 +0200

x86: Fix vtime/file timestamp inconsistencies

Due to vtime calling vgettimeofday(), it's possible that an application
could call time();create("stuff",O_RDRW); only to see the file's
creation timestamp to be before the value returned by time.

A similar way to reproduce the issue is to compare the vsyscall time()
with the syscall time(), and observe ordering issues.

The modified test case from Oleg Nesterov below can illustrate this:

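#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
	time_t sec1, sec2;

	/* Spin until the vsyscall time() runs ahead of the syscall time(). */
	do {
		sec1 = time(&sec2);
		sec2 = syscall(__NR_time, NULL);
	} while (sec1 <= sec2);

	printf("vtime: %ld.000000\n", (long)sec1);
	printf("time: %ld.000000\n", (long)sec2);
	return 0;
}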

The proper fix is to make vtime use the same time value as
current_kernel_time() (which is exported via update_vsyscall) instead of
vgettime().

Thanks to Jiri Olsa for bringing up the issue and catching bugs in
earlier versions of this fix.

Signed-off-by: John Stultz <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Oleg Nesterov <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/vsyscall_64.c | 11 ++++++++---
1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
index 1c0c6ab..dce0c3c 100644
--- a/arch/x86/kernel/vsyscall_64.c
+++ b/arch/x86/kernel/vsyscall_64.c
@@ -169,13 +169,18 @@ int __vsyscall(0) vgettimeofday(struct timeval * tv, struct timezone * tz)
* unlikely */
time_t __vsyscall(1) vtime(time_t *t)
{
- struct timeval tv;
+ unsigned seq;
time_t result;
if (unlikely(!__vsyscall_gtod_data.sysctl_enabled))
return time_syscall(t);

- vgettimeofday(&tv, NULL);
- result = tv.tv_sec;
+ do {
+ seq = read_seqbegin(&__vsyscall_gtod_data.lock);
+
+ result = __vsyscall_gtod_data.wall_time_sec;
+
+ } while (read_seqretry(&__vsyscall_gtod_data.lock, seq));
+
if (t)
*t = result;
return result;
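
The retry loop added to vtime() above is the usual seqlock read pattern: sample
the sequence, read the data, and retry if a writer was active in between. Below
is a userspace sketch of that pattern (not from the patch) using C11 atomics.
The names gtod_sketch, gtod_write() and gtod_read() are hypothetical, a single
writer is assumed, and the kernel itself uses its own seqlock primitives rather
than C11 atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the data published via update_vsyscall(). */
struct gtod_sketch {
	atomic_uint seq;           /* odd while an update is in progress */
	_Atomic long wall_time_sec;
};

/* Single-writer update: make seq odd, store the data, make seq even. */
static void gtod_write(struct gtod_sketch *g, long sec)
{
	unsigned s;

	assert(g != NULL);
	s = atomic_load_explicit(&g->seq, memory_order_relaxed);
	atomic_store_explicit(&g->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&g->wall_time_sec, sec, memory_order_relaxed);
	atomic_store_explicit(&g->seq, s + 2, memory_order_release);
}

/* Reader: retry while a write is in progress or seq changed underneath us. */
static long gtod_read(struct gtod_sketch *g)
{
	unsigned s;
	long sec;

	assert(g != NULL);
	do {
		s = atomic_load_explicit(&g->seq, memory_order_acquire);
		sec = atomic_load_explicit(&g->wall_time_sec, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
	} while ((s & 1) ||
		 atomic_load_explicit(&g->seq, memory_order_relaxed) != s);

	return sec;
}
```

The reader never blocks the writer, which is why this pattern suits data that
is updated from the timer tick and read from userspace.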

2010-07-27 10:46:29

by john stultz

Subject: [tip:timers/clocksource] time: Implement timespec_add

Commit-ID: ce3bf7ab22527183634a76512d9854a38615e4d5
Gitweb: http://git.kernel.org/tip/ce3bf7ab22527183634a76512d9854a38615e4d5
Author: John Stultz <[email protected]>
AuthorDate: Tue, 13 Jul 2010 17:56:19 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Tue, 27 Jul 2010 12:40:53 +0200

time: Implement timespec_add

After accidentally misusing timespec_add_safe, I wanted to make sure
we don't accidentally trip over that issue again, so I created a simple
timespec_add() function which we can use to replace the instances
of timespec_add_safe() that don't want the overflow detection.

Signed-off-by: John Stultz <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>

---
include/linux/time.h | 16 ++++++++++++++++
kernel/time/timekeeping.c | 6 +++---
2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/include/linux/time.h b/include/linux/time.h
index ea3559f0..9072df8 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -76,9 +76,25 @@ extern unsigned long mktime(const unsigned int year, const unsigned int mon,
const unsigned int min, const unsigned int sec);

extern void set_normalized_timespec(struct timespec *ts, time_t sec, s64 nsec);
+
+/*
+ * timespec_add_safe assumes both values are positive and checks
+ * for overflow. It will return TIME_T_MAX if the reutrn would be
+ * smaller then either of the arguments.
+ */
extern struct timespec timespec_add_safe(const struct timespec lhs,
const struct timespec rhs);

+
+static inline struct timespec timespec_add(struct timespec lhs,
+ struct timespec rhs)
+{
+ struct timespec ts_delta;
+ set_normalized_timespec(&ts_delta, lhs.tv_sec + rhs.tv_sec,
+ lhs.tv_nsec + rhs.tv_nsec);
+ return ts_delta;
+}
+
/*
* sub = lhs - rhs, in normalized form
*/
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index caf8d4d..623fe3d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -579,9 +579,9 @@ static int timekeeping_resume(struct sys_device *dev)

if (timespec_compare(&ts, &timekeeping_suspend_time) > 0) {
ts = timespec_sub(ts, timekeeping_suspend_time);
- xtime = timespec_add_safe(xtime, ts);
+ xtime = timespec_add(xtime, ts);
wall_to_monotonic = timespec_sub(wall_to_monotonic, ts);
- total_sleep_time = timespec_add_safe(total_sleep_time, ts);
+ total_sleep_time = timespec_add(total_sleep_time, ts);
}
/* re-base the last cycle value */
timekeeper.clock->cycle_last = timekeeper.clock->read(timekeeper.clock);
@@ -887,7 +887,7 @@ EXPORT_SYMBOL_GPL(getboottime);
*/
void monotonic_to_bootbased(struct timespec *ts)
{
- *ts = timespec_add_safe(*ts, total_sleep_time);
+ *ts = timespec_add(*ts, total_sleep_time);
}
EXPORT_SYMBOL_GPL(monotonic_to_bootbased);
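
To make the behaviour of the new helper concrete, here is a userspace sketch
(not from the patch) of an add-and-normalize operation in the same spirit as
timespec_add(). ts_normalize() and ts_add() are hypothetical names; unlike
timespec_add_safe(), there is no overflow check, and the sketch assumes the sum
of the seconds fields fits in time_t.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper: fold nsec into the range [0, NSEC_PER_SEC). */
static struct timespec ts_normalize(long long sec, long long nsec)
{
	struct timespec ts;

	while (nsec >= NSEC_PER_SEC) {
		nsec -= NSEC_PER_SEC;
		sec++;
	}
	while (nsec < 0) {
		nsec += NSEC_PER_SEC;
		sec--;
	}
	assert(nsec >= 0 && nsec < NSEC_PER_SEC);

	ts.tv_sec = (time_t)sec;
	ts.tv_nsec = (long)nsec;
	return ts;
}

/* Hypothetical helper: lhs + rhs in normalized form, no overflow detection. */
static struct timespec ts_add(struct timespec lhs, struct timespec rhs)
{
	return ts_normalize((long long)lhs.tv_sec + rhs.tv_sec,
			    (long long)lhs.tv_nsec + rhs.tv_nsec);
}
```

For example, adding 1.9s and 2.2s carries one second out of the nanoseconds
field and yields 4.1s.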

2010-07-27 10:46:52

by john stultz

Subject: [tip:timers/clocksource] time: Kill off CONFIG_GENERIC_TIME

Commit-ID: 592913ecb87a9e06f98ddb55b298f1a66bf94c6b
Gitweb: http://git.kernel.org/tip/592913ecb87a9e06f98ddb55b298f1a66bf94c6b
Author: John Stultz <[email protected]>
AuthorDate: Tue, 13 Jul 2010 17:56:20 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Tue, 27 Jul 2010 12:40:54 +0200

time: Kill off CONFIG_GENERIC_TIME

Now that all arches have been converted over to use generic time via
clocksources or arch_gettimeoffset(), we can remove the GENERIC_TIME
config option and simplify the generic code.

Signed-off-by: John Stultz <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
Documentation/kernel-parameters.txt | 3 +-
arch/alpha/Kconfig | 4 --
arch/arm/Kconfig | 4 --
arch/avr32/Kconfig | 3 --
arch/blackfin/Kconfig | 3 --
arch/cris/Kconfig | 3 --
arch/frv/Kconfig | 4 --
arch/h8300/Kconfig | 4 --
arch/ia64/Kconfig | 4 --
arch/m32r/Kconfig | 3 --
arch/m68k/Kconfig | 3 --
arch/m68knommu/Kconfig | 4 --
arch/microblaze/Kconfig | 3 --
arch/mips/Kconfig | 4 --
arch/mn10300/Kconfig | 3 --
arch/parisc/Kconfig | 4 --
arch/powerpc/Kconfig | 3 --
arch/s390/Kconfig | 3 --
arch/score/Kconfig | 3 --
arch/sh/Kconfig | 3 --
arch/sparc/Kconfig | 3 --
arch/um/Kconfig.common | 4 --
arch/x86/Kconfig | 5 +--
arch/xtensa/Kconfig | 3 --
drivers/Makefile | 4 ++-
drivers/acpi/acpi_pad.c | 2 +-
drivers/acpi/processor_idle.c | 2 +-
drivers/misc/Kconfig | 4 +-
kernel/time.c | 16 ----------
kernel/time/Kconfig | 4 +-
kernel/time/clocksource.c | 4 +-
kernel/time/timekeeping.c | 55 ++--------------------------------
kernel/trace/Kconfig | 4 +-
33 files changed, 19 insertions(+), 159 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 2b2407d..8abdfd7 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -73,7 +73,6 @@ parameter is applicable:
MTD MTD (Memory Technology Device) support is enabled.
NET Appropriate network support is enabled.
NUMA NUMA support is enabled.
- GENERIC_TIME The generic timeofday code is enabled.
NFS Appropriate NFS support is enabled.
OSS OSS sound support is enabled.
PV_OPS A paravirtualized kernel is enabled.
@@ -468,7 +467,7 @@ and is between 256 and 4096 characters. It is defined in the file
clocksource is not available, it defaults to PIT.
Format: { pit | tsc | cyclone | pmtmr }

- clocksource= [GENERIC_TIME] Override the default clocksource
+ clocksource= Override the default clocksource
Format: <string>
Override the default clocksource and use the clocksource
with the name specified.
diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
index 3e2e540..b9647bb 100644
--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -47,10 +47,6 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config GENERIC_CMOS_UPDATE
def_bool y

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 98922f7..655b4ae 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -41,10 +41,6 @@ config SYS_SUPPORTS_APM_EMULATION
config GENERIC_GPIO
bool

-config GENERIC_TIME
- bool
- default y
-
config ARCH_USES_GETTIMEOFFSET
bool
default n
diff --git a/arch/avr32/Kconfig b/arch/avr32/Kconfig
index f2b3193..f515727 100644
--- a/arch/avr32/Kconfig
+++ b/arch/avr32/Kconfig
@@ -45,9 +45,6 @@ config GENERIC_IRQ_PROBE
config RWSEM_GENERIC_SPINLOCK
def_bool y

-config GENERIC_TIME
- def_bool y
-
config GENERIC_CLOCKEVENTS
def_bool y

diff --git a/arch/blackfin/Kconfig b/arch/blackfin/Kconfig
index f66294b..c88fd35 100644
--- a/arch/blackfin/Kconfig
+++ b/arch/blackfin/Kconfig
@@ -614,9 +614,6 @@ comment "Kernel Timer/Scheduler"

source kernel/Kconfig.hz

-config GENERIC_TIME
- def_bool y
-
config GENERIC_CLOCKEVENTS
bool "Generic clock events"
default y
diff --git a/arch/cris/Kconfig b/arch/cris/Kconfig
index e25bf44..887ef85 100644
--- a/arch/cris/Kconfig
+++ b/arch/cris/Kconfig
@@ -20,9 +20,6 @@ config RWSEM_GENERIC_SPINLOCK
config RWSEM_XCHGADD_ALGORITHM
bool

-config GENERIC_TIME
- def_bool y
-
config GENERIC_CMOS_UPDATE
def_bool y

diff --git a/arch/frv/Kconfig b/arch/frv/Kconfig
index 4b5830b..16399bd 100644
--- a/arch/frv/Kconfig
+++ b/arch/frv/Kconfig
@@ -40,10 +40,6 @@ config GENERIC_HARDIRQS_NO__DO_IRQ
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config TIME_LOW_RES
bool
default y
diff --git a/arch/h8300/Kconfig b/arch/h8300/Kconfig
index 53cc669..988b6ff 100644
--- a/arch/h8300/Kconfig
+++ b/arch/h8300/Kconfig
@@ -62,10 +62,6 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config GENERIC_BUG
bool
depends on BUG
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 9561082..8711d13 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -82,10 +82,6 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config GENERIC_TIME_VSYSCALL
bool
default y
diff --git a/arch/m32r/Kconfig b/arch/m32r/Kconfig
index 3a9319f..836abbb 100644
--- a/arch/m32r/Kconfig
+++ b/arch/m32r/Kconfig
@@ -44,9 +44,6 @@ config HZ
int
default 100

-config GENERIC_TIME
- def_bool y
-
config ARCH_USES_GETTIMEOFFSET
def_bool y

diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 2e3737b..8030e24 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -59,9 +59,6 @@ config HZ
int
default 100

-config GENERIC_TIME
- def_bool y
-
config ARCH_USES_GETTIMEOFFSET
def_bool y

diff --git a/arch/m68knommu/Kconfig b/arch/m68knommu/Kconfig
index efeb603..2609c39 100644
--- a/arch/m68knommu/Kconfig
+++ b/arch/m68knommu/Kconfig
@@ -63,10 +63,6 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config GENERIC_CMOS_UPDATE
bool
default y
diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig
index 505a085..14f03ce 100644
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -48,9 +48,6 @@ config GENERIC_IRQ_PROBE
config GENERIC_CALIBRATE_DELAY
def_bool y

-config GENERIC_TIME
- def_bool y
-
config GENERIC_TIME_VSYSCALL
def_bool n

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index cdaae94..01c44cb 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -733,10 +733,6 @@ config GENERIC_CLOCKEVENTS
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config GENERIC_CMOS_UPDATE
bool
default y
diff --git a/arch/mn10300/Kconfig b/arch/mn10300/Kconfig
index 1c4565a..444b9f9 100644
--- a/arch/mn10300/Kconfig
+++ b/arch/mn10300/Kconfig
@@ -46,9 +46,6 @@ config GENERIC_FIND_NEXT_BIT
config GENERIC_HWEIGHT
def_bool y

-config GENERIC_TIME
- def_bool y
-
config GENERIC_BUG
def_bool y

diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 05a366a..907417d 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -66,10 +66,6 @@ config GENERIC_CALIBRATE_DELAY
bool
default y

-config GENERIC_TIME
- bool
- default y
-
config TIME_LOW_RES
bool
depends on SMP
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 2031a28..25e6bf4 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -29,9 +29,6 @@ config MMU
config GENERIC_CMOS_UPDATE
def_bool y

-config GENERIC_TIME
- def_bool y
-
config GENERIC_TIME_VSYSCALL
def_bool y

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index bee1c0f..f0777a4 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -40,9 +40,6 @@ config ARCH_HAS_ILOG2_U64
config GENERIC_HWEIGHT
def_bool y

-config GENERIC_TIME
- def_bool y
-
config GENERIC_TIME_VSYSCALL
def_bool y

diff --git a/arch/score/Kconfig b/arch/score/Kconfig
index 55d413e..be4a155 100644
--- a/arch/score/Kconfig
+++ b/arch/score/Kconfig
@@ -55,9 +55,6 @@ config GENERIC_CALIBRATE_DELAY
config GENERIC_CLOCKEVENTS
def_bool y

-config GENERIC_TIME
- def_bool y
-
config SCHED_NO_NO_OMIT_FRAME_POINTER
def_bool y

diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index 82868fe..33990fa 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -98,9 +98,6 @@ config GENERIC_CALIBRATE_DELAY
config GENERIC_IOMAP
bool

-config GENERIC_TIME
- def_bool y
-
config GENERIC_CLOCKEVENTS
def_bool y

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index c0015db..1cd0d9d 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -66,9 +66,6 @@ config BITS
default 32 if SPARC32
default 64 if SPARC64

-config GENERIC_TIME
- def_bool y
-
config ARCH_USES_GETTIMEOFFSET
bool
default y if SPARC32
diff --git a/arch/um/Kconfig.common b/arch/um/Kconfig.common
index 0d207e7..7c8e277 100644
--- a/arch/um/Kconfig.common
+++ b/arch/um/Kconfig.common
@@ -55,10 +55,6 @@ config GENERIC_BUG
default y
depends on BUG

-config GENERIC_TIME
- bool
- default y
-
config GENERIC_CLOCKEVENTS
bool
default y
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index dcb0593..546b610a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -72,9 +72,6 @@ config ARCH_DEFCONFIG
default "arch/x86/configs/i386_defconfig" if X86_32
default "arch/x86/configs/x86_64_defconfig" if X86_64

-config GENERIC_TIME
- def_bool y
-
config GENERIC_CMOS_UPDATE
def_bool y

@@ -2046,7 +2043,7 @@ config SCx200

config SCx200HR_TIMER
tristate "NatSemi SCx200 27MHz High-Resolution Timer Support"
- depends on SCx200 && GENERIC_TIME
+ depends on SCx200
default y
---help---
This driver provides a clocksource built upon the on-chip
diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig
index ebe228d..0859bfd 100644
--- a/arch/xtensa/Kconfig
+++ b/arch/xtensa/Kconfig
@@ -48,9 +48,6 @@ config HZ
int
default 100

-config GENERIC_TIME
- def_bool y
-
source "init/Kconfig"
source "kernel/Kconfig.freezer"

diff --git a/drivers/Makefile b/drivers/Makefile
index 91874e0..ae47344 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -101,7 +101,9 @@ obj-y += firmware/
obj-$(CONFIG_CRYPTO) += crypto/
obj-$(CONFIG_SUPERH) += sh/
obj-$(CONFIG_ARCH_SHMOBILE) += sh/
-obj-$(CONFIG_GENERIC_TIME) += clocksource/
+ifndef CONFIG_ARCH_USES_GETTIMEOFFSET
+obj-y += clocksource/
+endif
obj-$(CONFIG_DMA_ENGINE) += dma/
obj-$(CONFIG_DCA) += dca/
obj-$(CONFIG_HID) += hid/
diff --git a/drivers/acpi/acpi_pad.c b/drivers/acpi/acpi_pad.c
index 446aced..b76848c 100644
--- a/drivers/acpi/acpi_pad.c
+++ b/drivers/acpi/acpi_pad.c
@@ -77,7 +77,7 @@ static void power_saving_mwait_init(void)
power_saving_mwait_eax = (highest_cstate << MWAIT_SUBSTATE_SIZE) |
(highest_subcstate - 1);

-#if defined(CONFIG_GENERIC_TIME) && defined(CONFIG_X86)
+#if defined(CONFIG_X86)
switch (boot_cpu_data.x86_vendor) {
case X86_VENDOR_AMD:
case X86_VENDOR_INTEL:
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index e9a8026..294e10b 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -264,7 +264,7 @@ int acpi_processor_resume(struct acpi_device * device)
return 0;
}

-#if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86)
+#if defined(CONFIG_X86)
static void tsc_check_state(int state)
{
switch (boot_cpu_data.x86_vendor) {
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 26386a9..5b9ba48 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -72,7 +72,7 @@ config ATMEL_TCLIB

config ATMEL_TCB_CLKSRC
bool "TC Block Clocksource"
- depends on ATMEL_TCLIB && GENERIC_TIME
+ depends on ATMEL_TCLIB
default y
help
Select this to get a high precision clocksource based on a
@@ -240,7 +240,7 @@ config CS5535_MFGPT_DEFAULT_IRQ

config CS5535_CLOCK_EVENT_SRC
tristate "CS5535/CS5536 high-res timer (MFGPT) events"
- depends on GENERIC_TIME && GENERIC_CLOCKEVENTS && CS5535_MFGPT
+ depends on GENERIC_CLOCKEVENTS && CS5535_MFGPT
help
This driver provides a clock event source based on the MFGPT
timer(s) in the CS5535 and CS5536 companion chips.
diff --git a/kernel/time.c b/kernel/time.c
index 848b1c2..ba9b338 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -300,22 +300,6 @@ struct timespec timespec_trunc(struct timespec t, unsigned gran)
}
EXPORT_SYMBOL(timespec_trunc);

-#ifndef CONFIG_GENERIC_TIME
-/*
- * Simulate gettimeofday using do_gettimeofday which only allows a timeval
- * and therefore only yields usec accuracy
- */
-void getnstimeofday(struct timespec *tv)
-{
- struct timeval x;
-
- do_gettimeofday(&x);
- tv->tv_sec = x.tv_sec;
- tv->tv_nsec = x.tv_usec * NSEC_PER_USEC;
-}
-EXPORT_SYMBOL_GPL(getnstimeofday);
-#endif
-
/* Converts Gregorian date to seconds since 1970-01-01 00:00:00.
* Assumes input in normal date format, i.e. 1980-12-31 23:59:59
* => year=1980, mon=12, day=31, hour=23, min=59, sec=59.
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 95ed429..f06a8a3 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -6,7 +6,7 @@ config TICK_ONESHOT

config NO_HZ
bool "Tickless System (Dynamic Ticks)"
- depends on GENERIC_TIME && GENERIC_CLOCKEVENTS
+ depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
select TICK_ONESHOT
help
This option enables a tickless system: timer interrupts will
@@ -15,7 +15,7 @@ config NO_HZ

config HIGH_RES_TIMERS
bool "High Resolution Timer Support"
- depends on GENERIC_TIME && GENERIC_CLOCKEVENTS
+ depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
select TICK_ONESHOT
help
This option enables high resolution timer support. If your
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index f08e99c..c543d21 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -531,7 +531,7 @@ static u64 clocksource_max_deferment(struct clocksource *cs)
return max_nsecs - (max_nsecs >> 5);
}

-#ifdef CONFIG_GENERIC_TIME
+#ifndef CONFIG_ARCH_USES_GETTIMEOFFSET

/**
* clocksource_select - Select the best clocksource available
@@ -577,7 +577,7 @@ static void clocksource_select(void)
}
}

-#else /* CONFIG_GENERIC_TIME */
+#else /* !CONFIG_ARCH_USES_GETTIMEOFFSET */

static inline void clocksource_select(void) { }

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 623fe3d..73edd40 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -173,8 +173,6 @@ void timekeeping_leap_insert(int leapsecond)
update_vsyscall(&xtime, timekeeper.clock, timekeeper.mult);
}

-#ifdef CONFIG_GENERIC_TIME
-
/**
* timekeeping_forward_now - update clock to the current time
*
@@ -376,52 +374,6 @@ void timekeeping_notify(struct clocksource *clock)
tick_clock_notify();
}

-#else /* GENERIC_TIME */
-
-static inline void timekeeping_forward_now(void) { }
-
-/**
- * ktime_get - get the monotonic time in ktime_t format
- *
- * returns the time in ktime_t format
- */
-ktime_t ktime_get(void)
-{
- struct timespec now;
-
- ktime_get_ts(&now);
-
- return timespec_to_ktime(now);
-}
-EXPORT_SYMBOL_GPL(ktime_get);
-
-/**
- * ktime_get_ts - get the monotonic clock in timespec format
- * @ts: pointer to timespec variable
- *
- * The function calculates the monotonic clock from the realtime
- * clock and the wall_to_monotonic offset and stores the result
- * in normalized timespec format in the variable pointed to by @ts.
- */
-void ktime_get_ts(struct timespec *ts)
-{
- struct timespec tomono;
- unsigned long seq;
-
- do {
- seq = read_seqbegin(&xtime_lock);
- getnstimeofday(ts);
- tomono = wall_to_monotonic;
-
- } while (read_seqretry(&xtime_lock, seq));
-
- set_normalized_timespec(ts, ts->tv_sec + tomono.tv_sec,
- ts->tv_nsec + tomono.tv_nsec);
-}
-EXPORT_SYMBOL_GPL(ktime_get_ts);
-
-#endif /* !GENERIC_TIME */
-
/**
* ktime_get_real - get the real (wall-) time in ktime_t format
*
@@ -784,10 +736,11 @@ void update_wall_time(void)
return;

clock = timekeeper.clock;
-#ifdef CONFIG_GENERIC_TIME
- offset = (clock->read(clock) - clock->cycle_last) & clock->mask;
-#else
+
+#ifdef CONFIG_ARCH_USES_GETTIMEOFFSET
offset = timekeeper.cycle_interval;
+#else
+ offset = (clock->read(clock) - clock->cycle_last) & clock->mask;
#endif
timekeeper.xtime_nsec = (s64)xtime.tv_nsec << timekeeper.shift;

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 8b1797c..7531dda 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -153,7 +153,7 @@ config IRQSOFF_TRACER
bool "Interrupts-off Latency Tracer"
default n
depends on TRACE_IRQFLAGS_SUPPORT
- depends on GENERIC_TIME
+ depends on !ARCH_USES_GETTIMEOFFSET
select TRACE_IRQFLAGS
select GENERIC_TRACER
select TRACER_MAX_TRACE
@@ -175,7 +175,7 @@ config IRQSOFF_TRACER
config PREEMPT_TRACER
bool "Preemption-off Latency Tracer"
default n
- depends on GENERIC_TIME
+ depends on !ARCH_USES_GETTIMEOFFSET
depends on PREEMPT
select GENERIC_TRACER
select TRACER_MAX_TRACE
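
For readers following the update_wall_time() hunk above: the non-GETTIMEOFFSET branch computes the elapsed cycles as (now - last) & mask, which stays correct when a counter narrower than 64 bits wraps once between reads. A minimal userspace sketch of that arithmetic follows. The type and function names here are made up for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's cycle_t; not the kernel type. */
typedef uint64_t cycle_t;

/* Build a mask covering the low `bits` bits (1 <= bits <= 64). */
static cycle_t counter_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? ~(cycle_t)0 : (((cycle_t)1 << bits) - 1);
}

/*
 * Cycles elapsed since `last`. The unsigned subtraction wraps modulo
 * 2^64 and the mask folds the result back into the counter's width,
 * so a single wrap of the hardware counter is handled.
 */
static cycle_t cycles_since(cycle_t now, cycle_t last, cycle_t mask)
{
	return (now - last) & mask;
}
```

With a 32-bit counter that read 0xFFFFFFF0 last time and 5 now, cycles_since() reports 21 cycles rather than a huge bogus value. More than one wrap between reads cannot be detected this way.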

2010-07-27 10:47:10

by john stultz

Subject: [tip:timers/clocksource] powerpc: Simplify update_vsyscall

Commit-ID: b0797b60d0067fe437baa97a743c7d9de98fd769
Gitweb: http://git.kernel.org/tip/b0797b60d0067fe437baa97a743c7d9de98fd769
Author: John Stultz <[email protected]>
AuthorDate: Tue, 13 Jul 2010 17:56:21 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Tue, 27 Jul 2010 12:40:54 +0200

powerpc: Simplify update_vsyscall

Currently powerpc's update_vsyscall calls an inline update_gtod.
However, both are straightforward, and there are no other users,
so this patch merges update_gtod into update_vsyscall.

Signed-off-by: John Stultz <[email protected]>
Cc: Anton Blanchard <[email protected]>
Cc: Paul Mackerras <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/powerpc/kernel/time.c | 55 ++++++++++++++++++++------------------------
1 files changed, 25 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 0441bbd..6fcd648 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -423,30 +423,6 @@ void udelay(unsigned long usecs)
}
EXPORT_SYMBOL(udelay);

-static inline void update_gtod(u64 new_tb_stamp, u64 new_stamp_xsec,
- u64 new_tb_to_xs)
-{
- /*
- * tb_update_count is used to allow the userspace gettimeofday code
- * to assure itself that it sees a consistent view of the tb_to_xs and
- * stamp_xsec variables. It reads the tb_update_count, then reads
- * tb_to_xs and stamp_xsec and then reads tb_update_count again. If
- * the two values of tb_update_count match and are even then the
- * tb_to_xs and stamp_xsec values are consistent. If not, then it
- * loops back and reads them again until this criteria is met.
- * We expect the caller to have done the first increment of
- * vdso_data->tb_update_count already.
- */
- vdso_data->tb_orig_stamp = new_tb_stamp;
- vdso_data->stamp_xsec = new_stamp_xsec;
- vdso_data->tb_to_xs = new_tb_to_xs;
- vdso_data->wtom_clock_sec = wall_to_monotonic.tv_sec;
- vdso_data->wtom_clock_nsec = wall_to_monotonic.tv_nsec;
- vdso_data->stamp_xtime = xtime;
- smp_wmb();
- ++(vdso_data->tb_update_count);
-}
-
#ifdef CONFIG_SMP
unsigned long profile_pc(struct pt_regs *regs)
{
@@ -876,7 +852,7 @@ static cycle_t timebase_read(struct clocksource *cs)
void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
u32 mult)
{
- u64 t2x, stamp_xsec;
+ u64 new_tb_to_xs, new_stamp_xsec;

if (clock != &clocksource_timebase)
return;
@@ -887,11 +863,30 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,

/* XXX this assumes clock->shift == 22 */
/* 4611686018 ~= 2^(20+64-22) / 1e9 */
- t2x = (u64) mult * 4611686018ULL;
- stamp_xsec = (u64) xtime.tv_nsec * XSEC_PER_SEC;
- do_div(stamp_xsec, 1000000000);
- stamp_xsec += (u64) xtime.tv_sec * XSEC_PER_SEC;
- update_gtod(clock->cycle_last, stamp_xsec, t2x);
+ new_tb_to_xs = (u64) mult * 4611686018ULL;
+ new_stamp_xsec = (u64) xtime.tv_nsec * XSEC_PER_SEC;
+ do_div(new_stamp_xsec, 1000000000);
+ new_stamp_xsec += (u64) xtime.tv_sec * XSEC_PER_SEC;
+
+ /*
+ * tb_update_count is used to allow the userspace gettimeofday code
+ * to assure itself that it sees a consistent view of the tb_to_xs and
+ * stamp_xsec variables. It reads the tb_update_count, then reads
+ * tb_to_xs and stamp_xsec and then reads tb_update_count again. If
+ * the two values of tb_update_count match and are even then the
+ * tb_to_xs and stamp_xsec values are consistent. If not, then it
+ * loops back and reads them again until this criteria is met.
+ * We expect the caller to have done the first increment of
+ * vdso_data->tb_update_count already.
+ */
+ vdso_data->tb_orig_stamp = clock->cycle_last;
+ vdso_data->stamp_xsec = new_stamp_xsec;
+ vdso_data->tb_to_xs = new_tb_to_xs;
+ vdso_data->wtom_clock_sec = wall_to_monotonic.tv_sec;
+ vdso_data->wtom_clock_nsec = wall_to_monotonic.tv_nsec;
+ vdso_data->stamp_xtime = xtime;
+ smp_wmb();
+ ++(vdso_data->tb_update_count);
}

void update_vsyscall_tz(void)
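
The comment moved by this patch describes an even/odd update counter: the writer bumps tb_update_count before and after touching the data, and the reader retries unless it saw the same even value on both sides of its reads. A rough, single-writer sketch of that protocol in portable C11 is below. The struct and function names are invented for illustration, only two of the vdso_data fields are modelled, and the plain data accesses are a simplification; the kernel relies on its own barriers (smp_wmb() above).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the powerpc vdso_data fields. */
struct gtod_page {
	atomic_uint update_count;	/* odd while an update is in flight */
	uint64_t tb_to_xs;
	uint64_t stamp_xsec;
};

/* Writer: make the count odd, store the data, make the count even. */
static void gtod_publish(struct gtod_page *p, uint64_t tb_to_xs,
			 uint64_t stamp_xsec)
{
	assert(p);
	atomic_fetch_add_explicit(&p->update_count, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	p->tb_to_xs = tb_to_xs;
	p->stamp_xsec = stamp_xsec;
	atomic_thread_fence(memory_order_release);
	atomic_fetch_add_explicit(&p->update_count, 1, memory_order_relaxed);
}

/* Reader: one attempt; returns false if the caller should retry. */
static bool gtod_try_read(struct gtod_page *p, uint64_t *tb_to_xs,
			  uint64_t *stamp_xsec)
{
	unsigned int before;

	assert(p);
	before = atomic_load_explicit(&p->update_count, memory_order_acquire);
	if (before & 1)
		return false;		/* update in progress */
	*tb_to_xs = p->tb_to_xs;
	*stamp_xsec = p->stamp_xsec;
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&p->update_count,
				    memory_order_relaxed) == before;
}
```

A caller would loop on gtod_try_read() until it returns true, which mirrors the "loops back and reads them again" wording in the comment.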

2010-07-27 10:47:28

by john stultz

Subject: [tip:timers/clocksource] powerpc: Cleanup xtime usage

Commit-ID: 06d518e3dfb25334282c7e38b4d7a4eada215f6d
Gitweb: http://git.kernel.org/tip/06d518e3dfb25334282c7e38b4d7a4eada215f6d
Author: John Stultz <[email protected]>
AuthorDate: Tue, 13 Jul 2010 17:56:22 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Tue, 27 Jul 2010 12:40:54 +0200

powerpc: Cleanup xtime usage

This removes powerpc's direct xtime usage, allowing for further
generic timekeeping cleanups.

Signed-off-by: John Stultz <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Anton Blanchard <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/powerpc/kernel/time.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 6fcd648..0711d60 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -864,9 +864,9 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
/* XXX this assumes clock->shift == 22 */
/* 4611686018 ~= 2^(20+64-22) / 1e9 */
new_tb_to_xs = (u64) mult * 4611686018ULL;
- new_stamp_xsec = (u64) xtime.tv_nsec * XSEC_PER_SEC;
+ new_stamp_xsec = (u64) wall_time->tv_nsec * XSEC_PER_SEC;
do_div(new_stamp_xsec, 1000000000);
- new_stamp_xsec += (u64) xtime.tv_sec * XSEC_PER_SEC;
+ new_stamp_xsec += (u64) wall_time->tv_sec * XSEC_PER_SEC;

/*
* tb_update_count is used to allow the userspace gettimeofday code
@@ -884,7 +884,7 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
vdso_data->tb_to_xs = new_tb_to_xs;
vdso_data->wtom_clock_sec = wall_to_monotonic.tv_sec;
vdso_data->wtom_clock_nsec = wall_to_monotonic.tv_nsec;
- vdso_data->stamp_xtime = xtime;
+ vdso_data->stamp_xtime = *wall_time;
smp_wmb();
++(vdso_data->tb_update_count);
}
@@ -1093,7 +1093,7 @@ void __init time_init(void)
vdso_data->tb_orig_stamp = tb_last_jiffy;
vdso_data->tb_update_count = 0;
vdso_data->tb_ticks_per_sec = tb_ticks_per_sec;
- vdso_data->stamp_xsec = (u64) xtime.tv_sec * XSEC_PER_SEC;
+ vdso_data->stamp_xsec = (u64) get_seconds() * XSEC_PER_SEC;
vdso_data->tb_to_xs = tb_to_xs;

write_sequnlock_irqrestore(&xtime_lock, flags);
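
As a worked example of the stamp_xsec arithmetic in update_vsyscall(): the code scales the nanosecond part into "xsec" units and adds the whole seconds. The sketch below assumes one xsec is 2^-20 of a second, which is what the "2^(20+64-22) / 1e9" comment suggests; the macro and function names are hypothetical and the real XSEC_PER_SEC definition should be checked in the powerpc headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 2^20 xsec per second, inferred from the patch comment. */
#define XSEC_PER_SEC_SKETCH	(1ULL << 20)
#define NSEC_PER_SEC_SKETCH	1000000000ULL

/* Convert (sec, nsec) to xsec the same way update_vsyscall() does. */
static uint64_t timespec_to_xsec(uint64_t sec, uint32_t nsec)
{
	assert(nsec < NSEC_PER_SEC_SKETCH);
	uint64_t frac = (uint64_t)nsec * XSEC_PER_SEC_SKETCH
			/ NSEC_PER_SEC_SKETCH;
	return sec * XSEC_PER_SEC_SKETCH + frac;
}

/* The constant from the comment: 2^(20+64-22) / 1e9, truncated. */
static uint64_t xs_scale_for_shift22(void)
{
	return (1ULL << (20 + 64 - 22)) / NSEC_PER_SEC_SKETCH;
}
```

Under that assumption, half a second is 524288 xsec and xs_scale_for_shift22() reproduces the 4611686018 literal used in the patch.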

2010-07-27 10:47:49

by john stultz

Subject: [tip:timers/clocksource] timekeeping: Fix update_vsyscall to provide wall_to_monotonic offset

Commit-ID: 7615856ebfee52b080c22d263ca4debbd0df0ac1
Gitweb: http://git.kernel.org/tip/7615856ebfee52b080c22d263ca4debbd0df0ac1
Author: John Stultz <[email protected]>
AuthorDate: Tue, 13 Jul 2010 17:56:23 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Tue, 27 Jul 2010 12:40:54 +0200

timekeeping: Fix update_vsyscall to provide wall_to_monotonic offset

update_vsyscall() did not provide the wall_to_monotonic offset,
so arch-specific implementations tend to reference wall_to_monotonic
directly. This limits future cleanups in the timekeeping core, so
this patch fixes the update_vsyscall interface to provide
wall_to_monotonic, allowing wall_to_monotonic to be made static
as planned in Documentation/feature-removal-schedule.txt.

Signed-off-by: John Stultz <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Anton Blanchard <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Tony Luck <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/ia64/kernel/time.c | 7 ++++---
arch/powerpc/kernel/time.c | 8 ++++----
arch/s390/kernel/time.c | 8 ++++----
arch/x86/kernel/vsyscall_64.c | 6 +++---
include/linux/clocksource.h | 6 ++++--
kernel/time/timekeeping.c | 9 ++++++---
6 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
index 653b3c4..ed6f22e 100644
--- a/arch/ia64/kernel/time.c
+++ b/arch/ia64/kernel/time.c
@@ -471,7 +471,8 @@ void update_vsyscall_tz(void)
{
}

-void update_vsyscall(struct timespec *wall, struct clocksource *c, u32 mult)
+void update_vsyscall(struct timespec *wall, struct timespec *wtm,
+ struct clocksource *c, u32 mult)
{
unsigned long flags;

@@ -487,9 +488,9 @@ void update_vsyscall(struct timespec *wall, struct clocksource *c, u32 mult)
/* copy kernel time structures */
fsyscall_gtod_data.wall_time.tv_sec = wall->tv_sec;
fsyscall_gtod_data.wall_time.tv_nsec = wall->tv_nsec;
- fsyscall_gtod_data.monotonic_time.tv_sec = wall_to_monotonic.tv_sec
+ fsyscall_gtod_data.monotonic_time.tv_sec = wtm->tv_sec
+ wall->tv_sec;
- fsyscall_gtod_data.monotonic_time.tv_nsec = wall_to_monotonic.tv_nsec
+ fsyscall_gtod_data.monotonic_time.tv_nsec = wtm->tv_nsec
+ wall->tv_nsec;

/* normalize */
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 0711d60..e215f76 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -849,8 +849,8 @@ static cycle_t timebase_read(struct clocksource *cs)
return (cycle_t)get_tb();
}

-void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
- u32 mult)
+void update_vsyscall(struct timespec *wall_time, struct timespec *wtm,
+ struct clocksource *clock, u32 mult)
{
u64 new_tb_to_xs, new_stamp_xsec;

@@ -882,8 +882,8 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
vdso_data->tb_orig_stamp = clock->cycle_last;
vdso_data->stamp_xsec = new_stamp_xsec;
vdso_data->tb_to_xs = new_tb_to_xs;
- vdso_data->wtom_clock_sec = wall_to_monotonic.tv_sec;
- vdso_data->wtom_clock_nsec = wall_to_monotonic.tv_nsec;
+ vdso_data->wtom_clock_sec = wtm->tv_sec;
+ vdso_data->wtom_clock_nsec = wtm->tv_nsec;
vdso_data->stamp_xtime = *wall_time;
smp_wmb();
++(vdso_data->tb_update_count);
diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
index a2163c9..aeb30c6 100644
--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -207,8 +207,8 @@ struct clocksource * __init clocksource_default_clock(void)
return &clocksource_tod;
}

-void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
- u32 mult)
+void update_vsyscall(struct timespec *wall_time, struct timespec *wtm,
+ struct clocksource *clock, u32 mult)
{
if (clock != &clocksource_tod)
return;
@@ -219,8 +219,8 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
vdso_data->xtime_tod_stamp = clock->cycle_last;
vdso_data->xtime_clock_sec = wall_time->tv_sec;
vdso_data->xtime_clock_nsec = wall_time->tv_nsec;
- vdso_data->wtom_clock_sec = wall_to_monotonic.tv_sec;
- vdso_data->wtom_clock_nsec = wall_to_monotonic.tv_nsec;
+ vdso_data->wtom_clock_sec = wtm->tv_sec;
+ vdso_data->wtom_clock_nsec = wtm->tv_nsec;
vdso_data->ntp_mult = mult;
smp_wmb();
++vdso_data->tb_update_count;
diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
index dce0c3c..dcbb28c 100644
--- a/arch/x86/kernel/vsyscall_64.c
+++ b/arch/x86/kernel/vsyscall_64.c
@@ -73,8 +73,8 @@ void update_vsyscall_tz(void)
write_sequnlock_irqrestore(&vsyscall_gtod_data.lock, flags);
}

-void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
- u32 mult)
+void update_vsyscall(struct timespec *wall_time, struct timespec *wtm,
+ struct clocksource *clock, u32 mult)
{
unsigned long flags;

@@ -87,7 +87,7 @@ void update_vsyscall(struct timespec *wall_time, struct clocksource *clock,
vsyscall_gtod_data.clock.shift = clock->shift;
vsyscall_gtod_data.wall_time_sec = wall_time->tv_sec;
vsyscall_gtod_data.wall_time_nsec = wall_time->tv_nsec;
- vsyscall_gtod_data.wall_to_monotonic = wall_to_monotonic;
+ vsyscall_gtod_data.wall_to_monotonic = *wtm;
vsyscall_gtod_data.wall_time_coarse = __current_kernel_time();
write_sequnlock_irqrestore(&vsyscall_gtod_data.lock, flags);
}
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 5ea3c60..21677d9 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -313,11 +313,13 @@ clocksource_calc_mult_shift(struct clocksource *cs, u32 freq, u32 minsec)

#ifdef CONFIG_GENERIC_TIME_VSYSCALL
extern void
-update_vsyscall(struct timespec *ts, struct clocksource *c, u32 mult);
+update_vsyscall(struct timespec *ts, struct timespec *wtm,
+ struct clocksource *c, u32 mult);
extern void update_vsyscall_tz(void);
#else
static inline void
-update_vsyscall(struct timespec *ts, struct clocksource *c, u32 mult)
+update_vsyscall(struct timespec *ts, struct timespec *wtm,
+ struct clocksource *c, u32 mult)
{
}

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 73edd40..b15c3ac 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -170,7 +170,8 @@ void timekeeping_leap_insert(int leapsecond)
{
xtime.tv_sec += leapsecond;
wall_to_monotonic.tv_sec -= leapsecond;
- update_vsyscall(&xtime, timekeeper.clock, timekeeper.mult);
+ update_vsyscall(&xtime, &wall_to_monotonic, timekeeper.clock,
+ timekeeper.mult);
}

/**
@@ -326,7 +327,8 @@ int do_settimeofday(struct timespec *tv)
timekeeper.ntp_error = 0;
ntp_clear();

- update_vsyscall(&xtime, timekeeper.clock, timekeeper.mult);
+ update_vsyscall(&xtime, &wall_to_monotonic, timekeeper.clock,
+ timekeeper.mult);

write_sequnlock_irqrestore(&xtime_lock, flags);

@@ -809,7 +811,8 @@ void update_wall_time(void)
}

/* check to see if there is a new clocksource to use */
- update_vsyscall(&xtime, timekeeper.clock, timekeeper.mult);
+ update_vsyscall(&xtime, &wall_to_monotonic, timekeeper.clock,
+ timekeeper.mult);
}

/**
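
To illustrate what an arch implementation can do with the new wtm argument: the ia64 hunk adds wall and wtm field by field and then normalizes. A small userspace sketch of that addition follows, assuming both inputs are already normalized (0 <= tv_nsec < 1e9). The struct and function names are invented here; this is not the kernel's struct timespec.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000L

/* Hypothetical userspace stand-in for struct timespec. */
struct ts_sketch {
	int64_t tv_sec;
	long tv_nsec;
};

/* monotonic = wall + wall_to_monotonic, with a single nsec carry. */
static struct ts_sketch wall_to_mono(struct ts_sketch wall,
				     struct ts_sketch wtm)
{
	struct ts_sketch mono;

	assert(wall.tv_nsec >= 0 && wall.tv_nsec < NSEC_PER_SEC_SKETCH);
	assert(wtm.tv_nsec >= 0 && wtm.tv_nsec < NSEC_PER_SEC_SKETCH);

	mono.tv_sec = wall.tv_sec + wtm.tv_sec;
	mono.tv_nsec = wall.tv_nsec + wtm.tv_nsec;
	if (mono.tv_nsec >= NSEC_PER_SEC_SKETCH) {
		mono.tv_nsec -= NSEC_PER_SEC_SKETCH;
		mono.tv_sec++;
	}
	return mono;
}
```

For example, wall = 1000.8s with wtm = (-900s, 300000000ns) gives a monotonic time of 101.1s. Because both tv_nsec inputs are below one second, one carry is always enough.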

2010-07-27 10:48:12

by john stultz

Subject: [tip:timers/clocksource] um: Convert to use read_persistent_clock

Commit-ID: 9f31f5774961a735687fee17953ab505b3df3abf
Gitweb: http://git.kernel.org/tip/9f31f5774961a735687fee17953ab505b3df3abf
Author: John Stultz <[email protected]>
AuthorDate: Tue, 13 Jul 2010 17:56:24 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Tue, 27 Jul 2010 12:40:55 +0200

um: Convert to use read_persistent_clock

This patch converts the um arch to use read_persistent_clock().
This allows it to avoid accessing xtime and wall_to_monotonic
directly.

Signed-off-by: John Stultz <[email protected]>
Cc: Jeff Dike <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/um/kernel/time.c | 13 +++++++------
1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/um/kernel/time.c b/arch/um/kernel/time.c
index c8b9c46..2b8b262 100644
--- a/arch/um/kernel/time.c
+++ b/arch/um/kernel/time.c
@@ -102,16 +102,17 @@ static void __init setup_itimer(void)
clockevents_register_device(&itimer_clockevent);
}

+void read_persistent_clock(struct timespec *ts)
+{
+ long long nsecs = os_nsecs();
+ set_normalized_timespec(ts, nsecs / NSEC_PER_SEC,
+ nsecs % NSEC_PER_SEC);
+}
+
void __init time_init(void)
{
long long nsecs;

timer_init();
-
- nsecs = os_nsecs();
- set_normalized_timespec(&wall_to_monotonic, -nsecs / NSEC_PER_SEC,
- -nsecs % NSEC_PER_SEC);
- set_normalized_timespec(&xtime, nsecs / NSEC_PER_SEC,
- nsecs % NSEC_PER_SEC);
late_time_init = setup_itimer;
}
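
The new read_persistent_clock() boils down to splitting a nanosecond count into seconds and a sub-second remainder. A tiny standalone sketch of that split is below, assuming a non-negative input. The names are hypothetical, and os_nsecs() is replaced by a plain parameter.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000LL

/* Hypothetical userspace stand-in for struct timespec. */
struct ts_sketch {
	int64_t tv_sec;
	long tv_nsec;
};

/* Split a non-negative nanosecond count into (sec, nsec). */
static struct ts_sketch ns_to_ts(long long nsecs)
{
	struct ts_sketch ts;

	assert(nsecs >= 0);
	ts.tv_sec = nsecs / NSEC_PER_SEC_SKETCH;
	ts.tv_nsec = (long)(nsecs % NSEC_PER_SEC_SKETCH);
	return ts;
}
```

With a non-negative input the quotient and remainder are already normalized, so no further fix-up is needed in this sketch.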

2010-07-27 10:48:26

by john stultz

Subject: [tip:timers/clocksource] hrtimer: Cleanup direct access to wall_to_monotonic

Commit-ID: 8ab4351a4c888016620f43bde605b3d0964af339
Gitweb: http://git.kernel.org/tip/8ab4351a4c888016620f43bde605b3d0964af339
Author: John Stultz <[email protected]>
AuthorDate: Tue, 13 Jul 2010 17:56:25 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Tue, 27 Jul 2010 12:40:55 +0200

hrtimer: Cleanup direct access to wall_to_monotonic

Provides an accessor function to replace hrtimer.c's
direct access of wall_to_monotonic.

This will allow wall_to_monotonic to be made static as
planned in Documentation/feature-removal-schedule.txt

Signed-off-by: John Stultz <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
include/linux/time.h | 3 ++-
kernel/hrtimer.c | 9 ++++-----
kernel/time/timekeeping.c | 5 +++++
3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/linux/time.h b/include/linux/time.h
index 9072df8..a57e0f6 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -126,7 +126,8 @@ extern int timekeeping_suspended;

unsigned long get_seconds(void);
struct timespec current_kernel_time(void);
-struct timespec __current_kernel_time(void); /* does not hold xtime_lock */
+struct timespec __current_kernel_time(void); /* does not take xtime_lock */
+struct timespec __get_wall_to_monotonic(void); /* does not take xtime_lock */
struct timespec get_monotonic_coarse(void);

#define CURRENT_TIME (current_kernel_time())
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 5c69e99..809f48c 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -90,7 +90,7 @@ static void hrtimer_get_softirq_time(struct hrtimer_cpu_base *base)
do {
seq = read_seqbegin(&xtime_lock);
xts = __current_kernel_time();
- tom = wall_to_monotonic;
+ tom = __get_wall_to_monotonic();
} while (read_seqretry(&xtime_lock, seq));

xtim = timespec_to_ktime(xts);
@@ -612,7 +612,7 @@ static int hrtimer_reprogram(struct hrtimer *timer,
static void retrigger_next_event(void *arg)
{
struct hrtimer_cpu_base *base;
- struct timespec realtime_offset;
+ struct timespec realtime_offset, wtm;
unsigned long seq;

if (!hrtimer_hres_active())
@@ -620,10 +620,9 @@ static void retrigger_next_event(void *arg)

do {
seq = read_seqbegin(&xtime_lock);
- set_normalized_timespec(&realtime_offset,
- -wall_to_monotonic.tv_sec,
- -wall_to_monotonic.tv_nsec);
+ wtm = __get_wall_to_monotonic();
} while (read_seqretry(&xtime_lock, seq));
+ set_normalized_timespec(&realtime_offset, -wtm.tv_sec, -wtm.tv_nsec);

base = &__get_cpu_var(hrtimer_bases);

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index b15c3ac..fb61c2e 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -858,6 +858,11 @@ struct timespec __current_kernel_time(void)
return xtime;
}

+struct timespec __get_wall_to_monotonic(void)
+{
+ return wall_to_monotonic;
+}
+
struct timespec current_kernel_time(void)
{
struct timespec now;
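
The retrigger_next_event() hunk negates wall_to_monotonic outside the seqlock loop and lets set_normalized_timespec() fix up the negative tv_nsec. The sketch below shows that normalization in userspace. It is modelled on the loop-based approach but is not the kernel function, and the names are made up.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000LL

/* Hypothetical userspace stand-in for struct timespec. */
struct ts_sketch {
	int64_t tv_sec;
	long tv_nsec;
};

/* Bring nsec into [0, 1e9) by carrying into or borrowing from sec. */
static void normalize_ts(struct ts_sketch *ts, int64_t sec, int64_t nsec)
{
	assert(ts);
	while (nsec >= NSEC_PER_SEC_SKETCH) {
		nsec -= NSEC_PER_SEC_SKETCH;
		++sec;
	}
	while (nsec < 0) {
		nsec += NSEC_PER_SEC_SKETCH;
		--sec;
	}
	ts->tv_sec = sec;
	ts->tv_nsec = (long)nsec;
}

/* realtime_offset = -wall_to_monotonic, normalized. */
static struct ts_sketch negate_ts(struct ts_sketch in)
{
	struct ts_sketch out;

	normalize_ts(&out, -in.tv_sec, -(int64_t)in.tv_nsec);
	return out;
}
```

Negating (-900s, 300000000ns), which is -899.7s, yields (899s, 700000000ns).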

2010-07-27 10:48:45

by john stultz

Subject: [tip:timers/clocksource] timekeeping: Make xtime and wall_to_monotonic static

Commit-ID: 0fb86b06298b6cd3205cac2e68a499f269282dac
Gitweb: http://git.kernel.org/tip/0fb86b06298b6cd3205cac2e68a499f269282dac
Author: John Stultz <[email protected]>
AuthorDate: Tue, 13 Jul 2010 17:56:26 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Tue, 27 Jul 2010 12:40:55 +0200

timekeeping: Make xtime and wall_to_monotonic static

This patch makes xtime and wall_to_monotonic static, as planned in
Documentation/feature-removal-schedule.txt. This will allow for
further cleanups to the timekeeping core.

Signed-off-by: John Stultz <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>

---
Documentation/feature-removal-schedule.txt | 10 ----------
include/linux/time.h | 2 --
kernel/time/timekeeping.c | 4 ++--
3 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 1571c0c..cd648db 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -549,16 +549,6 @@ Who: Avi Kivity <[email protected]>

----------------------------

-What: xtime, wall_to_monotonic
-When: 2.6.36+
-Files: kernel/time/timekeeping.c include/linux/time.h
-Why: Cleaning up timekeeping internal values. Please use
- existing timekeeping accessor functions to access
- the equivalent functionality.
-Who: John Stultz <[email protected]>
-
-----------------------------
-
What: KVM kernel-allocated memory slots
When: July 2010
Why: Since 2.6.25, kvm supports user-allocated memory slots, which are
diff --git a/include/linux/time.h b/include/linux/time.h
index a57e0f6..cb34e35 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -113,8 +113,6 @@ static inline struct timespec timespec_sub(struct timespec lhs,
#define timespec_valid(ts) \
(((ts)->tv_sec >= 0) && (((unsigned long) (ts)->tv_nsec) < NSEC_PER_SEC))

-extern struct timespec xtime;
-extern struct timespec wall_to_monotonic;
extern seqlock_t xtime_lock;

extern void read_persistent_clock(struct timespec *ts);
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index fb61c2e..e14c839 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -153,8 +153,8 @@ __cacheline_aligned_in_smp DEFINE_SEQLOCK(xtime_lock);
* - wall_to_monotonic is no longer the boot time, getboottime must be
* used instead.
*/
-struct timespec xtime __attribute__ ((aligned (16)));
-struct timespec wall_to_monotonic __attribute__ ((aligned (16)));
+static struct timespec xtime __attribute__ ((aligned (16)));
+static struct timespec wall_to_monotonic __attribute__ ((aligned (16)));
static struct timespec total_sleep_time;

/*

2010-07-27 10:49:05

by john stultz

Subject: [tip:timers/clocksource] x86: Convert common clocksources to use clocksource_register_hz/khz

Commit-ID: f12a15be63d1de9a35971f35f06b73088fa25c3a
Gitweb: http://git.kernel.org/tip/f12a15be63d1de9a35971f35f06b73088fa25c3a
Author: John Stultz <[email protected]>
AuthorDate: Tue, 13 Jul 2010 17:56:27 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Tue, 27 Jul 2010 12:40:55 +0200

x86: Convert common clocksources to use clocksource_register_hz/khz

This converts the most common of the x86 clocksources over to use
clocksource_register_hz/khz.

Signed-off-by: John Stultz <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/hpet.c | 13 +++++++++----
arch/x86/kernel/tsc.c | 5 +----
drivers/clocksource/acpi_pm.c | 9 ++-------
3 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index ba390d7..33dbcc4 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -16,7 +16,6 @@
#include <asm/hpet.h>

#define HPET_MASK CLOCKSOURCE_MASK(32)
-#define HPET_SHIFT 22

/* FSEC = 10^-15
NSEC = 10^-9 */
@@ -787,7 +786,6 @@ static struct clocksource clocksource_hpet = {
.rating = 250,
.read = read_hpet,
.mask = HPET_MASK,
- .shift = HPET_SHIFT,
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
.resume = hpet_resume_counter,
#ifdef CONFIG_X86_64
@@ -798,6 +796,7 @@ static struct clocksource clocksource_hpet = {
static int hpet_clocksource_register(void)
{
u64 start, now;
+ u64 hpet_freq;
cycle_t t1;

/* Start the counter */
@@ -832,9 +831,15 @@ static int hpet_clocksource_register(void)
* mult = (hpet_period * 2^shift)/10^6
* mult = (hpet_period << shift)/FSEC_PER_NSEC
*/
- clocksource_hpet.mult = div_sc(hpet_period, FSEC_PER_NSEC, HPET_SHIFT);

- clocksource_register(&clocksource_hpet);
+ /* Need to convert hpet_period (fsec/cyc) to cyc/sec:
+ *
+ * cyc/sec = FSEC_PER_SEC/hpet_period(fsec/cyc)
+ * cyc/sec = (FSEC_PER_NSEC * NSEC_PER_SEC)/hpet_period
+ */
+ hpet_freq = FSEC_PER_NSEC * NSEC_PER_SEC;
+ do_div(hpet_freq, hpet_period);
+ clocksource_register_hz(&clocksource_hpet, (u32)hpet_freq);

return 0;
}
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 9faf91a..ce8e502 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -751,7 +751,6 @@ static struct clocksource clocksource_tsc = {
.read = read_tsc,
.resume = resume_tsc,
.mask = CLOCKSOURCE_MASK(64),
- .shift = 22,
.flags = CLOCK_SOURCE_IS_CONTINUOUS |
CLOCK_SOURCE_MUST_VERIFY,
#ifdef CONFIG_X86_64
@@ -845,8 +844,6 @@ __cpuinit int unsynchronized_tsc(void)

static void __init init_tsc_clocksource(void)
{
- clocksource_tsc.mult = clocksource_khz2mult(tsc_khz,
- clocksource_tsc.shift);
if (tsc_clocksource_reliable)
clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
/* lower the rating if we already know its unstable: */
@@ -854,7 +851,7 @@ static void __init init_tsc_clocksource(void)
clocksource_tsc.rating = 0;
clocksource_tsc.flags &= ~CLOCK_SOURCE_IS_CONTINUOUS;
}
- clocksource_register(&clocksource_tsc);
+ clocksource_register_khz(&clocksource_tsc, tsc_khz);
}

#ifdef CONFIG_X86_64
diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c
index 72a633a..cfb0f52 100644
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -68,10 +68,7 @@ static struct clocksource clocksource_acpi_pm = {
.rating = 200,
.read = acpi_pm_read,
.mask = (cycle_t)ACPI_PM_MASK,
- .mult = 0, /*to be calculated*/
- .shift = 22,
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
-
};


@@ -190,9 +187,6 @@ static int __init init_acpi_pm_clocksource(void)
if (!pmtmr_ioport)
return -ENODEV;

- clocksource_acpi_pm.mult = clocksource_hz2mult(PMTMR_TICKS_PER_SEC,
- clocksource_acpi_pm.shift);
-
/* "verify" this timing source: */
for (j = 0; j < ACPI_PM_MONOTONICITY_CHECKS; j++) {
udelay(100 * j);
@@ -220,7 +214,8 @@ static int __init init_acpi_pm_clocksource(void)
if (verify_pmtmr_rate() != 0)
return -ENODEV;

- return clocksource_register(&clocksource_acpi_pm);
+ return clocksource_register_hz(&clocksource_acpi_pm,
+ PMTMR_TICKS_PER_SEC);
}

/* We use fs_initcall because we want the PCI fixups to have run
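
As a worked example of the HPET conversion in this patch: hpet_period is in femtoseconds per cycle, so the frequency is 10^15 / hpet_period cycles per second. The sketch below does that division in 64 bits, since 10^15 does not fit in 32 bits. The function and macro names are hypothetical, and the sample period is only an illustrative value.

```c
#include <assert.h>
#include <stdint.h>

#define FSEC_PER_SEC_SKETCH 1000000000000000ULL	/* 10^15 */

/* Convert a counter period in fs/cycle to a frequency in Hz. */
static uint32_t hpet_period_to_hz(uint32_t period_fs)
{
	assert(period_fs != 0);
	return (uint32_t)(FSEC_PER_SEC_SKETCH / period_fs);
}
```

A period of 100000000 fs (100 ns) gives exactly 10 MHz, and a period of 69841279 fs gives 14318179 Hz, i.e. roughly 14.318 MHz.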

2010-07-27 10:49:23

by john stultz

Subject: [tip:timers/clocksource] clocksource: Add __clocksource_updatefreq_hz/khz methods

Commit-ID: 852db46d55e85b475a72e665ca08d3317769ceef
Gitweb: http://git.kernel.org/tip/852db46d55e85b475a72e665ca08d3317769ceef
Author: John Stultz <[email protected]>
AuthorDate: Tue, 13 Jul 2010 17:56:28 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Tue, 27 Jul 2010 12:40:55 +0200

clocksource: Add __clocksource_updatefreq_hz/khz methods

To properly handle clocksources that change frequencies
at the clocksource->enable() point, this patch adds
a method that will update the clocksource's mult/shift and
max_idle_ns values.

Signed-off-by: John Stultz <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>

---
include/linux/clocksource.h | 11 +++++++++++
kernel/time/clocksource.c | 29 ++++++++++++++++++++++++-----
2 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 21677d9..c37b21a 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -292,6 +292,8 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 minsec);
*/
extern int
__clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq);
+extern void
+__clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq);

static inline int clocksource_register_hz(struct clocksource *cs, u32 hz)
{
@@ -303,6 +305,15 @@ static inline int clocksource_register_khz(struct clocksource *cs, u32 khz)
return __clocksource_register_scale(cs, 1000, khz);
}

+static inline void __clocksource_updatefreq_hz(struct clocksource *cs, u32 hz)
+{
+ __clocksource_updatefreq_scale(cs, 1, hz);
+}
+
+static inline void __clocksource_updatefreq_khz(struct clocksource *cs, u32 khz)
+{
+ __clocksource_updatefreq_scale(cs, 1000, khz);
+}

static inline void
clocksource_calc_mult_shift(struct clocksource *cs, u32 freq, u32 minsec)
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index c543d21..c18d7ef 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -639,19 +639,18 @@ static void clocksource_enqueue(struct clocksource *cs)
#define MAX_UPDATE_LENGTH 5 /* Seconds */

/**
- * __clocksource_register_scale - Used to install new clocksources
+ * __clocksource_updatefreq_scale - Used to update clocksource with new freq
* @t: clocksource to be registered
* @scale: Scale factor multiplied against freq to get clocksource hz
* @freq: clocksource frequency (cycles per second) divided by scale
*
- * Returns -EBUSY if registration fails, zero otherwise.
+ * This should only be called from the clocksource->enable() method.
*
* This *SHOULD NOT* be called directly! Please use the
- * clocksource_register_hz() or clocksource_register_khz helper functions.
+ * __clocksource_updatefreq_hz() or __clocksource_updatefreq_khz() helper functions.
*/
-int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
+void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
{
-
/*
* Ideally we want to use some of the limits used in
* clocksource_max_deferment, to provide a more informed
@@ -662,7 +661,27 @@ int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
NSEC_PER_SEC/scale,
MAX_UPDATE_LENGTH*scale);
cs->max_idle_ns = clocksource_max_deferment(cs);
+}
+EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);
+
+/**
+ * __clocksource_register_scale - Used to install new clocksources
+ * @t: clocksource to be registered
+ * @scale: Scale factor multiplied against freq to get clocksource hz
+ * @freq: clocksource frequency (cycles per second) divided by scale
+ *
+ * Returns -EBUSY if registration fails, zero otherwise.
+ *
+ * This *SHOULD NOT* be called directly! Please use the
+ * clocksource_register_hz() or clocksource_register_khz helper functions.
+ */
+int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
+{
+
+ /* Initialize mult/shift and max_idle_ns */
+ __clocksource_updatefreq_scale(cs, scale, freq);

+ /* Add clocksource to the clocksource list */
mutex_lock(&clocksource_mutex);
clocksource_enqueue(cs);
clocksource_select();
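
For context on what "mult/shift" means in these patches: a clocksource converts cycles to nanoseconds as (cycles * mult) >> shift, and mult is derived from the counter frequency. The sketch below shows that relationship for a fixed shift, using the rounding approach of the clocksource_hz2mult() helper that the x86 conversions above stop calling directly. It is a userspace approximation with hypothetical names, not the code the new *_scale functions run (those pick the shift themselves).

```c
#include <assert.h>
#include <stdint.h>

/* mult such that (cycles * mult) >> shift is approximately nanoseconds. */
static uint32_t hz_to_mult(uint32_t hz, uint32_t shift)
{
	uint64_t tmp;

	assert(hz != 0 && shift < 32);
	tmp = (uint64_t)1000000000ULL << shift;
	tmp += hz / 2;			/* round to nearest */
	return (uint32_t)(tmp / hz);
}

/* Apply the conversion; the caller must keep cycles * mult below 2^64. */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}
```

With shift = 22, a 1 GHz counter gets mult = 2^22 (one cycle per nanosecond) and a 1 MHz counter gets mult = 1000 * 2^22. A larger shift gives finer frequency resolution but makes the multiplication overflow sooner, which is the trade-off the max_idle_ns calculation accounts for.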

2010-07-27 23:42:12

by Paul Mackerras

Subject: Re: [PATCH 04/11] powerpc: Simplify update_vsyscall

On Tue, Jul 13, 2010 at 05:56:21PM -0700, John Stultz wrote:

> Currently powerpc's update_vsyscall calls an inline update_gtod.
> However, both are straightforward, and there are no other users,
> so this patch merges update_gtod into update_vsyscall.
>
> Compiles, but otherwise untested.

This and the following two patches will cause interesting conflicts
with two commits in Ben Herrenschmidt's powerpc.git next branch,
specifically 8fd63a9e ("powerpc: Rework VDSO gettimeofday to prevent
time going backwards") and c1aa687d ("powerpc: Clean up obsolete code
relating to decrementer and timebase") from me. In fact the first of
those two commits includes changes equivalent to those in your 5/11
patch ("powerpc: Cleanup xtime usage"), as far as I can see.

BTW, BenH's tree is at:

git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git

Paul.

2010-07-28 01:33:33

by john stultz

Subject: Re: [PATCH 04/11] powerpc: Simplify update_vsyscall

On Wed, 2010-07-28 at 09:41 +1000, Paul Mackerras wrote:
> On Tue, Jul 13, 2010 at 05:56:21PM -0700, John Stultz wrote:
>
> > Currently powerpc's update_vsyscall calls an inline update_gtod.
> > However, both are straightforward, and there are no other users,
> > so this patch merges update_gtod into update_vsyscall.
> >
> > Compiles, but otherwise untested.
>
> This and the following two patches will cause interesting conflicts
> with two commits in Ben Herrenschmidt's powerpc.git next branch,
> specifically 8fd63a9e ("powerpc: Rework VDSO gettimeofday to prevent
> time going backwards") and c1aa687d ("powerpc: Clean up obsolete code
> relating to decrementer and timebase") from me. In fact the first of
> those two commits includes changes equivalent to those in your 5/11
> patch ("powerpc: Cleanup xtime usage"), as far as I can see.

Ahh.. Right.. I guess I should have remembered you were working on those
changes (even though I don't think I saw the final results sent to lkml
or anything).

Sorry about that.

> BTW, BenH's tree is at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git

So I've cherry-picked the two changes from the ppc tree, applied them
onto Linus' git tree and then rebased my changes on top of them.

The net change to the patch set:

Added to the head of the patch queue:
powerpc: Rework VDSO gettimeofday to prevent time going backwards
powerpc: Clean up obsolete code relating to decrementer and timebase

Modified to resolve collision:
powerpc: Simplify update_vsyscall

Dropped (as earlier patches already made equivalent changes):
powerpc: Cleanup xtime usage

The full set is in the attached tarball.

Thomas, would you consider re-adding these? Hopefully that will avoid
any -next collisions.

thanks
-john


Attachments:
patches.tar.bz2 (20.90 kB)