2006-02-24 00:27:34

by Tony Lindgren

Subject: [PATCH] Fix next_timer_interrupt() for hrtimer

This patch adds hrtimer support to next_timer_interrupt()
and fixes the current breakage.

next_timer_interrupt() was broken by the recent patch
6ba1b91213e81aa92b5cf7539f7d2a94ff54947c, which moved
sys_nanosleep() to hrtimer. Since then next_timer_interrupt()
has not checked the hrtimer tree for the next event.

next_timer_interrupt() is needed by dyntick (CONFIG_NO_IDLE_HZ,
VST) implementations, as the system can be idle when the next
hrtimer event is supposed to happen. At least ARM and S390
currently use next_timer_interrupt().

Signed-off-by: Tony Lindgren <[email protected]>

--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -505,6 +505,94 @@
return rem;
}

+#ifdef CONFIG_NO_IDLE_HZ
+
+/**
+ * hrtimer_get_next - get next hrtimer to expire
+ *
+ * @bases: hrtimer base array
+ */
+static inline struct hrtimer * hrtimer_get_next(struct hrtimer_base *bases)
+{
+ unsigned long flags;
+ struct hrtimer *timer = NULL;
+ int i;
+
+ for (i = 0; i < MAX_HRTIMER_BASES; i++) {
+ struct hrtimer_base *base;
+ struct hrtimer *cur;
+
+ base = &bases[i];
+ spin_lock_irqsave(&base->lock, flags);
+ cur = rb_entry(base->first, struct hrtimer, node);
+ spin_unlock_irqrestore(&base->lock, flags);
+
+ if (cur == NULL)
+ continue;
+
+ if (timer == NULL || cur->expires.tv64 < timer->expires.tv64)
+ timer = cur;
+ }
+
+ return timer;
+}
+
+/**
+ * ktime_to_jiffies - converts ktime to jiffies
+ *
+ * @event: ktime event to be converted
+ *
+ * Caller must take care of xtime locking.
+ */
+static inline unsigned long ktime_to_jiffies(const ktime_t event)
+{
+ ktime_t now, delta;
+ unsigned long sec, nsec;
+ struct timespec tv;
+
+ tv = ktime_to_timespec(event);
+
+ /* Assume read xtime_lock is held, so we can't use getnstimeofday() */
+ sec = xtime.tv_sec;
+ nsec = xtime.tv_nsec;
+ while (unlikely(nsec >= NSEC_PER_SEC)) {
+ nsec -= NSEC_PER_SEC;
+ ++sec;
+ }
+ tv.tv_sec = sec;
+ tv.tv_nsec = nsec;
+
+ now = timespec_to_ktime(tv);
+ delta = ktime_sub(event, now);
+
+ tv = ktime_to_timespec(delta);
+
+ return jiffies - 1 + timespec_to_jiffies(&tv);
+}
+
+/**
+ * hrtimer_next_jiffie - get next hrtimer event in jiffies
+ *
+ * Called from next_timer_interrupt() to get the next hrtimer event.
+ * Eventually we should change next_timer_interrupt() to return
+ * results in nanoseconds instead of jiffies. Caller must hold xtime_lock.
+ */
+int hrtimer_next_jiffie(unsigned long *next_jiffie)
+{
+ struct hrtimer_base *base = __get_cpu_var(hrtimer_bases);
+ struct hrtimer * timer;
+
+ timer = hrtimer_get_next(base);
+ if (timer == NULL)
+ return -EAGAIN;
+
+ *next_jiffie = ktime_to_jiffies(timer->expires);
+
+ return 0;
+}
+
+#endif
+
/**
* hrtimer_init - initialize a timer to the given clock
*
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -478,6 +478,7 @@
}

#ifdef CONFIG_NO_IDLE_HZ
+
/*
* Find out when the next timer event is due to happen. This
* is used on S/390 to stop all activity when a cpus is idle.
@@ -489,9 +490,15 @@
struct list_head *list;
struct timer_list *nte;
unsigned long expires;
+ unsigned long hr_expires = jiffies + 10 * HZ; /* Anything far ahead */
tvec_t *varray[4];
int i, j;

+ /* Look for timer events in hrtimer. */
+ if ((hrtimer_next_jiffie(&hr_expires) == 0)
+ && (time_before(hr_expires, jiffies + 2)))
+ return hr_expires;
+
base = &__get_cpu_var(tvec_bases);
spin_lock(&base->t_base.lock);
expires = base->timer_jiffies + (LONG_MAX >> 1);
@@ -542,6 +549,10 @@
}
}
spin_unlock(&base->t_base.lock);
+
+ if (time_before(hr_expires, expires))
+ expires = hr_expires;
+
return expires;
}
#endif
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -115,6 +115,7 @@ extern int hrtimer_try_to_cancel(struct
/* Query timers: */
extern ktime_t hrtimer_get_remaining(const struct hrtimer *timer);
extern int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp);
+extern int hrtimer_next_jiffie(unsigned long *next_jiffie);

static inline int hrtimer_active(const struct hrtimer *timer)
{
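
The kernel/timer.c hunk above chooses between the timer wheel expiry and the
hrtimer expiry with time_before(), which stays correct when the jiffies
counter wraps. A minimal userspace sketch of that comparison follows.
sketch_time_before() and sketch_earliest() are hypothetical stand-ins written
for illustration, not the kernel's own definitions.

```c
#include <limits.h>

/*
 * Userspace stand-in for a wrap-safe "a is earlier than b" test on an
 * unsigned tick counter. The unsigned subtraction wraps, and the sign of
 * the result tells which value comes first.
 */
static int sketch_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Hypothetical helper: return whichever expiry comes first. */
static unsigned long sketch_earliest(unsigned long wheel_expires,
				     unsigned long hr_expires)
{
	if (sketch_time_before(hr_expires, wheel_expires))
		return hr_expires;
	return wheel_expires;
}
```

With a plain `<` comparison, a value just past the wrap point (for example 3)
would look earlier than one just before it (ULONG_MAX - 5). The signed
difference orders the two correctly.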



2006-02-24 00:37:56

by john stultz

Subject: Re: [PATCH] Fix next_timer_interrupt() for hrtimer

On Thu, 2006-02-23 at 16:26 -0800, Tony Lindgren wrote:
> + tv = ktime_to_timespec(event);
> +
> + /* Assume read xtime_lock is held, so we can't use getnstimeofday() */
> + sec = xtime.tv_sec;
> + nsec = xtime.tv_nsec;
> + while (unlikely(nsec >= NSEC_PER_SEC)) {
> + nsec -= NSEC_PER_SEC;
> + ++sec;
> + }
> + tv.tv_sec = sec;
> + tv.tv_nsec = nsec;

Er, I think you should be able to nest readers. Thus getnstimeofday()
should be safe to call. Or is the comment wrong and you are assuming a
write lock is held?

thanks
-john


2006-02-24 00:41:13

by Tony Lindgren

Subject: Re: [PATCH] Fix next_timer_interrupt() for hrtimer

* john stultz <[email protected]> [060223 16:37]:
> On Thu, 2006-02-23 at 16:26 -0800, Tony Lindgren wrote:
> > + tv = ktime_to_timespec(event);
> > +
> > + /* Assume read xtime_lock is held, so we can't use getnstimeofday() */
> > + sec = xtime.tv_sec;
> > + nsec = xtime.tv_nsec;
> > + while (unlikely(nsec >= NSEC_PER_SEC)) {
> > + nsec -= NSEC_PER_SEC;
> > + ++sec;
> > + }
> > + tv.tv_sec = sec;
> > + tv.tv_nsec = nsec;
>
> Er, I think you should be able to nest readers. Thus getnstimeofday()
> should be safe to call. Or is the comment wrong and you are assuming a
> write lock is held?

Oops, it's a write lock as next_timer_interrupt gets called from
arch/*/time.c.

Tony
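
A rough illustration of why the write lock matters, assuming xtime_lock is a
sequence lock and getnstimeofday() takes its read side: a reader that starts
while a writer is active keeps retrying until the writer finishes, so a read
attempted from the write-side path itself would never complete. The
single-threaded model below is only a sketch. Every sketch_* name is
hypothetical, and the bounded retry count exists just so the example
terminates.

```c
/* Minimal single-threaded model of a sequence lock. */
struct sketch_seqlock {
	unsigned int sequence;	/* odd while a writer is inside */
};

static void sketch_write_begin(struct sketch_seqlock *sl)
{
	sl->sequence++;		/* becomes odd: write in progress */
}

static void sketch_write_end(struct sketch_seqlock *sl)
{
	sl->sequence++;		/* becomes even again: write finished */
}

static unsigned int sketch_read_begin(const struct sketch_seqlock *sl)
{
	return sl->sequence;
}

/* Non-zero if the read raced with a writer and must be repeated. */
static int sketch_read_retry(const struct sketch_seqlock *sl,
			     unsigned int start)
{
	return (start & 1) || sl->sequence != start;
}

/*
 * Attempt up to max_tries reads. Returns 1 once a read is consistent,
 * 0 if every attempt had to be retried.
 */
static int sketch_try_read(const struct sketch_seqlock *sl, int max_tries)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		unsigned int start = sketch_read_begin(sl);

		/* a real reader would copy the protected data here */
		if (!sketch_read_retry(sl, start))
			return 1;
	}
	return 0;
}
```

Calling sketch_try_read() between sketch_write_begin() and sketch_write_end()
fails however many tries it is given, which is the situation the patch avoids
by reading xtime directly.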

2006-02-24 00:50:19

by john stultz

Subject: Re: [PATCH] Fix next_timer_interrupt() for hrtimer

On Thu, 2006-02-23 at 16:40 -0800, Tony Lindgren wrote:
> * john stultz <[email protected]> [060223 16:37]:
> > On Thu, 2006-02-23 at 16:26 -0800, Tony Lindgren wrote:
> > > + tv = ktime_to_timespec(event);
> > > +
> > > + /* Assume read xtime_lock is held, so we can't use getnstimeofday() */
> > > + sec = xtime.tv_sec;
> > > + nsec = xtime.tv_nsec;
> > > + while (unlikely(nsec >= NSEC_PER_SEC)) {
> > > + nsec -= NSEC_PER_SEC;
> > > + ++sec;
> > > + }
> > > + tv.tv_sec = sec;
> > > + tv.tv_nsec = nsec;
> >
> > Er, I think you should be able to nest readers. Thus getnstimeofday()
> > should be safe to call. Or is the comment wrong and you are assuming a
> > write lock is held?
>
> Oops, it's a write lock as next_timer_interrupt gets called from
> arch/*/time.c.

Also the above code just overwrites tv.

Do you intend instead to add xtime to tv?

thanks
-john


2006-02-24 01:10:55

by Tony Lindgren

Subject: Re: [PATCH] Fix next_timer_interrupt() for hrtimer

This patch adds hrtimer support to next_timer_interrupt()
and fixes the current breakage.

next_timer_interrupt() was broken by the recent patch
6ba1b91213e81aa92b5cf7539f7d2a94ff54947c, which moved
sys_nanosleep() to hrtimer. Since then next_timer_interrupt()
has not checked the hrtimer tree for the next event.

next_timer_interrupt() is needed by dyntick (CONFIG_NO_IDLE_HZ,
VST) implementations, as the system can be idle when the next
hrtimer event is supposed to happen. At least ARM and S390
currently use next_timer_interrupt().

Signed-off-by: Tony Lindgren <[email protected]>

Index: linux-omap-dev/kernel/hrtimer.c
===================================================================
--- linux-omap-dev.orig/kernel/hrtimer.c 2006-02-23 16:19:47.000000000 -0800
+++ linux-omap-dev/kernel/hrtimer.c 2006-02-23 17:04:23.000000000 -0800
@@ -505,6 +505,80 @@
return rem;
}

+#ifdef CONFIG_NO_IDLE_HZ
+
+/**
+ * hrtimer_get_next - get next hrtimer to expire
+ *
+ * @bases: hrtimer base array
+ */
+static inline struct hrtimer * hrtimer_get_next(struct hrtimer_base *bases)
+{
+ unsigned long flags;
+ struct hrtimer *timer = NULL;
+ int i;
+
+ for (i = 0; i < MAX_HRTIMER_BASES; i++) {
+ struct hrtimer_base *base;
+ struct hrtimer *cur;
+
+ base = &bases[i];
+ spin_lock_irqsave(&base->lock, flags);
+ cur = rb_entry(base->first, struct hrtimer, node);
+ spin_unlock_irqrestore(&base->lock, flags);
+
+ if (cur == NULL)
+ continue;
+
+ if (timer == NULL || cur->expires.tv64 < timer->expires.tv64)
+ timer = cur;
+ }
+
+ return timer;
+}
+
+/**
+ * ktime_to_jiffies - converts ktime to jiffies
+ *
+ * @event: ktime event to be converted
+ *
+ * Caller must take care of xtime locking.
+ */
+static inline unsigned long ktime_to_jiffies(const ktime_t event)
+{
+ ktime_t now, delta;
+ struct timespec tv;
+
+ now = timespec_to_ktime(xtime);
+ delta = ktime_sub(event, now);
+ tv = ktime_to_timespec(delta);
+
+ return jiffies - 1 + timespec_to_jiffies(&tv);
+}
+
+/**
+ * hrtimer_next_jiffie - get next hrtimer event in jiffies
+ *
+ * Called from next_timer_interrupt() to get the next hrtimer event.
+ * Eventually we should change next_timer_interrupt() to return
+ * results in nanoseconds instead of jiffies. Caller must hold xtime_lock.
+ */
+int hrtimer_next_jiffie(unsigned long *next_jiffie)
+{
+ struct hrtimer_base *base = __get_cpu_var(hrtimer_bases);
+ struct hrtimer * timer;
+
+ timer = hrtimer_get_next(base);
+ if (timer == NULL)
+ return -EAGAIN;
+
+ *next_jiffie = ktime_to_jiffies(timer->expires);
+
+ return 0;
+}
+
+#endif
+
/**
* hrtimer_init - initialize a timer to the given clock
*
Index: linux-omap-dev/kernel/timer.c
===================================================================
--- linux-omap-dev.orig/kernel/timer.c 2006-02-23 16:19:47.000000000 -0800
+++ linux-omap-dev/kernel/timer.c 2006-02-23 16:19:47.000000000 -0800
@@ -478,6 +478,7 @@
}

#ifdef CONFIG_NO_IDLE_HZ
+
/*
* Find out when the next timer event is due to happen. This
* is used on S/390 to stop all activity when a cpus is idle.
@@ -489,9 +490,15 @@
struct list_head *list;
struct timer_list *nte;
unsigned long expires;
+ unsigned long hr_expires = jiffies + 10 * HZ; /* Anything far ahead */
tvec_t *varray[4];
int i, j;

+ /* Look for timer events in hrtimer. */
+ if ((hrtimer_next_jiffie(&hr_expires) == 0)
+ && (time_before(hr_expires, jiffies + 2)))
+ return hr_expires;
+
base = &__get_cpu_var(tvec_bases);
spin_lock(&base->t_base.lock);
expires = base->timer_jiffies + (LONG_MAX >> 1);
@@ -542,6 +549,10 @@
}
}
spin_unlock(&base->t_base.lock);
+
+ if (time_before(hr_expires, expires))
+ expires = hr_expires;
+
return expires;
}
#endif
Index: linux-omap-dev/include/linux/hrtimer.h
===================================================================
--- linux-omap-dev.orig/include/linux/hrtimer.h 2006-02-23 16:19:43.000000000 -0800
+++ linux-omap-dev/include/linux/hrtimer.h 2006-02-23 16:19:47.000000000 -0800
@@ -115,6 +115,7 @@
/* Query timers: */
extern ktime_t hrtimer_get_remaining(const struct hrtimer *timer);
extern int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp);
+extern int hrtimer_next_jiffie(unsigned long *next_jiffie);

static inline int hrtimer_active(const struct hrtimer *timer)
{



2006-02-25 00:43:56

by Tony Lindgren

Subject: Re: [PATCH] Fix next_timer_interrupt() for hrtimer

This patch adds hrtimer support to next_timer_interrupt()
and fixes the current breakage.

next_timer_interrupt() was broken by the recent patch
6ba1b91213e81aa92b5cf7539f7d2a94ff54947c, which moved
sys_nanosleep() to hrtimer. Since then next_timer_interrupt()
has not checked the hrtimer tree for the next event.

next_timer_interrupt() is needed by dyntick (CONFIG_NO_IDLE_HZ,
VST) implementations, as the system can be idle when the next
hrtimer event is supposed to happen. At least ARM and S390
currently use next_timer_interrupt().

Signed-off-by: Tony Lindgren <[email protected]>

--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -505,6 +505,79 @@
return rem;
}

+#ifdef CONFIG_NO_IDLE_HZ
+
+/**
+ * hrtimer_get_next - get next hrtimer to expire
+ *
+ * @bases: hrtimer base array
+ */
+static inline struct hrtimer * hrtimer_get_next(struct hrtimer_base *bases)
+{
+ unsigned long flags;
+ struct hrtimer *timer = NULL;
+ int i;
+
+ for (i = 0; i < MAX_HRTIMER_BASES; i++) {
+ struct hrtimer_base *base;
+ struct hrtimer *cur;
+
+ base = &bases[i];
+ spin_lock_irqsave(&base->lock, flags);
+ cur = rb_entry(base->first, struct hrtimer, node);
+ spin_unlock_irqrestore(&base->lock, flags);
+
+ if (cur == NULL)
+ continue;
+
+ if (timer == NULL || cur->expires.tv64 < timer->expires.tv64)
+ timer = cur;
+ }
+
+ return timer;
+}
+
+/**
+ * ktime_to_jiffies - converts ktime to jiffies
+ *
+ * @event: ktime event to be converted to jiffies
+ *
+ * Caller must take care of xtime locking.
+ */
+static inline unsigned long ktime_to_jiffies(const ktime_t event)
+{
+ ktime_t now, delta;
+
+ now = timespec_to_ktime(xtime);
+ delta = ktime_sub(event, now);
+
+ return jiffies + (((delta.tv64 * NSEC_CONVERSION) >>
+ (NSEC_JIFFIE_SC - SEC_JIFFIE_SC)) >> SEC_JIFFIE_SC);
+}
+
+/**
+ * hrtimer_next_jiffie - get next hrtimer event in jiffies
+ *
+ * Called from next_timer_interrupt() to get the next hrtimer event.
+ * Eventually we should change next_timer_interrupt() to return
+ * results in nanoseconds instead of jiffies. Caller must hold xtime_lock.
+ */
+int hrtimer_next_jiffie(unsigned long *next_jiffie)
+{
+ struct hrtimer_base *base = __get_cpu_var(hrtimer_bases);
+ struct hrtimer * timer;
+
+ timer = hrtimer_get_next(base);
+ if (timer == NULL)
+ return -EAGAIN;
+
+ *next_jiffie = ktime_to_jiffies(timer->expires);
+
+ return 0;
+}
+
+#endif
+
/**
* hrtimer_init - initialize a timer to the given clock
*
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -478,6 +478,7 @@
}

#ifdef CONFIG_NO_IDLE_HZ
+
/*
* Find out when the next timer event is due to happen. This
* is used on S/390 to stop all activity when a cpus is idle.
@@ -489,9 +490,15 @@
struct list_head *list;
struct timer_list *nte;
unsigned long expires;
+ unsigned long hr_expires = jiffies + 10 * HZ; /* Anything far ahead */
tvec_t *varray[4];
int i, j;

+ /* Look for timer events in hrtimer. */
+ if ((hrtimer_next_jiffie(&hr_expires) == 0)
+ && (time_before(hr_expires, jiffies + 2)))
+ return hr_expires;
+
base = &__get_cpu_var(tvec_bases);
spin_lock(&base->t_base.lock);
expires = base->timer_jiffies + (LONG_MAX >> 1);
@@ -542,6 +549,10 @@
}
}
spin_unlock(&base->t_base.lock);
+
+ if (time_before(hr_expires, expires))
+ expires = hr_expires;
+
return expires;
}
#endif
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -115,6 +115,7 @@
/* Query timers: */
extern ktime_t hrtimer_get_remaining(const struct hrtimer *timer);
extern int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp);
+extern int hrtimer_next_jiffie(unsigned long *next_jiffie);

static inline int hrtimer_active(const struct hrtimer *timer)
{
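
The ktime_to_jiffies() helper in this version turns a nanosecond delta into a
jiffies offset with a scaled multiply and shifts. The arithmetic it
approximates can be sketched in userspace with a plain division instead. The
tick rate SKETCH_HZ, the sketch_* names, and the clamping of already-expired
events are assumptions made for this example and are not taken from the patch.

```c
#include <stdint.h>

#define SKETCH_HZ		100		/* assumed tick rate */
#define SKETCH_NSEC_PER_SEC	1000000000LL

/*
 * Convert an absolute expiry time in nanoseconds into an absolute tick
 * count, given the current time and the current tick. The division
 * truncates, so the result never lands later than the event itself.
 */
static unsigned long sketch_ns_to_jiffies(int64_t event_ns, int64_t now_ns,
					  unsigned long now_jiffies)
{
	int64_t delta_ns = event_ns - now_ns;
	int64_t ns_per_tick = SKETCH_NSEC_PER_SEC / SKETCH_HZ;

	if (delta_ns <= 0)
		return now_jiffies;	/* already due (clamping is an assumption) */

	return now_jiffies + (unsigned long)(delta_ns / ns_per_tick);
}
```

At the assumed 100 ticks per second, an event 25 ms ahead maps to two ticks
from now, because 25 ms divided by 10 ms per tick truncates to 2.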



2006-02-25 02:23:41

by Con Kolivas

Subject: Re: [PATCH] Fix next_timer_interrupt() for hrtimer

On Saturday 25 February 2006 11:43, Tony Lindgren wrote:
> Here's one more version. This one fixes a bug in ktime_to_jiffies()
> by removing - 1 from jiffies and cuts down some conversions too.

Tony, thanks for picking this up. I like how you keep the semantics of
next_timer_interrupt intact. As you've mentioned in your comments we should
move the users of that function to nanosecond timers, and it would be nice to
use a different function name so there's no confusion (suggest
next_hrtimer_interrupt).

Cheers,
Con
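
One way a nanosecond-returning variant along these lines might look, as a
userspace sketch only: scan the earliest expiry of each base and report the
time remaining until the nearest one. The function name, the
SKETCH_NO_TIMER sentinel, and the assumption that all bases share one
timeline are hypothetical and do not come from the thread.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NO_TIMER	INT64_MAX	/* marks a base with nothing queued */

/*
 * first_expiry_ns[i] holds the earliest expiry of base i, or
 * SKETCH_NO_TIMER if that base is empty. On success, stores the
 * nanoseconds until the nearest expiry (0 if it is already due) and
 * returns 0. Returns -1 if no base has a timer queued.
 */
static int sketch_next_event_ns(const int64_t *first_expiry_ns, size_t nbases,
				int64_t now_ns, int64_t *delta_ns)
{
	int64_t earliest = SKETCH_NO_TIMER;
	size_t i;

	for (i = 0; i < nbases; i++)
		if (first_expiry_ns[i] < earliest)
			earliest = first_expiry_ns[i];

	if (earliest == SKETCH_NO_TIMER)
		return -1;

	*delta_ns = earliest > now_ns ? earliest - now_ns : 0;
	return 0;
}
```

Returning a delta in nanoseconds leaves the conversion to hardware timer
units to the caller, so no precision is lost by rounding to jiffies first.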