tv_nsec is a long, and on 32 bit archs adding the shifted raw interval
to it can wrap the value negative, which later causes getrawmonotonic()
to loop. The edge case occurs when the system has slept for a short
period of about 2 seconds.
A trace printk of the values in this patch illustrates the problem:
ftrace time stamp: log
43.716079: logarithmic_accumulation: raw_shift: 3d0913 tv_nsec d687faa
43.718513: logarithmic_accumulation: raw_shift: 3d0913 tv_nsec da588bd
43.722161: logarithmic_accumulation: raw_shift: 3d0913 tv_nsec de291d0
46.349925: logarithmic_accumulation: raw_shift: 7a122600 tv_nsec e1f9ae3
46.349930: logarithmic_accumulation: raw_shift: 1e848980 tv_nsec 8831c0e3
The kernel starts looping at 46.349925 in getrawmonotonic() because
adding the raw_shift to tv_nsec produced a negative value.
A simple solution is to process the raw_shift separately from the
tv_nsec using the same type of loop in logarithmic_accumulation().
Signed-off-by: Jason Wessel <[email protected]>
CC: John Stultz <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: H. Peter Anvin <[email protected]>
---
kernel/time/timekeeping.c | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index caf8d4d..1762282 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -736,6 +736,7 @@ static void timekeeping_adjust(s64 offset)
 static cycle_t logarithmic_accumulation(cycle_t offset, int shift)
 {
 	u64 nsecps = (u64)NSEC_PER_SEC << timekeeper.shift;
+	u64 raw_shift;
 
 	/* If the offset is smaller then a shifted interval, do nothing */
 	if (offset < timekeeper.cycle_interval<<shift)
@@ -753,7 +754,12 @@ static cycle_t logarithmic_accumulation(cycle_t offset, int shift)
 	}
 
 	/* Accumulate into raw time */
-	raw_time.tv_nsec += timekeeper.raw_interval << shift;;
+	raw_shift = timekeeper.raw_interval << shift;
+	while (raw_shift >= NSEC_PER_SEC) {
+		raw_shift -= NSEC_PER_SEC;
+		raw_time.tv_sec++;
+	}
+	raw_time.tv_nsec += raw_shift;
 	while (raw_time.tv_nsec >= NSEC_PER_SEC) {
 		raw_time.tv_nsec -= NSEC_PER_SEC;
 		raw_time.tv_sec++;
--
1.6.3.3
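
For illustration only, and not part of the patch: a minimal user-space sketch of the wrap described above. It assumes a 32-bit long, modeled here with int32_t, and reuses the hex values from the trace. The helper name add_wrapping32 is hypothetical and does not exist in the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helper: model "tv_nsec += raw" where tv_nsec is a
 * 32-bit signed long. The addition is done in uint32_t, where
 * wraparound is well defined, and the result is then mapped back to
 * the signed value a 32-bit two's complement long would hold.
 */
static int32_t add_wrapping32(int32_t tv_nsec, uint32_t raw)
{
	uint32_t sum = (uint32_t)tv_nsec + raw;
	int64_t wide = (int64_t)sum;

	/* Values above INT32_MAX represent negative numbers. */
	if (wide > INT32_MAX)
		wide -= 4294967296LL;	/* subtract 2^32 */

	return (int32_t)wide;
}
```

With the trace values, add_wrapping32(0x0e1f9ae3, 0x7a122600) yields the bit pattern 0x8831c0e3, which is -2010005277 as a signed 32-bit number. A loop of the form "while (tv_nsec >= NSEC_PER_SEC)" never runs for such a value, so tv_nsec is left negative for later readers.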
On Thu, 2010-08-05 at 07:28 -0500, Jason Wessel wrote:
> The tv_nsec is a long and when added to the shift value it can wrap
> and become negative which later causes looping problems in the
> getrawmonotonic(). The edge case occurs when the system has slept for
> a short period of time of ~2 seconds.
Ah, good catch!
I reworked some of the variable names to make a little more sense and
simplified the accumulation. Do you mind giving this a test in your
environment that triggered the issue to make sure nothing else slipped
in?
thanks
-john
From 512349b1f7ab0d9b6dff5e33bf4820a50e79f862 Mon Sep 17 00:00:00 2001
From: Jason Wessel <[email protected]>
Date: Thu, 5 Aug 2010 07:28:32 -0500
Subject: [PATCH] timekeeping: Fix overflow in rawtime tv_nsec on 32 bit archs
tv_nsec is a long, and on 32 bit archs adding the shifted interval to
it can wrap the value negative, which later causes getrawmonotonic()
to loop. The edge case occurs when the system has slept for a short
period of about 2 seconds.
A trace printk of the values in this patch illustrates the problem:
ftrace time stamp: log
43.716079: logarithmic_accumulation: raw: 3d0913 tv_nsec d687faa
43.718513: logarithmic_accumulation: raw: 3d0913 tv_nsec da588bd
43.722161: logarithmic_accumulation: raw: 3d0913 tv_nsec de291d0
46.349925: logarithmic_accumulation: raw: 7a122600 tv_nsec e1f9ae3
46.349930: logarithmic_accumulation: raw: 1e848980 tv_nsec 8831c0e3
The kernel starts looping at 46.349925 in getrawmonotonic() because
adding the raw value to tv_nsec produced a negative value.
A simple solution is to accumulate into a u64, and then normalize it
to a timespec_t.
Signed-off-by: Jason Wessel <[email protected]>
Reworked variable names and simplified some of the code.
Signed-off-by: John Stultz <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: H. Peter Anvin <[email protected]>
---
kernel/time/timekeeping.c | 11 +++++++----
1 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index caf8d4d..6603860 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -736,6 +736,7 @@ static void timekeeping_adjust(s64 offset)
 static cycle_t logarithmic_accumulation(cycle_t offset, int shift)
 {
 	u64 nsecps = (u64)NSEC_PER_SEC << timekeeper.shift;
+	u64 raw_nsecs;
 
 	/* If the offset is smaller then a shifted interval, do nothing */
 	if (offset < timekeeper.cycle_interval<<shift)
@@ -752,12 +753,14 @@ static cycle_t logarithmic_accumulation(cycle_t offset, int shift)
 		second_overflow();
 	}
 
-	/* Accumulate into raw time */
-	raw_time.tv_nsec += timekeeper.raw_interval << shift;;
-	while (raw_time.tv_nsec >= NSEC_PER_SEC) {
-		raw_time.tv_nsec -= NSEC_PER_SEC;
+	/* Accumulate raw time */
+	raw_nsecs = timekeeper.raw_interval << shift;
+	raw_nsecs += raw_time.tv_nsec;
+	while (raw_nsecs >= NSEC_PER_SEC) {
+		raw_nsecs -= NSEC_PER_SEC;
 		raw_time.tv_sec++;
 	}
+	raw_time.tv_nsec = raw_nsecs;
 
 	/* Accumulate error between NTP and clock interval */
 	timekeeper.ntp_error += tick_length << shift;
--
1.6.0.4
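
For illustration only, and not part of the patch: a user-space sketch of the "accumulate into a u64, then normalize" approach from the reworked version. The struct raw_ts and the function accumulate_raw are hypothetical stand-ins for raw_time and the kernel code, and the 32-bit fields are an assumption that models a 32 bit arch.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for a timespec with 32-bit fields. */
struct raw_ts {
	int32_t tv_sec;
	int32_t tv_nsec;	/* assumed to be in [0, NSEC_PER_SEC) on entry */
};

/*
 * Add (raw_interval << shift) nanoseconds to ts. The sum is kept in a
 * 64-bit unsigned temporary, so it cannot wrap the 32-bit tv_nsec, and
 * whole seconds are carried into tv_sec before the remainder is stored.
 */
static void accumulate_raw(struct raw_ts *ts, uint64_t raw_interval, int shift)
{
	uint64_t raw_nsecs = raw_interval << shift;

	raw_nsecs += (uint64_t)ts->tv_nsec;
	while (raw_nsecs >= NSEC_PER_SEC) {
		raw_nsecs -= NSEC_PER_SEC;
		ts->tv_sec++;
	}
	/* raw_nsecs is now below NSEC_PER_SEC, so it fits in 32 bits. */
	ts->tv_nsec = (int32_t)raw_nsecs;
}
```

As a usage example with the trace values: 0x3d0913 << 9 equals the 0x7a122600 seen in the log (the shift of 9 is inferred from those two numbers, not stated in the thread). Starting from tv_nsec = 0x0e1f9ae3 and an arbitrary tv_sec of 46, accumulate_raw(&ts, 0x3d0913, 9) leaves tv_sec at 48 and tv_nsec at 284962019, with no negative intermediate value.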
On 08/05/2010 05:17 PM, john stultz wrote:
> On Thu, 2010-08-05 at 07:28 -0500, Jason Wessel wrote:
>> The tv_nsec is a long and when added to the shift value it can wrap
>> and become negative which later causes looping problems in the
>> getrawmonotonic(). The edge case occurs when the system has slept for
>> a short period of time of ~2 seconds.
>
> Ah, good catch!
>
> I reworked some of the variable names to make a little more sense and
> simplified the accumulation. Do you mind giving this a test in your
> environment that triggered the issue to make sure nothing else slipped
> in?
>
No problem.
This looks good to me. I even increased the delay, and I can see it recovers properly.
The instrumentation shows that raw_nsecs would otherwise have gone negative between 90.* and 97.* in the log.
<...>-4801 [000] 90.105084: update_wall_time: raw_nsecs: 37283ea1
<...>-4801 [000] 90.109078: update_wall_time: raw_nsecs: 376547b4
<...>-4801 [000] 97.694264: update_wall_time: raw_nsecs: b1776db4
<...>-4801 [000] 97.694270: update_wall_time: raw_nsecs: b453ffb4
<...>-4801 [000] 97.694272: update_wall_time: raw_nsecs: 7b95c7b4
Note that I had instrumented it just after:
raw_nsecs += raw_time.tv_nsec;
We should send this over to -stable once it is considered baked, because the bug was found in 2.6.35 and may be a problem elsewhere as well.
Thanks,
Jason.