2014-10-09 07:10:20

by Xunlei Pang

Subject: [PATCH 1/2] time: Fix NTP adjustment mult overflow.

The mult member of struct clocksource should always be a large u32 number when calculated through
__clocksource_updatefreq_scale(). The value of (cs->mult+cs->maxadj) may have a chance to reach very
near 0xFFFFFFFF. For instance, 555MHz oscillator: cs->mult is 0xE6A17102, cs->maxadj is 0x195E8EFD,
cs->mult+cs->maxadj is 0xFFFFFFFF. Such oscillators would probably exist on some processors like
MIPS which use CP0 compare/count CPU clock as the clock source.

Clocksource might encounter large frequency adjustment due to the hardware instability, environment
temperature, software deviation, NTP algorithm accuracy, etc. When NTP slews the clock, kernel goes
through update_wall_time()->...->timekeeping_apply_adjustment(): tk->tkr.mult += mult_adj;
Unfortunately, tk->tkr.mult may overflow after this operation, though such cases are next to impossible
to happen in practice.

This patch avoids the mult overflow by checking for it before adding mult_adj to mult, and emits a
one-time WARNING when such a case is detected.

Signed-off-by: pang.xunlei <[email protected]>
---
kernel/time/timekeeping.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index ec1791f..cad61b3 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1332,6 +1332,12 @@ static __always_inline void timekeeping_apply_adjustment(struct timekeeper *tk,
 	 *
 	 * XXX - TODO: Doc ntp_error calculation.
 	 */
+	if (tk->tkr.mult + mult_adj < mult_adj) {
+		/* NTP adjustment caused clocksource mult overflow */
+		WARN_ON_ONCE(1);
+		return;
+	}
+
 	tk->tkr.mult += mult_adj;
 	tk->xtime_interval += interval;
 	tk->tkr.xtime_nsec -= offset;
--
1.7.10.4
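
The wrap-around test used in the diff above can be tried out in user space. The following is a minimal sketch, not kernel code: the helper names are hypothetical, and it assumes mult is an unsigned 32-bit value while mult_adj is a signed 32-bit value that may be negative when the clock is slowed down. Under that assumption, the comparison as posted would also fire for negative adjustments, because the signed operand is converted to unsigned before the compare; the guarded variant only flags genuine upward wraps.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: the comparison from the diff, with the implicit
 * C conversions written out. A negative mult_adj becomes a huge
 * unsigned value, so the test can be true even though nothing wrapped.
 */
static bool posted_check(uint32_t mult, int32_t mult_adj)
{
	return (uint32_t)(mult + (uint32_t)mult_adj) < (uint32_t)mult_adj;
}

/*
 * Hypothetical helper: only a positive adjustment can push mult past
 * 0xFFFFFFFF. Unsigned addition wraps modulo 2^32, and a wrapped sum
 * is always smaller than either operand.
 */
static bool mult_add_overflows(uint32_t mult, int32_t mult_adj)
{
	if (mult_adj <= 0)
		return false;

	return (uint32_t)(mult + (uint32_t)mult_adj) < (uint32_t)mult_adj;
}
```

With the changelog's example values, mult_add_overflows(0xE6A17102, 0x195E8EFD) is false because the sum is exactly 0xFFFFFFFF, while an adjustment one larger wraps the sum to 0 and is reported.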


2014-10-09 07:10:31

by Xunlei Pang

Subject: [PATCH 2/2] time: Complete NTP adjustment threshold judging conditions

The clocksource mult-adjustment threshold is [mult-maxadj, mult+maxadj],
timekeeping_adjust() only deals with the upper threshold, but misses the
lower threshold.

This patch adds the lower threshold judging condition.

Signed-off-by: pang.xunlei <[email protected]>
---
kernel/time/timekeeping.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index cad61b3..469cdbf 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1403,7 +1403,7 @@ static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
 	}
 
 	if (unlikely(tk->tkr.clock->maxadj &&
-		(tk->tkr.mult > tk->tkr.clock->mult + tk->tkr.clock->maxadj))) {
+		(abs(tk->tkr.mult - tk->tkr.clock->mult) > tk->tkr.clock->maxadj))) {
 		printk_once(KERN_WARNING
 			"Adjusting %s more than 11%% (%ld vs %ld)\n",
 			tk->tkr.clock->name, (long)tk->tkr.mult,
--
1.7.10.4
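
The difference between the one-sided and the two-sided threshold can be sketched in plain C. This is an illustration under stated assumptions, not the kernel implementation: the function names are hypothetical, and the arithmetic is widened to 64 bits so that neither base + maxadj nor the subtraction can wrap, which sidesteps the type subtleties of applying abs() to a u32 difference.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: the pre-patch condition, which only sees mult > base + maxadj. */
static bool exceeds_upper_only(uint32_t mult, uint32_t base, uint32_t maxadj)
{
	return (uint64_t)mult > (uint64_t)base + maxadj;
}

/* Hypothetical: true when mult lies outside [base - maxadj, base + maxadj]. */
static bool exceeds_either_bound(uint32_t mult, uint32_t base, uint32_t maxadj)
{
	int64_t delta = (int64_t)mult - (int64_t)base;

	if (delta < 0)
		delta = -delta;

	return delta > (int64_t)maxadj;
}
```

A mult that has drifted below base - maxadj is missed by the first function and caught by the second; values exactly on either boundary are accepted by both.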

2014-10-21 10:19:59

by Thomas Gleixner

Subject: Re: [PATCH 1/2] time: Fix NTP adjustment mult overflow.

On Thu, 9 Oct 2014, pang.xunlei wrote:

First of all: Please use proper line breaks in the changelog.

> The mult member of struct clocksource should always be a large u32
> number when calculated through __clocksource_updatefreq_scale(). The
> value of (cs->mult+cs->maxadj) may have a chance to reach very near
> 0xFFFFFFFF.

And what's the actual problem with reaching a value near 0xFFFFFFFF?

> For instance, 555MHz oscillator: cs->mult is 0xE6A17102,
> cs->maxadj is 0x195E8EFD, cs->mult+cs->maxadj is 0xFFFFFFFF. Such
> oscillators would probably exist on some processors like MIPS which
> use CP0 compare/count CPU clock as the clock source.

Again, what's the problem?

> Clocksource might encounter large frequency adjustment due to the
> hardware instability, environment temperature, software deviation,
> NTP algorithm accuracy, etc. When NTP slews the clock, kernel goes
> through update_wall_time()->...->timekeeping_apply_adjustment():
> tk->tkr.mult += mult_adj; Unfortunately, tk->tkr.mult may overflow
> after this operation, though such cases are next to impossible to
> happen in practice.

So you are adding this just for correctness reasons, not because you
observed the problem in practice?

Thanks,

tglx

2014-10-21 16:11:10

by John Stultz

Subject: Re: [PATCH 1/2] time: Fix NTP adjustment mult overflow.

On Tue, Oct 21, 2014 at 3:19 AM, Thomas Gleixner <[email protected]> wrote:
> On Thu, 9 Oct 2014, pang.xunlei wrote:
>
> First of all: Please use proper line breaks in the changelog.
>
>> The mult member of struct clocksource should always be a large u32
>> number when calculated through __clocksource_updatefreq_scale(). The
>> value of (cs->mult+cs->maxadj) may have a chance to reach very near
>> 0xFFFFFFFF.
>
> And what's the actual problem with reaching a value near 0xFFFFFFFF?
>
>> For instance, 555MHz oscillator: cs->mult is 0xE6A17102,
>> cs->maxadj is 0x195E8EFD, cs->mult+cs->maxadj is 0xFFFFFFFF. Such
>> oscillators would probably exist on some processors like MIPS which
>> use CP0 compare/count CPU clock as the clock source.
>
> Again, what's the problem?

So the problem is that with an adjustment the mult value might
overflow, becoming very small, basically causing time to stop
increasing. This is mentioned below, but I agree we're burying the
headline a bit.


>> Clocksource might encounter large frequency adjustment due to the
>> hardware instability, environment temperature, software deviation,
>> NTP algorithm accuracy, etc. When NTP slews the clock, kernel goes
>> through update_wall_time()->...->timekeeping_apply_adjustment():
>> tk->tkr.mult += mult_adj; Unfortunately, tk->tkr.mult may overflow
>> after this operation, though such cases are next to impossible to
>> happen in practice.
>
> So you are adding this just for correctness reasons, not because you
> observed the problem in practice?

This is my understanding.

I'll work with Xunlei to make further clarifications to the changelog
to make this more explicit.

Thanks for your feedback!
-john
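
To illustrate the stall described in the reply above, here is a small sketch of the usual cycles-to-nanoseconds scaling, ns = (cycles * mult) >> shift. The helper names are hypothetical, and shift = 31 is an assumption inferred from the changelog's numbers (0xE6A17102 / 2^31 is roughly 1.8018 ns per cycle, which matches a 555MHz counter); the thread itself does not state the shift.

```c
#include <stdint.h>

/* Hypothetical helper: scale a cycle delta to nanoseconds using mult/shift. */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}

/* Hypothetical helper: u32 addition as the unguarded "mult += mult_adj" performs it. */
static uint32_t add_wrapping(uint32_t mult, uint32_t adj)
{
	return mult + adj;	/* wraps modulo 2^32 */
}
```

With the healthy multiplier, one second's worth of cycles (555000000) scales to just under 10^9 ns. If mult had reached 0xFFFFFFFF and a further adjustment of 2 wrapped it to 1, the same cycle count would scale to 0 ns, so the clock would effectively stop advancing.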

2014-10-21 19:35:12

by Thomas Gleixner

Subject: Re: [PATCH 1/2] time: Fix NTP adjustment mult overflow.

On Tue, 21 Oct 2014, John Stultz wrote:
> On Tue, Oct 21, 2014 at 3:19 AM, Thomas Gleixner <[email protected]> wrote:
> > So you are adding this just for correctness reasons, not because you
> > observed the problem in practice?
>
> This is my understanding.
>
> I'll work with Xunlei to make further clarifications to the changelog
> to make this more explicit.

Ok. I have no objections to the patch itself, just the changelog made
my brain go into spiral mode ...

Thanks,

tglx

2014-10-24 04:22:42

by John Stultz

Subject: Re: [PATCH 1/2] time: Fix NTP adjustment mult overflow.

On Wed, Oct 22, 2014 at 5:37 AM, Xunlei Pang <[email protected]> wrote:
> The mult member of struct clocksource should always be a large u32 number
> when calculated through __clocksource_updatefreq_scale(). The value of
> (cs->mult+cs->maxadj) may have a chance to reach very near 0xFFFFFFFF, so
> it may overflow when doing NTP positive adjustment, see the following
> detail:
> When NTP slews the clock, kernel goes through
> update_wall_time()->...->timekeeping_apply_adjustment():
> tk->tkr.mult += mult_adj;
> Unfortunately, tk->tkr.mult may overflow after this operation.
>
> This patch avoids mult overflow by judging the overflow case before adding
> mult_adj to mult, also adds the WARNING message when capturing such case.
>
> Signed-off-by: pang.xunlei <[email protected]>

I reworded this a bit further, but it's in my queue for 3.19.

thanks
-john

2014-10-24 04:34:56

by John Stultz

Subject: Re: [PATCH 2/2] time: Complete NTP adjustment threshold judging conditions

On Thu, Oct 9, 2014 at 12:04 AM, pang.xunlei <[email protected]> wrote:
> The clocksource mult-adjustment threshold is [mult-maxadj, mult+maxadj],
> timekeeping_adjust() only deals with the upper threshold, but misses the
> lower threshold.
>
> This patch adds the lower threshold judging condition.
>
> Signed-off-by: pang.xunlei <[email protected]>

Added to my 3.19 queue.

thanks!
-john