2024-03-25 13:06:58

by Adrian Hunter

Subject: [PATCH V2 00/19] timekeeping: Handle potential multiplication overflow

Hi

Kernel timekeeping calculates a clock value by keeping a base value and
adding the number of nanoseconds since that base was last updated. Those
nanoseconds are calculated from the clocksource delta. Then periodically,
the base value is moved forward (see timekeeping_advance()), which is done
by the local timer interrupt handler. It is designed such that there will
always be a timer interrupt before the delta becomes big enough to
overflow the 64-bit multiplication used in the conversion of delta to
nanoseconds (see timekeeping_delta_to_ns()). Obviously, if timer
interrupts are stopped, the multiplication does eventually overflow.
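
For illustration, the conversion is conceptually as follows (a simplified
userspace sketch, not the exact kernel code; the real version also folds
in tkr->xtime_nsec before shifting, and the mult/shift values here are
assumed, not taken from a real machine):

    #include <stdint.h>
    #include <stdio.h>

    typedef uint64_t u64;
    typedef uint32_t u32;

    /*
     * Convert a clocksource delta to nanoseconds using the usual
     * mult/shift scheme. The 64-bit product delta * mult silently
     * wraps once delta exceeds U64_MAX / mult, which is why the
     * kernel keeps delta below the clocksource's max_cycles.
     */
    static u64 cycles_to_ns(u64 delta, u32 mult, u32 shift)
    {
            return (delta * mult) >> shift;
    }

    int main(void)
    {
            /*
             * Illustrative mult/shift pair for a 3 GHz clocksource:
             * mult / 2^shift ~= 1/3 ns per cycle.
             */
            u64 ns = cycles_to_ns(3000000000ULL, 5592405, 24);

            /* Prints ~1000000000: one second's worth of 3 GHz cycles */
            printf("%llu\n", (unsigned long long)ns);
            return 0;
    }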

Timekeeping multiplication overflow results in a "time loop", typically
cycling about every 15 minutes with x86 TSC, for example starting at 10:00:

10:00, 10:01, 10:02 ... 10:15, 10:00, 10:01, ... 10:15, 10:00, 10:01 ...
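
Back-of-envelope, the loop period follows from the arithmetic above: the
64-bit product wraps every 2^64 / mult cycles, which converts to
2^(64 - shift) nanoseconds of apparent time. A sketch using the
illustrative shift value of 24 from above (the real mult/shift pair is
computed per clocksource and varies by machine, hence "about every 15
minutes"):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
            unsigned int shift = 24;    /* assumed, illustrative for TSC */

            /* Apparent time per wrap: 2^(64 - shift) ns */
            double minutes = (double)(1ULL << (64 - shift)) / 1e9 / 60.0;

            printf("time loop period ~= %.1f minutes\n", minutes); /* ~18.3 */
            return 0;
    }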

Because a VMM can deliberately stop timer interrupts for a guest, a virtual
machine can be exposed to this issue.

TDX maintains a monotonically increasing virtual TSC for a TDX guest, so
the overflow allows a backwards movement of timekeeping that would not
otherwise happen.

This could break the security of cryptographic protocols that rely on
timestamps for freshness / replay protection, so the kernel should
prevent such a time loop.

Handle multiplication overflows by falling back to a higher precision
calculation when the possibility of an overflow is detected.

Extend the facility also to the VDSO, dependent on the new config option
GENERIC_VDSO_OVERFLOW_PROTECT, which is selected by x86 only, so other
architectures are not affected. The result is a calculation with
performance similar to before. Most machines showed a performance benefit,
except Skylake-based hardware such as Intel Kaby Lake, which was seen to be
<1% worse.


Changes in V2:
    vdso: Consolidate vdso_calc_delta()
        - Keep powerpc comment about mask
        - Move ifdef out of function
    vdso: Consolidate nanoseconds calculation
        - Adjusted due to changes in "vdso: Consolidate vdso_calc_delta()"


Adrian Hunter (19):
vdso: Consolidate vdso_calc_delta()
vdso: Consolidate nanoseconds calculation
vdso: Add CONFIG_GENERIC_VDSO_OVERFLOW_PROTECT
math64: Tidy mul_u64_u32_shr()
vdso: math64: Provide mul_u64_u32_add_u64_shr()
vdso: Add vdso_data::max_cycles
vdso: Make delta calculation overflow safe
x86/vdso: Make delta calculation overflow safe
timekeeping: Move timekeeping helper functions
timekeeping: Rename fast_tk_get_delta_ns() to __timekeeping_get_ns()
timekeeping: Tidy timekeeping_cycles_to_ns() slightly
timekeeping: Reuse timekeeping_cycles_to_ns()
timekeeping: Refactor timekeeping helpers
timekeeping: Consolidate timekeeping helpers
timekeeping: Fold in timekeeping_delta_to_ns()
timekeeping: Prepare timekeeping_cycles_to_ns() for overflow safety
timekeeping: Make delta calculation overflow safe
timekeeping: Let timekeeping_cycles_to_ns() handle both under and overflow
clocksource: Make watchdog and suspend-timing multiplication overflow safe

arch/powerpc/include/asm/vdso/gettimeofday.h | 26 +++----
arch/s390/include/asm/vdso/gettimeofday.h | 7 +-
arch/x86/Kconfig | 1 +
arch/x86/include/asm/vdso/gettimeofday.h | 42 +++++++----
include/linux/math64.h | 8 +-
include/vdso/datapage.h | 4 +
include/vdso/math64.h | 38 ++++++++++
kernel/time/clocksource.c | 42 +++++------
kernel/time/timekeeping.c | 106 ++++++++++++++-------------
kernel/time/vsyscall.c | 6 ++
lib/vdso/Kconfig | 7 ++
lib/vdso/gettimeofday.c | 55 +++++++++-----
12 files changed, 208 insertions(+), 134 deletions(-)


Regards
Adrian


2024-03-25 13:08:19

by Adrian Hunter

Subject: [PATCH V2 17/19] timekeeping: Make delta calculation overflow safe

Kernel timekeeping is designed to keep the change in cycles (since the last
timer interrupt) below max_cycles, which prevents multiplication overflow
when converting cycles to nanoseconds. However, if timer interrupts stop,
the calculation will eventually overflow.

Add protection against that. In the timekeeping_cycles_to_ns() calculation,
check against max_cycles, falling back to a slower, higher precision
calculation. In timekeeping_forward_now(), process the delta in chunks of
at most max_cycles.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
---
kernel/time/timekeeping.c | 40 ++++++++++++++++++++++++++++-----------
1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index d17484082e2c..111dfdbd488f 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -364,19 +364,32 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
}

/* Timekeeper helper functions. */
+static noinline u64 delta_to_ns_safe(const struct tk_read_base *tkr, u64 delta)
+{
+ return mul_u64_u32_add_u64_shr(delta, tkr->mult, tkr->xtime_nsec, tkr->shift);
+}
+
static inline u64 timekeeping_cycles_to_ns(const struct tk_read_base *tkr, u64 cycles)
{
/* Calculate the delta since the last update_wall_time() */
u64 mask = tkr->mask, delta = (cycles - tkr->cycle_last) & mask;

- if (IS_ENABLED(CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE)) {
- /*
- * Handle clocksource inconsistency between CPUs to prevent
- * time from going backwards by checking for the MSB of the
- * mask being set in the delta.
- */
- if (unlikely(delta & ~(mask >> 1)))
- return tkr->xtime_nsec >> tkr->shift;
+ /*
+ * This detects the case where the delta overflows the multiplication
+ * with tkr->mult.
+ */
+ if (unlikely(delta > tkr->clock->max_cycles)) {
+ if (IS_ENABLED(CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE)) {
+ /*
+ * Handle clocksource inconsistency between CPUs to prevent
+ * time from going backwards by checking for the MSB of the
+ * mask being set in the delta.
+ */
+ if (unlikely(delta & ~(mask >> 1)))
+ return tkr->xtime_nsec >> tkr->shift;
+ }
+
+ return delta_to_ns_safe(tkr, delta);
}

return ((delta * tkr->mult) + tkr->xtime_nsec) >> tkr->shift;
@@ -789,10 +802,15 @@ static void timekeeping_forward_now(struct timekeeper *tk)
tk->tkr_mono.cycle_last = cycle_now;
tk->tkr_raw.cycle_last = cycle_now;

- tk->tkr_mono.xtime_nsec += delta * tk->tkr_mono.mult;
- tk->tkr_raw.xtime_nsec += delta * tk->tkr_raw.mult;
+ while (delta > 0) {
+ u64 max = tk->tkr_mono.clock->max_cycles;
+ u64 incr = delta < max ? delta : max;
+
- tk_normalize_xtime(tk);
+ tk->tkr_mono.xtime_nsec += incr * tk->tkr_mono.mult;
+ tk->tkr_raw.xtime_nsec += incr * tk->tkr_raw.mult;
+ tk_normalize_xtime(tk);
+ delta -= incr;
+ }
}

/**
--
2.34.1
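
The chunked loop relies on two properties: each incr * mult product
cannot overflow because incr never exceeds max_cycles, and
tk_normalize_xtime() drains whole seconds out of xtime_nsec on every
iteration so the accumulator stays bounded. A standalone sketch of the
same pattern (illustrative userspace code with a hypothetical function
name, and without the per-iteration normalization):

    #include <stdint.h>

    /*
     * Accumulate delta * mult in chunks of at most max_cycles, keeping
     * the accumulator in shifted units (ns << shift), mirroring
     * tkr->xtime_nsec, so no fractional nanoseconds are lost between
     * chunks. Unlike the kernel, nothing drains the accumulator here,
     * so this sketch assumes the total fits in 64 bits.
     */
    static uint64_t accumulate_ns(uint64_t delta, uint64_t max_cycles,
                                  uint32_t mult, uint32_t shift)
    {
            uint64_t xtime_nsec = 0;

            while (delta > 0) {
                    uint64_t incr = delta < max_cycles ? delta : max_cycles;

                    xtime_nsec += incr * mult;  /* safe: incr <= max_cycles */
                    delta -= incr;
            }
            return xtime_nsec >> shift;
    }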


2024-03-25 13:08:20

by Adrian Hunter

Subject: [PATCH V2 12/19] timekeeping: Reuse timekeeping_cycles_to_ns()

Simplify __timekeeping_get_ns() by reusing timekeeping_cycles_to_ns().

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
---
kernel/time/timekeeping.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index c698219b152d..f81d675291e0 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -391,10 +391,7 @@ static inline u64 timekeeping_cycles_to_ns(const struct tk_read_base *tkr, u64 c

static __always_inline u64 __timekeeping_get_ns(const struct tk_read_base *tkr)
{
- u64 delta, cycles = tk_clock_read(tkr);
-
- delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);
- return timekeeping_delta_to_ns(tkr, delta);
+ return timekeeping_cycles_to_ns(tkr, tk_clock_read(tkr));
}

static inline u64 timekeeping_get_ns(const struct tk_read_base *tkr)
--
2.34.1


2024-03-25 13:08:32

by Adrian Hunter

Subject: [PATCH V2 08/19] x86/vdso: Make delta calculation overflow safe

Kernel timekeeping is designed to keep the change in cycles (since the last
timer interrupt) below max_cycles, which prevents multiplication overflow
when converting cycles to nanoseconds. However, if timer interrupts stop,
the calculation will eventually overflow.

Add protection against that. Select GENERIC_VDSO_OVERFLOW_PROTECT so that
max_cycles is made available in the VDSO data page. Check against
max_cycles, falling back to a slower, higher precision calculation. Take
advantage of the opportunity to move the masking and negative motion check
into the slow path.

The result is a calculation with performance similar to before. Newer
machines showed a performance benefit, whereas older Skylake-based hardware
such as Intel Kaby Lake was seen to be <1% worse.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
---
arch/x86/Kconfig | 1 +
arch/x86/include/asm/vdso/gettimeofday.h | 29 +++++++++++++++++-------
2 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 03483b23a009..3a70ebb558e7 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -168,6 +168,7 @@ config X86
select GENERIC_TIME_VSYSCALL
select GENERIC_GETTIMEOFDAY
select GENERIC_VDSO_TIME_NS
+ select GENERIC_VDSO_OVERFLOW_PROTECT
select GUP_GET_PXX_LOW_HIGH if X86_PAE
select HARDIRQS_SW_RESEND
select HARDLOCKUP_CHECK_TIMESTAMP if X86_64
diff --git a/arch/x86/include/asm/vdso/gettimeofday.h b/arch/x86/include/asm/vdso/gettimeofday.h
index 5727dedd3549..0ef36190abe6 100644
--- a/arch/x86/include/asm/vdso/gettimeofday.h
+++ b/arch/x86/include/asm/vdso/gettimeofday.h
@@ -319,18 +319,31 @@ static inline bool arch_vdso_cycles_ok(u64 cycles)
*/
static __always_inline u64 vdso_calc_ns(const struct vdso_data *vd, u64 cycles, u64 base)
{
+ u64 delta = cycles - vd->cycle_last;
+
/*
+ * Negative motion and deltas which can cause multiplication
+ * overflow require special treatment. This check covers both as
+ * negative motion is guaranteed to be greater than @vd::max_cycles
+ * due to unsigned comparison.
+ *
* Due to the MSB/Sign-bit being used as invalid marker (see
- * arch_vdso_cycles_valid() above), the effective mask is S64_MAX.
+ * arch_vdso_cycles_valid() above), the effective mask is S64_MAX,
+ * but that case is also unlikely and will also take the unlikely path
+ * here.
*/
- u64 delta = (cycles - vd->cycle_last) & S64_MAX;
+ if (unlikely(delta > vd->max_cycles)) {
+ /*
+ * Due to the above mentioned TSC wobbles, filter out
+ * negative motion. Per the above masking, the effective
+ * sign bit is now bit 62.
+ */
+ if (delta & (1ULL << 62))
+ return base >> vd->shift;
+
- /*
- * Due to the above mentioned TSC wobbles, filter out negative motion.
- * Per the above masking, the effective sign bit is now bit 62.
- */
- if (unlikely(delta & (1ULL << 62)))
- return base >> vd->shift;
+ /* Handle multiplication overflow gracefully */
+ return mul_u64_u32_add_u64_shr(delta & S64_MAX, vd->mult, base, vd->shift);
+ }

return ((delta * vd->mult) + base) >> vd->shift;
}
--
2.34.1
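
The negative motion handling can be sanity-checked in isolation: a small
backwards step produces a huge unsigned delta, which both exceeds
max_cycles (so the unsigned comparison takes the slow path) and, after
masking with S64_MAX, has bit 62 set. A quick userspace check (the
max_cycles value is illustrative):

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
            uint64_t cycle_last = 1000, cycles = 995;   /* negative motion */
            uint64_t delta = cycles - cycle_last;       /* wraps to ~2^64 */
            uint64_t max_cycles = 1ULL << 40;           /* illustrative */

            /* Unsigned comparison: negative motion always takes the slow path */
            assert(delta > max_cycles);

            /* After masking with S64_MAX, the effective sign bit is bit 62 */
            assert((delta & (uint64_t)INT64_MAX) & (1ULL << 62));
            return 0;
    }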


2024-03-25 13:08:44

by Adrian Hunter

Subject: [PATCH V2 15/19] timekeeping: Fold in timekeeping_delta_to_ns()

timekeeping_delta_to_ns() is now called only from
timekeeping_cycles_to_ns(), and it is not useful otherwise. Simplify by
folding it into timekeeping_cycles_to_ns().

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
---
kernel/time/timekeeping.c | 13 +------------
1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 1bbfe1ff8d24..749387f22f0f 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -364,23 +364,12 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
}

/* Timekeeper helper functions. */
-
-static inline u64 timekeeping_delta_to_ns(const struct tk_read_base *tkr, u64 delta)
-{
- u64 nsec;
-
- nsec = delta * tkr->mult + tkr->xtime_nsec;
- nsec >>= tkr->shift;
-
- return nsec;
-}
-
static inline u64 timekeeping_cycles_to_ns(const struct tk_read_base *tkr, u64 cycles)
{
/* Calculate the delta since the last update_wall_time() */
u64 delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);

- return timekeeping_delta_to_ns(tkr, delta);
+ return ((delta * tkr->mult) + tkr->xtime_nsec) >> tkr->shift;
}

static __always_inline u64 __timekeeping_get_ns(const struct tk_read_base *tkr)
--
2.34.1


2024-03-25 15:05:27

by Adrian Hunter

Subject: [PATCH V2 11/19] timekeeping: Tidy timekeeping_cycles_to_ns() slightly

Put together declaration and initialization of the local variable 'delta'.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
---
kernel/time/timekeeping.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 63061332a75c..c698219b152d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -383,10 +383,9 @@ static inline u64 timekeeping_delta_to_ns(const struct tk_read_base *tkr, u64 de

static inline u64 timekeeping_cycles_to_ns(const struct tk_read_base *tkr, u64 cycles)
{
- u64 delta;
+ /* Calculate the delta since the last update_wall_time() */
+ u64 delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);

- /* calculate the delta since the last update_wall_time */
- delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);
return timekeeping_delta_to_ns(tkr, delta);
}

--
2.34.1


2024-03-25 15:50:52

by Adrian Hunter

Subject: [PATCH V2 05/19] vdso: math64: Provide mul_u64_u32_add_u64_shr()

Provide mul_u64_u32_add_u64_shr(), a calculation that will be used by
timekeeping and the VDSO.

Place #include <vdso/math64.h> after #include <asm/div64.h> to allow
architecture-specific overrides, at least for the kernel.

Signed-off-by: Adrian Hunter <[email protected]>
---
include/linux/math64.h | 2 +-
include/vdso/math64.h | 38 ++++++++++++++++++++++++++++++++++++++
2 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/include/linux/math64.h b/include/linux/math64.h
index fd13622b2056..d34def7f9a8c 100644
--- a/include/linux/math64.h
+++ b/include/linux/math64.h
@@ -4,8 +4,8 @@

#include <linux/types.h>
#include <linux/math.h>
-#include <vdso/math64.h>
#include <asm/div64.h>
+#include <vdso/math64.h>

#if BITS_PER_LONG == 64

diff --git a/include/vdso/math64.h b/include/vdso/math64.h
index 7da703ee5561..22ae212f8b28 100644
--- a/include/vdso/math64.h
+++ b/include/vdso/math64.h
@@ -21,4 +21,42 @@ __iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder)
return ret;
}

+#if defined(CONFIG_ARCH_SUPPORTS_INT128) && defined(__SIZEOF_INT128__)
+
+#ifndef mul_u64_u32_add_u64_shr
+static __always_inline u64 mul_u64_u32_add_u64_shr(u64 a, u32 mul, u64 b, unsigned int shift)
+{
+ return (u64)((((unsigned __int128)a * mul) + b) >> shift);
+}
+#endif /* mul_u64_u32_add_u64_shr */
+
+#else
+
+#ifndef mul_u64_u32_add_u64_shr
+#ifndef mul_u32_u32
+static inline u64 mul_u32_u32(u32 a, u32 b)
+{
+ return (u64)a * b;
+}
+#define mul_u32_u32 mul_u32_u32
+#endif
+static __always_inline u64 mul_u64_u32_add_u64_shr(u64 a, u32 mul, u64 b, unsigned int shift)
+{
+ u32 ah = a >> 32, al = a;
+ bool ovf;
+ u64 ret;
+
+ ovf = __builtin_add_overflow(mul_u32_u32(al, mul), b, &ret);
+ ret >>= shift;
+ if (ovf && shift)
+ ret += 1ULL << (64 - shift);
+ if (ah)
+ ret += mul_u32_u32(ah, mul) << (32 - shift);
+
+ return ret;
+}
+#endif /* mul_u64_u32_add_u64_shr */
+
+#endif
+
#endif /* __VDSO_MATH64_H */
--
2.34.1
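
The 32x32 fallback can be cross-checked against the __int128 variant in
userspace (a quick test harness on a compiler with __int128; it assumes
shift <= 32, which holds for the timekeeping and VDSO users, since the
high-part shift by 32 - shift is only valid in that range):

    #include <assert.h>
    #include <stdint.h>

    /* Userspace copy of the non-__int128 fallback from the patch */
    static uint64_t fallback_mul_u64_u32_add_u64_shr(uint64_t a, uint32_t mul,
                                                     uint64_t b, unsigned int shift)
    {
            uint32_t ah = a >> 32, al = a;      /* al: low 32 bits of a */
            uint64_t ret;
            int ovf = __builtin_add_overflow((uint64_t)al * mul, b, &ret);

            ret >>= shift;
            if (ovf && shift)
                    ret += 1ULL << (64 - shift);    /* carry out of the low part */
            if (ah)
                    ret += ((uint64_t)ah * mul) << (32 - shift);
            return ret;
    }

    static uint64_t ref_mul_u64_u32_add_u64_shr(uint64_t a, uint32_t mul,
                                                uint64_t b, unsigned int shift)
    {
            return (uint64_t)((((unsigned __int128)a * mul) + b) >> shift);
    }

    int main(void)
    {
            /* One ordinary case and one that carries out of the low part */
            assert(fallback_mul_u64_u32_add_u64_shr(0x0123456789abcdefULL, 0x87654321,
                                                    0x0fedcba987654321ULL, 24) ==
                   ref_mul_u64_u32_add_u64_shr(0x0123456789abcdefULL, 0x87654321,
                                               0x0fedcba987654321ULL, 24));
            assert(fallback_mul_u64_u32_add_u64_shr(0xffffffffULL, 0xffffffff,
                                                    UINT64_MAX, 24) ==
                   ref_mul_u64_u32_add_u64_shr(0xffffffffULL, 0xffffffff,
                                               UINT64_MAX, 24));
            return 0;
    }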


2024-03-25 15:53:21

by Adrian Hunter

Subject: [PATCH V2 01/19] vdso: Consolidate vdso_calc_delta()

Consolidate vdso_calc_delta(), in preparation for further simplification.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
---


Changes in V2:
    - Keep powerpc comment about mask
    - Move ifdef out of function


arch/powerpc/include/asm/vdso/gettimeofday.h | 26 +++++++++-----------
arch/s390/include/asm/vdso/gettimeofday.h | 7 ++----
lib/vdso/gettimeofday.c | 9 ++++++-
3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/vdso/gettimeofday.h b/arch/powerpc/include/asm/vdso/gettimeofday.h
index f0a4cf01e85c..ac21a2a0c2f9 100644
--- a/arch/powerpc/include/asm/vdso/gettimeofday.h
+++ b/arch/powerpc/include/asm/vdso/gettimeofday.h
@@ -14,6 +14,17 @@

#define VDSO_HAS_TIME 1

+/*
+ * powerpc specific delta calculation.
+ *
+ * This variant removes the masking of the subtraction because the
+ * clocksource mask of all VDSO capable clocksources on powerpc is U64_MAX
+ * which would result in a pointless operation. The compiler cannot
+ * optimize it away as the mask comes from the vdso data and is not compile
+ * time constant.
+ */
+#define VDSO_DELTA_NOMASK 1
+
static __always_inline int do_syscall_2(const unsigned long _r0, const unsigned long _r3,
const unsigned long _r4)
{
@@ -105,21 +116,6 @@ static inline bool vdso_clocksource_ok(const struct vdso_data *vd)
}
#define vdso_clocksource_ok vdso_clocksource_ok

-/*
- * powerpc specific delta calculation.
- *
- * This variant removes the masking of the subtraction because the
- * clocksource mask of all VDSO capable clocksources on powerpc is U64_MAX
- * which would result in a pointless operation. The compiler cannot
- * optimize it away as the mask comes from the vdso data and is not compile
- * time constant.
- */
-static __always_inline u64 vdso_calc_delta(u64 cycles, u64 last, u64 mask, u32 mult)
-{
- return (cycles - last) * mult;
-}
-#define vdso_calc_delta vdso_calc_delta
-
#ifndef __powerpc64__
static __always_inline u64 vdso_shift_ns(u64 ns, unsigned long shift)
{
diff --git a/arch/s390/include/asm/vdso/gettimeofday.h b/arch/s390/include/asm/vdso/gettimeofday.h
index db84942eb78f..7937765ccfa5 100644
--- a/arch/s390/include/asm/vdso/gettimeofday.h
+++ b/arch/s390/include/asm/vdso/gettimeofday.h
@@ -6,16 +6,13 @@

#define VDSO_HAS_CLOCK_GETRES 1

+#define VDSO_DELTA_NOMASK 1
+
#include <asm/syscall.h>
#include <asm/timex.h>
#include <asm/unistd.h>
#include <linux/compiler.h>

-#define vdso_calc_delta __arch_vdso_calc_delta
-static __always_inline u64 __arch_vdso_calc_delta(u64 cycles, u64 last, u64 mask, u32 mult)
-{
- return (cycles - last) * mult;
-}

static __always_inline const struct vdso_data *__arch_get_vdso_data(void)
{
diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index ce2f69552003..faccf12f7c03 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -6,6 +6,13 @@
#include <vdso/helpers.h>

#ifndef vdso_calc_delta
+
+#ifdef VDSO_DELTA_NOMASK
+# define VDSO_DELTA_MASK(mask) U64_MAX
+#else
+# define VDSO_DELTA_MASK(mask) (mask)
+#endif
+
/*
* Default implementation which works for all sane clocksources. That
* obviously excludes x86/TSC.
@@ -13,7 +20,7 @@
static __always_inline
u64 vdso_calc_delta(u64 cycles, u64 last, u64 mask, u32 mult)
{
- return ((cycles - last) & mask) * mult;
+ return ((cycles - last) & VDSO_DELTA_MASK(mask)) * mult;
}
#endif

--
2.34.1
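
With VDSO_DELTA_NOMASK defined, VDSO_DELTA_MASK(mask) expands to the
compile-time constant U64_MAX, so the AND becomes a no-op the compiler
can elide, giving the same generated code as the removed powerpc and s390
variants. A self-contained illustration of the consolidated helper (the
surrounding kernel plumbing is omitted):

    #include <stdint.h>

    #define VDSO_DELTA_NOMASK 1

    #ifdef VDSO_DELTA_NOMASK
    # define VDSO_DELTA_MASK(mask)  UINT64_MAX
    #else
    # define VDSO_DELTA_MASK(mask)  (mask)
    #endif

    /* Same shape as the default vdso_calc_delta() after this patch */
    static inline uint64_t vdso_calc_delta(uint64_t cycles, uint64_t last,
                                           uint64_t mask, uint32_t mult)
    {
            /* With NOMASK the mask argument is ignored at compile time */
            return ((cycles - last) & VDSO_DELTA_MASK(mask)) * mult;
    }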


2024-03-25 22:23:36

by Arnd Bergmann

Subject: Re: [PATCH V2 00/19] timekeeping: Handle potential multiplication overflow

On Mon, Mar 25, 2024, at 07:40, Adrian Hunter wrote:
>
> Extend the facility also to the VDSO, dependent on the new config option
> GENERIC_VDSO_OVERFLOW_PROTECT, which is selected by x86 only, so other
> architectures are not affected. The result is a calculation with
> performance similar to before. Most machines showed a performance benefit,
> except Skylake-based hardware such as Intel Kaby Lake, which was seen to be
> <1% worse.

I've read through the series, and this pretty much all makes sense,
nice work!

There are a few patches that just rearrange the local variable
declarations to save a few lines, and I don't see those as an
improvement, but they also don't hurt aside from distracting
slightly from the real changes.

Arnd

2024-04-08 13:11:04

by tip-bot2 for Adrian Hunter

Subject: [tip: timers/core] timekeeping: Make delta calculation overflow safe

The following commit has been merged into the timers/core branch of tip:

Commit-ID: fcf190c369149c3b04539797cedf28741eb14164
Gitweb: https://git.kernel.org/tip/fcf190c369149c3b04539797cedf28741eb14164
Author: Adrian Hunter <[email protected]>
AuthorDate: Mon, 25 Mar 2024 08:40:21 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Mon, 08 Apr 2024 15:03:08 +02:00

timekeeping: Make delta calculation overflow safe

Kernel timekeeping is designed to keep the change in cycles (since the last
timer interrupt) below max_cycles, which prevents multiplication overflow
when converting cycles to nanoseconds. However, if timer interrupts stop,
the calculation will eventually overflow.

Add protection against that. In the timekeeping_cycles_to_ns() calculation,
check against max_cycles, falling back to a slower, higher precision
calculation. In timekeeping_forward_now(), process the delta in chunks of
at most max_cycles.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
kernel/time/timekeeping.c | 40 +++++++++++++++++++++++++++-----------
1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index d174840..111dfdb 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -364,19 +364,32 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
}

/* Timekeeper helper functions. */
+static noinline u64 delta_to_ns_safe(const struct tk_read_base *tkr, u64 delta)
+{
+ return mul_u64_u32_add_u64_shr(delta, tkr->mult, tkr->xtime_nsec, tkr->shift);
+}
+
static inline u64 timekeeping_cycles_to_ns(const struct tk_read_base *tkr, u64 cycles)
{
/* Calculate the delta since the last update_wall_time() */
u64 mask = tkr->mask, delta = (cycles - tkr->cycle_last) & mask;

- if (IS_ENABLED(CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE)) {
- /*
- * Handle clocksource inconsistency between CPUs to prevent
- * time from going backwards by checking for the MSB of the
- * mask being set in the delta.
- */
- if (unlikely(delta & ~(mask >> 1)))
- return tkr->xtime_nsec >> tkr->shift;
+ /*
+ * This detects the case where the delta overflows the multiplication
+ * with tkr->mult.
+ */
+ if (unlikely(delta > tkr->clock->max_cycles)) {
+ if (IS_ENABLED(CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE)) {
+ /*
+ * Handle clocksource inconsistency between CPUs to prevent
+ * time from going backwards by checking for the MSB of the
+ * mask being set in the delta.
+ */
+ if (unlikely(delta & ~(mask >> 1)))
+ return tkr->xtime_nsec >> tkr->shift;
+ }
+
+ return delta_to_ns_safe(tkr, delta);
}

return ((delta * tkr->mult) + tkr->xtime_nsec) >> tkr->shift;
@@ -789,10 +802,15 @@ static void timekeeping_forward_now(struct timekeeper *tk)
tk->tkr_mono.cycle_last = cycle_now;
tk->tkr_raw.cycle_last = cycle_now;

- tk->tkr_mono.xtime_nsec += delta * tk->tkr_mono.mult;
- tk->tkr_raw.xtime_nsec += delta * tk->tkr_raw.mult;
+ while (delta > 0) {
+ u64 max = tk->tkr_mono.clock->max_cycles;
+ u64 incr = delta < max ? delta : max;
+
- tk_normalize_xtime(tk);
+ tk->tkr_mono.xtime_nsec += incr * tk->tkr_mono.mult;
+ tk->tkr_raw.xtime_nsec += incr * tk->tkr_raw.mult;
+ tk_normalize_xtime(tk);
+ delta -= incr;
+ }
}

/**

2024-04-08 13:11:28

by tip-bot2 for Adrian Hunter

Subject: [tip: timers/core] timekeeping: Fold in timekeeping_delta_to_ns()

The following commit has been merged into the timers/core branch of tip:

Commit-ID: 3094c6db1cba0bbca6ea19c777762c26fee747d7
Gitweb: https://git.kernel.org/tip/3094c6db1cba0bbca6ea19c777762c26fee747d7
Author: Adrian Hunter <[email protected]>
AuthorDate: Mon, 25 Mar 2024 08:40:19 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Mon, 08 Apr 2024 15:03:08 +02:00

timekeeping: Fold in timekeeping_delta_to_ns()

timekeeping_delta_to_ns() is now called only from
timekeeping_cycles_to_ns(), and it is not useful otherwise.

Simplify the code by folding it into timekeeping_cycles_to_ns().

No functional change.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
kernel/time/timekeeping.c | 13 +------------
1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 1bbfe1f..749387f 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -364,23 +364,12 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
}

/* Timekeeper helper functions. */
-
-static inline u64 timekeeping_delta_to_ns(const struct tk_read_base *tkr, u64 delta)
-{
- u64 nsec;
-
- nsec = delta * tkr->mult + tkr->xtime_nsec;
- nsec >>= tkr->shift;
-
- return nsec;
-}
-
static inline u64 timekeeping_cycles_to_ns(const struct tk_read_base *tkr, u64 cycles)
{
/* Calculate the delta since the last update_wall_time() */
u64 delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);

- return timekeeping_delta_to_ns(tkr, delta);
+ return ((delta * tkr->mult) + tkr->xtime_nsec) >> tkr->shift;
}

static __always_inline u64 __timekeeping_get_ns(const struct tk_read_base *tkr)

2024-04-08 13:12:08

by tip-bot2 for Adrian Hunter

Subject: [tip: timers/core] timekeeping: Reuse timekeeping_cycles_to_ns()

The following commit has been merged into the timers/core branch of tip:

Commit-ID: 670be12ba8f5d20ee2fb0531be6977005cd62401
Gitweb: https://git.kernel.org/tip/670be12ba8f5d20ee2fb0531be6977005cd62401
Author: Adrian Hunter <[email protected]>
AuthorDate: Mon, 25 Mar 2024 08:40:16 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Mon, 08 Apr 2024 15:03:07 +02:00

timekeeping: Reuse timekeeping_cycles_to_ns()

Simplify __timekeeping_get_ns() by reusing timekeeping_cycles_to_ns().

No functional change.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
kernel/time/timekeeping.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index c698219..f81d675 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -391,10 +391,7 @@ static inline u64 timekeeping_cycles_to_ns(const struct tk_read_base *tkr, u64 c

static __always_inline u64 __timekeeping_get_ns(const struct tk_read_base *tkr)
{
- u64 delta, cycles = tk_clock_read(tkr);
-
- delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);
- return timekeeping_delta_to_ns(tkr, delta);
+ return timekeeping_cycles_to_ns(tkr, tk_clock_read(tkr));
}

static inline u64 timekeeping_get_ns(const struct tk_read_base *tkr)

2024-04-08 13:12:21

by tip-bot2 for Adrian Hunter

Subject: [tip: timers/core] timekeeping: Tidy timekeeping_cycles_to_ns() slightly

The following commit has been merged into the timers/core branch of tip:

Commit-ID: 9af4548e828aa2ea66f54433c5747f64124a6240
Gitweb: https://git.kernel.org/tip/9af4548e828aa2ea66f54433c5747f64124a6240
Author: Adrian Hunter <[email protected]>
AuthorDate: Mon, 25 Mar 2024 08:40:15 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Mon, 08 Apr 2024 15:03:07 +02:00

timekeeping: Tidy timekeeping_cycles_to_ns() slightly

Put together declaration and initialization of the local variable 'delta'.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
kernel/time/timekeeping.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 6306133..c698219 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -383,10 +383,9 @@ static inline u64 timekeeping_delta_to_ns(const struct tk_read_base *tkr, u64 de

static inline u64 timekeeping_cycles_to_ns(const struct tk_read_base *tkr, u64 cycles)
{
- u64 delta;
+ /* Calculate the delta since the last update_wall_time() */
+ u64 delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);

- /* calculate the delta since the last update_wall_time */
- delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);
return timekeeping_delta_to_ns(tkr, delta);
}


2024-04-08 13:13:37

by tip-bot2 for Adrian Hunter

Subject: [tip: timers/core] x86/vdso: Make delta calculation overflow safe

The following commit has been merged into the timers/core branch of tip:

Commit-ID: 7e90ffb716d289b3b82fb41892bb52a11bdadfd9
Gitweb: https://git.kernel.org/tip/7e90ffb716d289b3b82fb41892bb52a11bdadfd9
Author: Adrian Hunter <[email protected]>
AuthorDate: Mon, 25 Mar 2024 08:40:12 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Mon, 08 Apr 2024 15:03:07 +02:00

x86/vdso: Make delta calculation overflow safe

Kernel timekeeping is designed to keep the change in cycles (since the last
timer interrupt) below max_cycles, which prevents multiplication overflow
when converting cycles to nanoseconds. However, if timer interrupts stop,
the calculation will eventually overflow.

Add protection against that. Select GENERIC_VDSO_OVERFLOW_PROTECT so that
max_cycles is made available in the VDSO data page. Check against
max_cycles, falling back to a slower, higher precision calculation. Take
advantage of the opportunity to move the masking and negative motion check
into the slow path.

The result is a calculation with performance similar to before. Newer
machines showed a performance benefit, whereas older Skylake-based hardware
such as Intel Kaby Lake was seen to be <1% worse.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
arch/x86/Kconfig | 1 +-
arch/x86/include/asm/vdso/gettimeofday.h | 31 ++++++++++++++++-------
2 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4fff6ed..4e251ba 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -168,6 +168,7 @@ config X86
select GENERIC_TIME_VSYSCALL
select GENERIC_GETTIMEOFDAY
select GENERIC_VDSO_TIME_NS
+ select GENERIC_VDSO_OVERFLOW_PROTECT
select GUP_GET_PXX_LOW_HIGH if X86_PAE
select HARDIRQS_SW_RESEND
select HARDLOCKUP_CHECK_TIMESTAMP if X86_64
diff --git a/arch/x86/include/asm/vdso/gettimeofday.h b/arch/x86/include/asm/vdso/gettimeofday.h
index 5727ded..0ef3619 100644
--- a/arch/x86/include/asm/vdso/gettimeofday.h
+++ b/arch/x86/include/asm/vdso/gettimeofday.h
@@ -319,18 +319,31 @@ static inline bool arch_vdso_cycles_ok(u64 cycles)
*/
static __always_inline u64 vdso_calc_ns(const struct vdso_data *vd, u64 cycles, u64 base)
{
- /*
- * Due to the MSB/Sign-bit being used as invalid marker (see
- * arch_vdso_cycles_valid() above), the effective mask is S64_MAX.
- */
- u64 delta = (cycles - vd->cycle_last) & S64_MAX;
+ u64 delta = cycles - vd->cycle_last;

/*
- * Due to the above mentioned TSC wobbles, filter out negative motion.
- * Per the above masking, the effective sign bit is now bit 62.
+ * Negative motion and deltas which can cause multiplication
+ * overflow require special treatment. This check covers both as
+ * negative motion is guaranteed to be greater than @vd::max_cycles
+ * due to unsigned comparison.
+ *
+ * Due to the MSB/Sign-bit being used as invalid marker (see
+ * arch_vdso_cycles_valid() above), the effective mask is S64_MAX,
+ * but that case is also unlikely and will also take the unlikely path
+ * here.
*/
- if (unlikely(delta & (1ULL << 62)))
- return base >> vd->shift;
+ if (unlikely(delta > vd->max_cycles)) {
+ /*
+ * Due to the above mentioned TSC wobbles, filter out
+ * negative motion. Per the above masking, the effective
+ * sign bit is now bit 62.
+ */
+ if (delta & (1ULL << 62))
+ return base >> vd->shift;
+
+ /* Handle multiplication overflow gracefully */
+ return mul_u64_u32_add_u64_shr(delta & S64_MAX, vd->mult, base, vd->shift);
+ }

return ((delta * vd->mult) + base) >> vd->shift;
}

2024-04-08 13:13:50

by tip-bot2 for Adrian Hunter

Subject: [tip: timers/core] vdso, math64: Provide mul_u64_u32_add_u64_shr()

The following commit has been merged into the timers/core branch of tip:

Commit-ID: 1beb35ec615f676d49d68b6dc23c7418ba8ff145
Gitweb: https://git.kernel.org/tip/1beb35ec615f676d49d68b6dc23c7418ba8ff145
Author: Adrian Hunter <[email protected]>
AuthorDate: Mon, 25 Mar 2024 08:40:09 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Mon, 08 Apr 2024 15:03:07 +02:00

vdso, math64: Provide mul_u64_u32_add_u64_shr()

Provide mul_u64_u32_add_u64_shr(), a calculation that will be used by
timekeeping and the VDSO.

Place #include <vdso/math64.h> after #include <asm/div64.h> to allow
architecture-specific overrides, at least for the kernel.

Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
include/linux/math64.h | 2 +-
include/vdso/math64.h | 38 ++++++++++++++++++++++++++++++++++++++
2 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/include/linux/math64.h b/include/linux/math64.h
index fd13622..d34def7 100644
--- a/include/linux/math64.h
+++ b/include/linux/math64.h
@@ -4,8 +4,8 @@

#include <linux/types.h>
#include <linux/math.h>
-#include <vdso/math64.h>
#include <asm/div64.h>
+#include <vdso/math64.h>

#if BITS_PER_LONG == 64

diff --git a/include/vdso/math64.h b/include/vdso/math64.h
index 7da703e..22ae212 100644
--- a/include/vdso/math64.h
+++ b/include/vdso/math64.h
@@ -21,4 +21,42 @@ __iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder)
return ret;
}

+#if defined(CONFIG_ARCH_SUPPORTS_INT128) && defined(__SIZEOF_INT128__)
+
+#ifndef mul_u64_u32_add_u64_shr
+static __always_inline u64 mul_u64_u32_add_u64_shr(u64 a, u32 mul, u64 b, unsigned int shift)
+{
+ return (u64)((((unsigned __int128)a * mul) + b) >> shift);
+}
+#endif /* mul_u64_u32_add_u64_shr */
+
+#else
+
+#ifndef mul_u64_u32_add_u64_shr
+#ifndef mul_u32_u32
+static inline u64 mul_u32_u32(u32 a, u32 b)
+{
+ return (u64)a * b;
+}
+#define mul_u32_u32 mul_u32_u32
+#endif
+static __always_inline u64 mul_u64_u32_add_u64_shr(u64 a, u32 mul, u64 b, unsigned int shift)
+{
+ u32 ah = a >> 32, al = a;
+ bool ovf;
+ u64 ret;
+
+ ovf = __builtin_add_overflow(mul_u32_u32(al, mul), b, &ret);
+ ret >>= shift;
+ if (ovf && shift)
+ ret += 1ULL << (64 - shift);
+ if (ah)
+ ret += mul_u32_u32(ah, mul) << (32 - shift);
+
+ return ret;
+}
+#endif /* mul_u64_u32_add_u64_shr */
+
+#endif
+
#endif /* __VDSO_MATH64_H */

2024-04-08 13:16:03

by tip-bot2 for Adrian Hunter

Subject: [tip: timers/core] vdso: Consolidate vdso_calc_delta()

The following commit has been merged into the timers/core branch of tip:

Commit-ID: c8e3a8b6f2e62661d838ae222774121ae23777a4
Gitweb: https://git.kernel.org/tip/c8e3a8b6f2e62661d838ae222774121ae23777a4
Author: Adrian Hunter <[email protected]>
AuthorDate: Mon, 25 Mar 2024 08:40:05 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Mon, 08 Apr 2024 15:03:06 +02:00

vdso: Consolidate vdso_calc_delta()

Consolidate vdso_calc_delta(), in preparation for further simplification.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
arch/powerpc/include/asm/vdso/gettimeofday.h | 26 ++++++++-----------
arch/s390/include/asm/vdso/gettimeofday.h | 7 +----
lib/vdso/gettimeofday.c | 9 ++++++-
3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/vdso/gettimeofday.h b/arch/powerpc/include/asm/vdso/gettimeofday.h
index 78302f6..c639089 100644
--- a/arch/powerpc/include/asm/vdso/gettimeofday.h
+++ b/arch/powerpc/include/asm/vdso/gettimeofday.h
@@ -13,6 +13,17 @@

#define VDSO_HAS_TIME 1

+/*
+ * powerpc specific delta calculation.
+ *
+ * This variant removes the masking of the subtraction because the
+ * clocksource mask of all VDSO capable clocksources on powerpc is U64_MAX
+ * which would result in a pointless operation. The compiler cannot
+ * optimize it away as the mask comes from the vdso data and is not compile
+ * time constant.
+ */
+#define VDSO_DELTA_NOMASK 1
+
static __always_inline int do_syscall_2(const unsigned long _r0, const unsigned long _r3,
const unsigned long _r4)
{
@@ -104,21 +115,6 @@ static inline bool vdso_clocksource_ok(const struct vdso_data *vd)
}
#define vdso_clocksource_ok vdso_clocksource_ok

-/*
- * powerpc specific delta calculation.
- *
- * This variant removes the masking of the subtraction because the
- * clocksource mask of all VDSO capable clocksources on powerpc is U64_MAX
- * which would result in a pointless operation. The compiler cannot
- * optimize it away as the mask comes from the vdso data and is not compile
- * time constant.
- */
-static __always_inline u64 vdso_calc_delta(u64 cycles, u64 last, u64 mask, u32 mult)
-{
- return (cycles - last) * mult;
-}
-#define vdso_calc_delta vdso_calc_delta
-
#ifndef __powerpc64__
static __always_inline u64 vdso_shift_ns(u64 ns, unsigned long shift)
{
diff --git a/arch/s390/include/asm/vdso/gettimeofday.h b/arch/s390/include/asm/vdso/gettimeofday.h
index db84942..7937765 100644
--- a/arch/s390/include/asm/vdso/gettimeofday.h
+++ b/arch/s390/include/asm/vdso/gettimeofday.h
@@ -6,16 +6,13 @@

#define VDSO_HAS_CLOCK_GETRES 1

+#define VDSO_DELTA_NOMASK 1
+
#include <asm/syscall.h>
#include <asm/timex.h>
#include <asm/unistd.h>
#include <linux/compiler.h>

-#define vdso_calc_delta __arch_vdso_calc_delta
-static __always_inline u64 __arch_vdso_calc_delta(u64 cycles, u64 last, u64 mask, u32 mult)
-{
- return (cycles - last) * mult;
-}

static __always_inline const struct vdso_data *__arch_get_vdso_data(void)
{
diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index ce2f695..faccf12 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -6,6 +6,13 @@
#include <vdso/helpers.h>

#ifndef vdso_calc_delta
+
+#ifdef VDSO_DELTA_NOMASK
+# define VDSO_DELTA_MASK(mask) U64_MAX
+#else
+# define VDSO_DELTA_MASK(mask) (mask)
+#endif
+
/*
* Default implementation which works for all sane clocksources. That
* obviously excludes x86/TSC.
@@ -13,7 +20,7 @@
static __always_inline
u64 vdso_calc_delta(u64 cycles, u64 last, u64 mask, u32 mult)
{
- return ((cycles - last) & mask) * mult;
+ return ((cycles - last) & VDSO_DELTA_MASK(mask)) * mult;
}
#endif