Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935125AbcKWKUz (ORCPT ); Wed, 23 Nov 2016 05:20:55 -0500 Received: from Galois.linutronix.de ([146.0.238.70]:33847 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S935010AbcKWKTe (ORCPT ); Wed, 23 Nov 2016 05:19:34 -0500 Date: Wed, 23 Nov 2016 09:59:32 +0100 (CET) From: Thomas Gleixner To: Joel Fernandes cc: LKML , Steven Rostedt , John Stultz , Ingo Molnar Subject: Re: [PATCH v2 1/2] timekeeping: Introduce a fast boot clock derived from fast monotonic clock In-Reply-To: Message-ID: References: <1479766370-31311-1-git-send-email-joelaf@google.com> User-Agent: Alpine 2.20 (DEB 67 2015-01-07) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2193 Lines: 57 On Tue, 22 Nov 2016, Joel Fernandes wrote: > On Tue, Nov 22, 2016 at 2:29 PM, Thomas Gleixner wrote: > > What's wrong with adding a tracepoint into the boot offset update function > > and let perf or the tracer inject the value of the boot offset into the > > trace data when starting. The time adjustment can be done in > > postprocessing. > > I agree we're bloating this and probably not very acceptable. > tracepoint adds additional complexity and out of tree patches which Why are tracepoints a problem and adding complexity? Also why would they cause out of tree patches? > we'd like to avoid. Would you be Ok if we added a relatively simple > function like below that could do the job and not bloat the optimal > case? > > /* > * Fast and NMI safe access to boot time. It may be slightly out of date > * as we're accessing offset without seqcount writes, but is safe to access. > */ > u64 ktime_get_boot_fast_ns(void) > { > struct timekeeper *tk = &tk_core.timekeeper; > return __ktime_get_fast_ns(&tk_fast_mono) + tk->offs_boot; This needs a big fat comment and documentation of the possible side effects of this: 1) The timestamp might be off due: CPU 0 CPU 1 timekeeping_inject_sleeptime64() __timekeeping_inject_sleeptime(tk, delta); timestamp(); timekeeping_update(tk, TK_CLEAR_NTP...); 2) On 32bit machines the timestamp might see a half updated boot offset value. On 64bit machines it either sees the old or the new value. I don't think this is a big issue, because the event of updating boot offset is really, really rare, so postprocessing should be able to handle this. But at least it must be documented so people wont get surprised. > > That should be sufficient for tracing suspend/resume behaviour. If there is > > not a really convincing reason for having that as a real trace clock then I > > prefer to avoid that extra stuff. > > Several clocks are accessible just by simple writing of clock name to > trace_clock in debugfs. This is really cool and doesn't require any > out of tree patches or post processing complexity. I know that it's nice to have :) Thanks, tglx