Received: by 2002:ac0:a581:0:0:0:0:0 with SMTP id m1-v6csp4570763imm; Mon, 25 Jun 2018 19:09:07 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKHQizTPbdwSH2DjssNwy6xqtCvC6VeolbIZCorKreqsG1gSuPPIfjFb4Mq7BkHuPP+2JUt X-Received: by 2002:a63:9543:: with SMTP id t3-v6mr12404345pgn.77.1529978947104; Mon, 25 Jun 2018 19:09:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529978947; cv=none; d=google.com; s=arc-20160816; b=Z9MqnA1JZ8gY26i4l2CbRlNwIpVZkuOhbviRxrUclMmdEwSMlrQ3ZqU/TU0E1yRR4U dqdEi9D5y33D1RIVbE9FpAC6EAtvoJgTgxWDeCTKpF//3dCn1uuY2hId4fbAZAKs3WK6 Z0oqExv0c8NlxF/F0PMkzdI9fi+RmwBk5rv8EgD/SpqScDmN9VL3EuV1k/8c6ALsitCJ 4hS5cPRW6sMlVJA1d60RPJ67f+cDQUWAHhmkShl9QHjpw6ZK/LNFJZnLKJzagmHHhIox PWVSBX+Ycxzt83VIkCnqMR5OQVbQ5qvLiJWVqnIQvzMMiN4KajiNuQ0jaLcwOy6dqdhg F/ug== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=cwwVmVnKR3DEkZM9t1ier25qtIX11hnesWtLLjOM3Ps=; b=CU1hELvDQQRDuaoEeJjNXtahsz+jCHeWKLeDQ6uehxyQBgqRg9IGcbmMWH9vwSN3qE Bl3cO3QMjWFvWygnQmpr6Jo8o7oEezhH1GREykQ5zLXXVmzdjE8JqW1e3PCGbjQmQyAw hcO5zZ6WcVxJ/I6XyHv8VeAdaKC722AijYieIJWVn2HcI9C5cCugNxA/KKVRD7cXXP8j gJnnZSjr1qBb3TpAeN9HPo2PcAFIlcdD879xN6tlCE4BuuusTtL6jgLlWG7zINKGuZ3s GVfsYy1u2PjXfMLK49OsYMg1TED/lWgjFjvjvE4/kI7/fkViSlZ3ccgibifENleC1Xjk eIaQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=UDxCrb1t; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id r1-v6si456999plb.172.2018.06.25.19.08.52; Mon, 25 Jun 2018 19:09:07 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=UDxCrb1t; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965208AbeFZCIN (ORCPT + 99 others); Mon, 25 Jun 2018 22:08:13 -0400 Received: from mail-ot0-f194.google.com ([74.125.82.194]:40042 "EHLO mail-ot0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S965188AbeFZCIL (ORCPT ); Mon, 25 Jun 2018 22:08:11 -0400 Received: by mail-ot0-f194.google.com with SMTP id f17-v6so727071otl.7 for ; Mon, 25 Jun 2018 19:08:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=cwwVmVnKR3DEkZM9t1ier25qtIX11hnesWtLLjOM3Ps=; b=UDxCrb1tzQuwiJJbT+kRXn1xoe2SpvBblgTDDVxm+51ZqbMCT6UoxUtLWfSusqSAqv IaCVZiQD/H9LJt6xelcIDlYp8AbNfQZtmBLQh4B/DLYpvRbi8MrjRrKr5eL8FSTPnByF qmPqk74ed6kzK4Ef2pxBxcylcwjaF4K6lb0j8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=cwwVmVnKR3DEkZM9t1ier25qtIX11hnesWtLLjOM3Ps=; b=KtG4SR7Ve2QiwwYOAY5k3YJF/K6SgFY90PdVr1qvFl6MhhE7QQ+nIjwoSRvImGqE+1 R+EZTh2azuVh3dHgE9/Nep1V+LTtv638RMus4r43mrJubhH0nuJY0ld5WdYdnRi3M05c Mn9+JU5JF8uAofCLdSsbzKSeIKxw1rOVatdNstjYBLaYtMNp21O93B2croF9g9oTWeTd IYmTYvylOXm/v4wrH1ZCx4p5lKgNNigN9p+XtER6dgXsiSfSIMC4qAdI2CrfT8etuye6 0O3WHOXq8iijnk+clbxUSW4YXHGkRrukPvcLo1qYDbbD3fnRTjp1GPWZI8esbn/Ab0dU 9SeQ== X-Gm-Message-State: APt69E3HeY62l87qTxxpq//gvgnZtqKByFQWFuFNN0ooCcKfaPXdfs21 Nkp+k41oY3chf5x2gCg3+UZfqfjDmYB0MQVGx+qoOQ== X-Received: by 2002:a9d:4c0f:: with SMTP id l15-v6mr9543599otf.26.1529978890728; Mon, 25 Jun 2018 19:08:10 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a9d:237a:0:0:0:0:0 with HTTP; Mon, 25 Jun 2018 19:08:10 -0700 (PDT) In-Reply-To: References: From: Baolin Wang Date: Tue, 26 Jun 2018 10:08:10 +0800 Message-ID: Subject: Re: [PATCH 1/8] time: Add persistent clock support To: Thomas Gleixner Cc: John Stultz , Daniel Lezcano , Arnd Bergmann , Tony Lindgren , aaro.koskinen@iki.fi, linux@armlinux.org.uk, Mark Rutland , Marc Zyngier , Mark Brown , Paul McKenney , mlichvar@redhat.com, Randy Dunlap , Kate Stewart , Greg KH , Philippe Ombredanne , Thierry Reding , Jon Hunter , =?UTF-8?Q?Heiko_St=C3=BCbner?= , Linus Walleij , Viresh Kumar , Ingo Molnar , "H. Peter Anvin" , peterz@infradead.org, douly.fnst@cn.fujitsu.com, len.brown@intel.com, rajvi.jingar@intel.com, Alexandre Belloni , x86@kernel.org, Linux ARM , linux-tegra@vger.kernel.org, LKML , linux-omap@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Thomas, On 24 June 2018 at 08:14, Thomas Gleixner wrote: > On Wed, 13 Jun 2018, Baolin Wang wrote: >> Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP >> to be one persistent clock, then we can simplify the suspend/resume >> accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that >> we can only compensate the OS time by persistent clock or RTC. > > That makes sense because it adds a gazillion lines of code and removes 5? > Not really, > >> +/** >> + * persistent_clock_read_data - data required to read persistent clock >> + * @read: Returns a cycle value from persistent clock. >> + * @last_cycles: Clock cycle value at last update. >> + * @last_ns: Time value (nanoseconds) at last update. >> + * @mask: Bitmask for two's complement subtraction of non 64bit clocks. >> + * @mult: Cycle to nanosecond multiplier. >> + * @shift: Cycle to nanosecond divisor. >> + */ >> +struct persistent_clock_read_data { >> + u64 (*read)(void); >> + u64 last_cycles; >> + u64 last_ns; >> + u64 mask; >> + u32 mult; >> + u32 shift; >> +}; >> +/** >> + * persistent_clock - represent the persistent clock >> + * @read_data: Data required to read from persistent clock. >> + * @seq: Sequence counter for protecting updates. >> + * @freq: The frequency of the persistent clock. >> + * @wrap: Duration for persistent clock can run before wrapping. >> + * @alarm: Update timeout for persistent clock wrap. >> + * @alarm_inited: Indicate if the alarm has been initialized. >> + */ >> +struct persistent_clock { >> + struct persistent_clock_read_data read_data; >> + seqcount_t seq; >> + u32 freq; >> + ktime_t wrap; >> + struct alarm alarm; >> + bool alarm_inited; >> +}; > > NAK! > > There is no reason to invent yet another set of data structures and yet > more read functions with a sequence counter. which are just a bad and > broken copy of the existing timekeeping/clocksource code. And of course the > stuff is not serialized against multiple registrations, etc. etc. > > Plus the utter nonsense that any call site has to do the same thing over > and over: > > register(): > start_alarm_timer(); > > Why is this required in the first place? It's not at all. The only place > where such an alarm timer will be required is when the system actually goes > to suspend. Starting it at registration time is pointless and even counter > productive. Assume the clocksource wraps every 2 hours. So you start it at > boot time and after 119 minutes uptime the system suspends. So it will > wakeup one minute later to update the clocksource. Heck no. If the timer is > started when the machine actually suspends it will wake up earliest in 120 > minutes. > > And you even add that to the TSC which does not need it at all. It will > wrap in about 400 years on a 2GHZ machine. So you degrade the functionality > instead of improving it. > > So no, this is not going anywhere. > > Let's look at the problem itself: > > You want to use one clocksource for timekeeping during runtime which is > fast and accurate and another one for suspend time injection which is > slower and/or less accurate because the fast one stops in suspend. > > Plus you need an alarmtimer which makes sure that the clocksource does > not wrap around during suspend. > > Now lets look what we have already: > > Both clocksources already exist and are registered as clocksources with > all required data in the clocksource core. > > Ergo the only sane and logical conclusion is to expand the existing > infrastructure to handle that. > > When a clocksource is registered, then the registration function already > makes decisions about using it as timekeeping clocksource. So add a few > lines of code to check whether the newly registered clocksource is suitable > and preferred for suspend. > > if (!stops_in_suspend(newcs)) { > if (!suspend_cs || is_preferred_suspend_cs(newcs)) > suspend_cs = newcs; > } > > The is_preferred_suspend_cs() can be based on rating, the maximum suspend > length which can be achieved or whatever is sensible. It should start of as > a very simple decision function based on rating and not an prematurely > overengineered monstrosity. > > The suspend/resume() code needs a few very simple changes: > > xxx_suspend(): > clocksource_prepare_suspend(); > > Note, this is _NOT_ timekeeping_suspend() because that is invoked _AFTER_ > alarmtimer_suspend(). So if an alarm timer is required it needs to be > armed before that. A trivial solution might be to just call it from > alarmtimer_suspend(), but that a minor detail to worry about. > > timekeeping_suspend() > { > clocksource_enter_suspend(); > ... > > timekeeping_resume() > { > ... > if (clocksource_leave_suspend(&nsec)) { > ts_delta = ns_to_timespec64(nsec); > sleeptime_injected = true; > } else if (...... > > > Now lets look at the new functions: > > void clocksource_prepare_suspend(void) > { > if (!suspend_cs) > return; > > if (needs_alarmtimer(suspend_cs)) > start_suspend_alarm(suspend_cs); > } > > void clocksource_enter_suspend(void) > { > if (!suspend_cs) > return; > suspend_start = suspend_cs->read(); > } > > bool clocksource_leave_suspend(u64 *nsec) > { > u64 now, delta; > > if (!suspend_cs) > return false; > > now = suspend_cs->read(); > delta = clocksource_delta(now, suspend_start, suspend_cs->mask); > *nsec = mul_u64_u32_shr(delta, suspend_cs->mult, suspend_cs->shift); > return true; > } > > See? It does not need any of this totally nonsensical stuff in your > registration function and not any new read functions and whatever, because > it simply can use the bog standard mult/shift values. Why? Because the > conversion above can cope with a full 64 bit * 32 bit multiply without > falling apart. It's already there in timekeeping_resume() otherwise > resuming with a NONSTOP TSC would result in bogus sleep times after a few > minutes. It's slower than the normal clocksource conversion which is > optimized for performance, but thats completely irrelevant on resume. This > whole blurb about requiring separate mult/shift values is just plain > drivel. > > Plus any reasonably broad clocksource will not need an alarmtimer at > all. Because the only reason it is needed is when the clocksource itself > wraps around. And that has absolutely nothing to do with mult/shift > values. That just depends on the frequency and the bitwidth of the counter, > > So it does not need an update function either because in case of broad > enough clocksources there is absolutely no need for update and in case of > wrapping ones the alarmtimer brings it out of suspend on time. And because > the only interesting thing is the delta between suspend and resume this is > all a complete non issue. > > The clocksource core already has all the registration/unregistration > functionality plus an interface to reconfigure the frequency, so > clocksources can come and go and be reconfigured and all of this just > works. > > Once the extra few lines of code are in place, then you can go and cleanup > the existing mess of homebrewn interfaces and claim that this is > consolidation and simplification. I am very grateful for your suggestion, and I am sorry I made a stupid mistake that I missed the mul_u64_u32_shr() function which can avoid introducing extra mult/shift for suspend clocksource. I will use that and follow your other suggestions too. Thanks again. > > > > What's wrong with you people? Didn't they teach you in school that the > first thing to do is proper problem and code analysis? If they did not, go > back to them and ask your money back, > > I'm really tired of these overengineered trainwrecks which are then > advertised with bullshit marketing like the best invention since sliced > bread. This might work in random corporates, but not here. > > > > Thanks, > > tglx --- Baolin Wang Best Regards