Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933936AbcKIQBW (ORCPT ); Wed, 9 Nov 2016 11:01:22 -0500 Received: from mout.kundenserver.de ([212.227.126.131]:53110 "EHLO mout.kundenserver.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932762AbcKIQBV (ORCPT ); Wed, 9 Nov 2016 11:01:21 -0500 From: Arnd Bergmann To: "Dilger, Andreas" Cc: James Simmons , Greg Kroah-Hartman , "devel@driverdev.osuosl.org" , "Drokin, Oleg" , Linux Kernel Mailing List , Lustre Development List Subject: Re: [lustre-devel] [PATCH] staging: lustre: ldlm: pl_recalc time handling is wrong Date: Wed, 09 Nov 2016 17:00:42 +0100 Message-ID: <1594836.LAbS3nYoXQ@wuerfel> User-Agent: KMail/5.1.3 (Linux/4.4.0-34-generic; KDE/5.18.0; x86_64; ; ) In-Reply-To: <4C1355CC-5CF8-4AB3-95F1-B356EC357D00@intel.com> References: <1478573240-12850-1-git-send-email-jsimmons@infradead.org> <4C1355CC-5CF8-4AB3-95F1-B356EC357D00@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Provags-ID: V03:K0:FhYWZM+3Aptq/dSm7vE58cMCoVmUAuHCTpcbwr0hqW1fXwLLBi0 Bk/MvQ/5tF90tm8km/apylPYWpAhJUYVgIEhFfkqQN9DQSJnFJXkKnohfkZAxyx5uNdMza/ Ig4ZX0wePWoubLO82dAU7W5OnLmvH5rX0trbRTT27h0Z14wrBDHgxH3Cqwi20LWYz9l2S3Y yt3/k8hyug19zKVQ96XZA== X-UI-Out-Filterresults: notjunk:1;V01:K0:pzv6ux6/MSs=:2cwQJv4rcJb1T8DoifdP1b EJvwMvbvNe7T6SXp5I2nwRaJz7vffrpezmJlJmEJSPGTu8gzPdaVoUHDatlXk6tDEpIAPaSQr ViYucldJIlQmPHo8fp7kvOvnft1mXl8L4Kc16YsPSoMz7G0Waimzp9BNHJUY592VgU1pGDly+ e67bA5GVv9062VCLpxLPqcfx1wvfh7InpWzRHbhryWVq3dM1Lqc8kMCk3KkmyYz9c2DfNQTnc E+VEiB/4mMnTV9YKhDjm3ylbiyCKtbCjGZAsxTRHl4vMTLqsEQNe5qWl+fYyhtP5DdioLFlle ymynfabOUr0Lm1iQsd+BTWIWoWgzAWKI6Fs2QRgDVGHFbT5XESxQ4C8NEXr0ETMnU+SGVPeMN ClnBHpXsP0YZhy6Aun7R+JfIFAJMNEsVpy+maz8qIAR5rWLnw2fN4d/38xAlYpVB9eyCMnp1U 70RYAwAHJvZyJ3ymwVXT6ja/hJQKbC01Zh3EJwEuH4mtBe3fao1rlY9eCC8N57udpP5Xwy/GK VU9HZCdair5JL42dEMg1st9pUx7urygzZTzjC+lMPGRjodXWdpTIvL2cWqohkrpdUJWamn/uu 6il/P54M9a8EcX3A/5x2Ns/u1GLuYbg9A/JGpHPswvB10YEfutIj5i0g08WQRcUB41Pqos7wW SOQBRyYu4grgur6n4abbEdQP/RKXsUSlH/Pek84MiJ0I/7MGNDh+d6X6vTZfxj4Vledy3aKyY 3tFohWSTZjbmXNtj Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2670 Lines: 54 On Wednesday, November 9, 2016 3:50:29 AM CET Dilger, Andreas wrote: > On Nov 7, 2016, at 19:47, James Simmons wrote: > > > > The ldlm_pool field pl_recalc_time is set to the current > > monotonic clock value but the interval period is calculated > > with the wall clock. This means the interval period will > > always be far larger than the pl_recalc_period, which is > > just a small interval time period. The correct thing to > > do is to use monotomic clock current value instead of the > > wall clocks value when calculating recalc_interval_sec. > > It looks like this was introduced by commit 8f83409cf > "staging/lustre: use 64-bit time for pl_recalc" but that patch changed > get_seconds() to a mix of ktime_get_seconds() and ktime_get_real_seconds() > for an unknown reason. It doesn't appear that there is any difference > in overhead between the two (on 64-bit at least). It was meant to use ktime_get_real_seconds() consistently, very sorry about the mistake. I don't remember exactly how we got there, I assume I had started out using ktime_get_seconds() and then moved to ktime_get_real_seconds() later but missed the last three instances. > Since the ldlm pool recalculation interval is actually driven in response to > load on the server, it makes sense to use the "real" time instead of the > monotonic time (if I understand correctly) if the client is in a VM that > may periodically be blocked and "miss time" compared to the outside world. > Using the "real" clock, the recalc_interval_sec will correctly reflect the > actual elapsed time rather than just the number of ticks inside the VM. > > Is my understanding of these different clocks correct? No, this is not the difference: monotonic and real time always tick at exactly the same rate, the only difference is the starting point. monotonic time starts at boot and can not be adjusted, while real time is set to reflect the UTC time base and gets initialized from the real time clock at boot, or using settimeofday(2) or NTP later on. In my changelog text, I wrote keeping the 'real' instead of 'monotonic' time because of the debug prints. the intention here is simply to have the console log keep the same numbers as "date +%s" for absolute values. The patch that James suggested converting everything to ktime_get_seconds() would result in the same intervals that have always been there (until I broke it by using time domains inconsistently), but would mean we could use a u32 type for pl_recalc_time and pl_recalc_period because that doesn't overflow until 136 years after booting. (signed time_t overflows 68 years after 1970, i.e 2038). Arnd