Message-ID: <54E740C4.1060205@redhat.com>
Date: Fri, 20 Feb 2015 09:12:20 -0500
From: Prarit Bhargava
To: John Stultz
CC: lkml, Thomas Gleixner, Miroslav Lichvar, Peter Zijlstra
Subject: Re: [PATCH] time, ntp: Do not update time_state in middle of leap second [v3]
References: <1423749499-18520-1-git-send-email-prarit@redhat.com>

On 02/17/2015 06:16 PM, John Stultz wrote:
> On Thu, Feb 12, 2015 at 5:58 AM, Prarit Bhargava wrote:
>>
>> which was intended to mimic the insertion of a leap second.  A
>> successful run of the test would result in the time_state transitioning
>> from TIME_OK to TIME_INS, then to TIME_OOP when the leap second was
>> inserted, and then to TIME_WAIT when the leap second was completed.  While
>> running this code failures were seen in which the time_state remained TIME_INS,
>> even though the leap second had occurred.
>>
>
>
> Ok, thanks for the more verbose explanation. Although this is more a
> history of what you've seen rather than the crux of the change.
>
> To distill this down just a bit, the point is the usual mode for NTP
> time_state machine looks like:
>
> TIME_OK -> TIME_INS -> TIME_OOP
>    |                      |
>    v                      v
> TIME_DEL ------------> TIME_WAIT -(back)-> TIME_OK
>
> (hopefully the ascii art survives here)
>
> Now, from any of these states, currently if adjtimex is called w/ the
> STA_PLL bit cleared (after STA_PLL was set), we reset back to TIME_OK,
> effectively cancelling any transitions. (You'll have to imagine a line
> from any of the states back to TIME_OK, since that's going to be too
> ugly to do in ascii)
>
> Your patch is trying to remove the line back from TIME_OOP back to
> TIME_OK. Basically stopping the ability to reset the ntp state during
> a leapsecond.

Correct.

>
> I do get that the behavior seen was strange due to a bug in the test
> code which caused unexpected cancellation of state, but I'm not sure
> if we should change the behavior to enforce that cancellation not be
> possible. I could imagine some logic which really wants to reset the
> state, which just by chance lands during a leap second, and the
> application is confused since the state change didn't occur as
> expected.

I think resetting the state in the middle of the leap second should be a NOOP.
We all know how fragile this code has been in the past, and allowing a state
transition at that particular time isn't a good idea, given that the outcome
may be a state that remains TIME_INS.

>
> So I guess I'm not seeing that the state machine is actually "broken"
> in this case that you've outlined. If you can articulate better why
> the OOP -> OK transition is truly invalid, I'd be interested in
> hearing, but I'm not sure I want to risk a behavioral change unless
> there's wide agreement.

I understand. After thinking about it from your point of view, I agree that
calling it "broken" is not right.  Perhaps a better way of looking at it is to
ask, as you also point out, whether OOP -> OK is truly valid.

P.
>
> thanks
> -john
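
For readers following the thread, the state machine and the two reset
policies under discussion can be sketched as a small toy model. This is only
an illustration built from the diagram and description quoted above, not the
kernel's implementation: the names advance, clear_pll, block_during_oop and
NEXT_STATE are hypothetical, and the model ignores everything else the kernel
tracks (status bits, timestamps, locking).

```python
from enum import Enum


class TimeState(Enum):
    TIME_OK = "TIME_OK"
    TIME_INS = "TIME_INS"
    TIME_DEL = "TIME_DEL"
    TIME_OOP = "TIME_OOP"
    TIME_WAIT = "TIME_WAIT"


# Forward transitions, copied from the ASCII diagram in the quoted message.
NEXT_STATE = {
    TimeState.TIME_INS: TimeState.TIME_OOP,
    TimeState.TIME_DEL: TimeState.TIME_WAIT,
    TimeState.TIME_OOP: TimeState.TIME_WAIT,
    TimeState.TIME_WAIT: TimeState.TIME_OK,
}


def advance(state, leap=None):
    """Take one forward step; from TIME_OK, `leap` selects "ins" or "del"."""
    if state is TimeState.TIME_OK:
        if leap == "ins":
            return TimeState.TIME_INS
        if leap == "del":
            return TimeState.TIME_DEL
        return state  # no leap second pending, stay in TIME_OK
    return NEXT_STATE[state]


def clear_pll(state, block_during_oop):
    """Model adjtimex being called with STA_PLL cleared after it was set.

    block_during_oop=False: reset to TIME_OK from any state (the behavior
    described as current in the thread).
    block_during_oop=True: the reset is a no-op while in TIME_OOP (the
    behavior the patch proposes).
    """
    if block_during_oop and state is TimeState.TIME_OOP:
        return state
    return TimeState.TIME_OK
```

In this model, clear_pll(TimeState.TIME_OOP, False) returns TIME_OK, while
clear_pll(TimeState.TIME_OOP, True) leaves the state at TIME_OOP. Every other
state is reset to TIME_OK under both policies, which is the single removed
"line back to TIME_OK" that the thread is debating.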