Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754874AbbBQXQV (ORCPT ); Tue, 17 Feb 2015 18:16:21 -0500 Received: from mail-ie0-f181.google.com ([209.85.223.181]:36227 "EHLO mail-ie0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751649AbbBQXQT (ORCPT ); Tue, 17 Feb 2015 18:16:19 -0500 MIME-Version: 1.0 In-Reply-To: <1423749499-18520-1-git-send-email-prarit@redhat.com> References: <1423749499-18520-1-git-send-email-prarit@redhat.com> Date: Tue, 17 Feb 2015 15:16:18 -0800 Message-ID: Subject: Re: [PATCH] time, ntp: Do not update time_state in middle of leap second [v3] From: John Stultz To: Prarit Bhargava Cc: lkml , Thomas Gleixner , Miroslav Lichvar , Peter Zijlstra Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6529 Lines: 150 On Thu, Feb 12, 2015 at 5:58 AM, Prarit Bhargava wrote: > During leap second insertion testing it was noticed that a small window > exists where the time_state could be reset such that > time_state = TIME_OK, which then causes the leap second to not occur, or > causes the entire leap second state machine to fail with time_state = > TIME_INS at the end of the leap second. > > The test did the following in userspace: > > tx.modes = ADJ_STATUS; > tx.status = STA_INS; > > /* send leap second request */ > ret = adjtimex(&tx); > > /* Check adjtimex output every half second */ > now = tx.time.tv_sec; > while (now < next_leap+2) { > char buf[26]; > ret = adjtimex(&tx); > > ctime_r(&tx.time.tv_sec, buf); > buf[strlen(buf)-1] = 0; /*remove trailing\n */ > > printf("%s + %6ld us\t%s\n", > buf, > tx.time.tv_usec, > time_state_str(ret)); > now = tx.time.tv_sec; > /* Sleep for another half second */ > ts.tv_sec = 0; > ts.tv_nsec = NSEC_PER_SEC/2; > clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, NULL); > } > > which was intended to mimic the insertion of a leap second. A > successful run of the test would result in the time_state transitioning > from TIME_OK to TIME_INS, then to TIME_OOP when the leap second was > inserted, and then to TIME_WAIT when the leap second was completed. While > running this code failures were seen in which the time_state remained TIME_INS, > even though the leap second had occurred. > > After some investigation it was noted that the test contained a small error: > the test does not reinitialize tx.status and reissues the STA_INS every > 1/2 second. As a result of this broken test, the following failure was noticed > (the output below is a mix of kernel messages and the output from the test > program, the remaining annotations are printk's in the code and my own > additional notes): > > [ 942.952833] time_state [1] change from TIME_OK to TIME_INS > > Fri Feb 13 18:59:51 2015 + 318126 us TIME_INS > Fri Feb 13 18:59:51 2015 + 818167 us TIME_INS > Fri Feb 13 18:59:52 2015 + 318208 us TIME_INS > Fri Feb 13 18:59:52 2015 + 818248 us TIME_INS > Fri Feb 13 18:59:53 2015 + 318290 us TIME_INS > Fri Feb 13 18:59:53 2015 + 818331 us TIME_INS > Fri Feb 13 18:59:54 2015 + 318372 us TIME_INS > Fri Feb 13 18:59:54 2015 + 818413 us TIME_INS > Fri Feb 13 18:59:55 2015 + 318454 us TIME_INS > Fri Feb 13 18:59:55 2015 + 818495 us TIME_INS > Fri Feb 13 18:59:56 2015 + 318534 us TIME_INS > Fri Feb 13 18:59:56 2015 + 818575 us TIME_INS > Fri Feb 13 18:59:57 2015 + 318617 us TIME_INS > Fri Feb 13 18:59:57 2015 + 818660 us TIME_INS > Fri Feb 13 18:59:58 2015 + 318702 us TIME_INS > Fri Feb 13 18:59:58 2015 + 818744 us TIME_INS > Fri Feb 13 18:59:59 2015 + 318785 us TIME_INS > Fri Feb 13 18:59:59 2015 + 818837 us TIME_INS > > [ 952.953143] time_state [4] change from TIME_INS to TIME_OOP > [ 952.953150] Clock: inserting leap second 23:59:60 UTC > [ 953.299905] process_adj_status: insert_leap_sec[1223] setting time_state back > to TIME_OK [1, 1] <<< adjtimex() call every 1/2 second > [ 953.299913] time_state [9] change from TIME_OOP to TIME_OK > > Fri Feb 13 18:59:59 2015 + 318878 us TIME_OK > Fri Feb 13 18:59:59 2015 + 818931 us TIME_OK > > [ 954.064237] time_state [1] change from TIME_OK to TIME_INS > > Fri Feb 13 19:00:00 2015 + 318972 us TIME_INS > Fri Feb 13 19:00:00 2015 + 819012 us TIME_INS > Fri Feb 13 19:00:01 2015 + 319051 us TIME_INS > Fri Feb 13 19:00:01 2015 + 819089 us TIME_INS > Fri Feb 13 19:00:02 2015 + 319128 us TIME_INS > > As previously stated, the time_state remains TIME_INS even though the leap > second has already occurred @ 952.953150. > > The test was changed to reset tx.status to 0 in the loop, and the test then > succeeded with a 100% rate with the time state ending in TIME_WAIT. > > While this is highly unlikely to ever happen in the real world it is > still something we should protect against, as breaking the state machine > is bad. > > If the time_state == TIME_OOP (ie, the leap second is in progress) do not > allow an external update to time_state in process_adj_status(). This will > prevent external adjtimex() calls from breaking the leap second state > machine. > > [v2]: Only block time_state change when TIME_OOP > [v3]: Write a much more detailed explanation of the bug. Ok, thanks for the more verbose explanation. Although this is more a history of what you've seen rather then the crux of the change. To distill this down just a bit, the point is the usual mode for NTP time_state machine looks like: TIME_OK -> TIME_INS -> TIME_OOP | | v v TIME_DEL ------------> TIME_WAIT -(back)-> TIME_OK (hopefully the ascii art survives here) Now, from any of these states, currently if adjtimex is called w/ the STA_PLL bit cleared (after STA_PLL was set), we reset back to TIME_OK, effectively cancelling any transitions. (You'll have to imagine a line from any of the states back to TIME_OK, since that's going to be too ugly to do in ascii) Your patch is trying to remove the line back from TIME_OOP back to TIME_OK. Basically stopping the ability to reset the ntp state during a leapsecond. I do get that the behavior seen was strange due to a bug in the test code which caused unexpected cancellation of state, but I'm not sure if we should change the behavior to enforce that cancellation not be possible. I could imagine some logic which really wants to reset the state, which just by chance lands during a leap second, and the application is confused since the state change didn't occur as expected. So I guess I'm not seeing that the state machine is actually "broken" in this case that you've outlined. If you can articulate better why the OOP -> OK transition is truly invalid, I'd be interested in hearing, but I'm not sure I want to risk a behavioral change unless there's wide agreement. thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/