2013-09-04 09:56:15

by Arun Sharma

Subject: clock_gettime_ns

A couple of years ago Andy posted this patch series:

http://thread.gmane.org/gmane.linux.kernel/1233209/

These patches have been in use at Facebook for a couple of years and,
along with a vDSO implementation of thread_cpu_time(), they have proven
useful for our profilers.

I didn't see any arguments against this patch series. Did I miss some
discussion on the topic?

-Arun


2013-09-04 18:52:01

by Andy Lutomirski

Subject: Re: clock_gettime_ns

I think that most of the hangup was a lack of agreement on how the API
should work wrt leap seconds.

I've always thought that the Right Way to represent a UTC time is
nanoseconds since some epoch, where every potential leap second
counts.

Pros:
- Unambiguously convertible to and from
year/month/day/hour/minute/second/nanosecond.
- Monotonic
- Compact

Cons:
- Computing differences between timestamps requires a table. (Note:
y/m/d/h/m/s/ns has the same problem.)
- Weird: no one does this
- If you naively subtract times, you end up with jumps forward. (But
jumps forward are much less likely to break things than jumps
backwards.)
- Almost, but not quite, compatible with timespec, so it could cause confusion.

If someone wants a hard problem, find a way to implement clock_gettime
that almost never spins or otherwise blocks and is continuous. I've
thought about it a bit and have something that almost works.

--Andy

On Wed, Sep 4, 2013 at 2:18 AM, Arun Sharma wrote:
> A couple of years ago Andy posted this patch series:
>
> http://thread.gmane.org/gmane.linux.kernel/1233209/
>
> These patches have been in use at facebook for a couple of years and along
> with a vDSO implementation of thread_cpu_time(), they have proven useful for
> our profilers.
>
> I didn't see any arguments against this patch series. Did I miss some
> discussion on the topic?
>
> -Arun



--
Andy Lutomirski
AMA Capital Management, LLC

2013-09-04 19:17:54

by John Stultz

Subject: Re: clock_gettime_ns

On Wed, Sep 4, 2013 at 2:18 AM, Arun Sharma wrote:
> A couple of years ago Andy posted this patch series:
>
> http://thread.gmane.org/gmane.linux.kernel/1233209/
>
> These patches have been in use at facebook for a couple of years and along
> with a vDSO implementation of thread_cpu_time(), they have proven useful for
> our profilers.
>
> I didn't see any arguments against this patch series. Did I miss some
> discussion on the topic?

(I've got a new email address, just fyi)

So, looking at the thread, I think Richard brought up the issue that
the net performance gain with the new interface wasn't significant
after the optimizations were applied to both interfaces.

If we're going to add a new interface that uses something other than a
timespec, we likely need to put some serious thought into that new
type, and see how it could be used across a number of syscalls. Some
of the discussion around dealing with the 2038 issue touched on this.

Getting those optimizations to the existing interface merged would
be nice, though. Anyone want to resend the patch?

thanks
-john

2013-09-04 19:21:04

by John Stultz

Subject: Re: clock_gettime_ns

On Wed, Sep 4, 2013 at 11:51 AM, Andy Lutomirski wrote:
> I think that most of the hangup was a lack of agreement on how the API
> should work wrt leap seconds.

I don't recall this objection. The interface uses existing clockids,
so it probably should keep the existing leap-second behavior of those
clockids.

> I've always thought that the Right Way to represent a UTC time is
> nanoseconds since some epoch, where every potential leap second
> counts.

Check out the CLOCK_TAI clockid merged in 3.10.

thanks
-john
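
For readers who have not used CLOCK_TAI, here is a minimal userspace
sketch of reading it next to CLOCK_REALTIME. The helper name
tai_minus_utc_seconds() is made up for illustration, and the fallback
value 11 for CLOCK_TAI is an assumption taken from the Linux UAPI
headers for libcs that do not define it. The two reads are not atomic,
so the result is rounded to whole seconds. On a machine where nothing
has set the kernel's TAI offset, the result will simply be 0.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>

#ifndef CLOCK_TAI
#define CLOCK_TAI 11 /* assumed Linux value; older libcs may lack the macro */
#endif

/*
 * Stores CLOCK_TAI minus CLOCK_REALTIME, rounded to whole seconds, in *out.
 * Returns 0 on success, -1 if either clock cannot be read.
 */
static int tai_minus_utc_seconds(int *out)
{
    struct timespec tai, utc;

    if (clock_gettime(CLOCK_TAI, &tai) != 0 ||
        clock_gettime(CLOCK_REALTIME, &utc) != 0)
        return -1;

    /* REALTIME is read slightly later, so diff_ns sits just below the offset. */
    int64_t diff_ns = (int64_t)(tai.tv_sec - utc.tv_sec) * 1000000000LL
                    + (tai.tv_nsec - utc.tv_nsec);

    /* Adding half a second before dividing rounds to the nearest second. */
    *out = (int)((diff_ns + 500000000LL) / 1000000000LL);
    return 0;
}
```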

2013-09-04 20:24:17

by Andy Lutomirski

Subject: Re: clock_gettime_ns

On Wed, Sep 4, 2013 at 12:17 PM, John Stultz wrote:
> On Wed, Sep 4, 2013 at 2:18 AM, Arun Sharma wrote:
>> A couple of years ago Andy posted this patch series:
>>
>> http://thread.gmane.org/gmane.linux.kernel/1233209/
>>
>> These patches have been in use at facebook for a couple of years and along
>> with a vDSO implementation of thread_cpu_time(), they have proven useful for
>> our profilers.
>>
>> I didn't see any arguments against this patch series. Did I miss some
>> discussion on the topic?
>
> (I've got a new email address, just fyi)
>
> So, looking at the thread, I think Richard brought up the issue that
> the net performance gain with the new interface wasn't significant
> after the optimizations were applied to both interfaces.
>
> If we're going to add a new interface that uses something other than a
> timespec, we likely need to put some serious thought into that new
> type, and see how it could be used across a number of syscalls. Some
> of the discussion around dealing with the 2038 issue touched on this.
>
> But getting those optimizations to the existing interface merged would
> be nice, though. Anyone want to resend the patch?

It's already in. See 5f293474c4c6c4dc2baaf2dfd486748b5986de76, etc.

--Andy

2013-09-04 20:33:25

by Andy Lutomirski

Subject: Re: clock_gettime_ns

On Wed, Sep 4, 2013 at 12:20 PM, John Stultz wrote:
> On Wed, Sep 4, 2013 at 11:51 AM, Andy Lutomirski wrote:
>> I think that most of the hangup was a lack of agreement on how the API
>> should work wrt leap seconds.
>
> I don't recall this objection. The interface uses existing clockids,
> so it probably should keep the existing leap-second behavior of those
> clockids.
>
>> I've always thought that the Right Way to represent a UTC time is
>> nanoseconds since some epoch, where every potential leap second
>> counts.
>
> Check out the CLOCK_TAI clockid merged in 3.10.
>

I never really liked that -- CLOCK_TAI doesn't tell you what time it is
in any format that normal people understand.

I'd advocate for going whole hog and returning, atomically:

- TAI (nanoseconds from epoch)
- UTC - TAI (seconds or nanoseconds) *
- TAI - CLOCK_MONOTONIC (nanoseconds)
- a leap second flag.

* There are various ways to define this. My fancy UTC - TAI wouldn't
actually need the leap-second flag, since the UTC time would indicate
leap seconds directly. With the conventional approach, someone would
have to decide whether the leap second count increments at the
beginning or the end of the leap second.

--Andy
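
Userspace cannot get the four items above atomically today, but it can
approximate them. The sketch below is only an illustration of that
approximation: struct time_snapshot and take_snapshot() are hypothetical
names, not a proposed ABI. It brackets the TAI and REALTIME reads
between two CLOCK_MONOTONIC reads and keeps the attempt with the
narrowest bracket. The leap flag is always 0 because clock_gettime()
has no way to report it, which is part of the motivation for a single
call.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>

#ifndef CLOCK_TAI
#define CLOCK_TAI 11 /* assumed Linux value; older libcs may lack the macro */
#endif

/* Hypothetical layout mirroring the four items in the list above. */
struct time_snapshot {
    int64_t tai_ns;            /* TAI, nanoseconds since the epoch */
    int64_t utc_minus_tai_ns;  /* CLOCK_REALTIME - CLOCK_TAI */
    int64_t tai_minus_mono_ns; /* CLOCK_TAI - CLOCK_MONOTONIC */
    int     leap_in_progress;  /* always 0: not observable through clock_gettime */
};

/* Reads one clock as nanoseconds; INT64_MIN signals failure. */
static int64_t read_ns(clockid_t id)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return INT64_MIN;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Returns 0 on success, -1 if any clock is unavailable. */
static int take_snapshot(struct time_snapshot *out)
{
    int64_t best_width = INT64_MAX;

    for (int attempt = 0; attempt < 8; attempt++) {
        int64_t m0  = read_ns(CLOCK_MONOTONIC);
        int64_t tai = read_ns(CLOCK_TAI);
        int64_t utc = read_ns(CLOCK_REALTIME);
        int64_t m1  = read_ns(CLOCK_MONOTONIC);

        if (m0 == INT64_MIN || tai == INT64_MIN ||
            utc == INT64_MIN || m1 == INT64_MIN)
            return -1;

        /* Keep the sample whose monotonic bracket was tightest. */
        if (m1 - m0 < best_width) {
            best_width = m1 - m0;
            out->tai_ns = tai;
            out->utc_minus_tai_ns = utc - tai;
            out->tai_minus_mono_ns = tai - (m0 + (m1 - m0) / 2);
            out->leap_in_progress = 0;
        }
    }
    return 0;
}
```

The residual error is bounded by the width of the best bracket, which
is why a kernel-side atomic read would still be strictly better.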

2013-09-04 20:50:35

by John Stultz

Subject: Re: clock_gettime_ns

On 09/04/2013 01:23 PM, Andy Lutomirski wrote:
> On Wed, Sep 4, 2013 at 12:17 PM, John Stultz wrote:
>> On Wed, Sep 4, 2013 at 2:18 AM, Arun Sharma wrote:
>>> A couple of years ago Andy posted this patch series:
>>>
>>> http://thread.gmane.org/gmane.linux.kernel/1233209/
>>>
>>> These patches have been in use at facebook for a couple of years and along
>>> with a vDSO implementation of thread_cpu_time(), they have proven useful for
>>> our profilers.
>>>
>>> I didn't see any arguments against this patch series. Did I miss some
>>> discussion on the topic?
>> (I've got a new email address, just fyi)
>>
>> So, looking at the thread, I think Richard brought up the issue that
>> the net performance gain with the new interface wasn't significant
>> after the optimizations were applied to both interfaces.
>>
>> If we're going to add a new interface that uses something other than a
>> timespec, we likely need to put some serious thought into that new
>> type, and see how it could be used across a number of syscalls. Some
>> of the discussion around dealing with the 2038 issue touched on this.
>>
>> But getting those optimizations to the existing interface merged would
>> be nice, though. Anyone want to resend the patch?
> It's already in. See 5f293474c4c6c4dc2baaf2dfd486748b5986de76, etc.
Great!
-john

2013-09-04 20:55:00

by John Stultz

Subject: Re: clock_gettime_ns

On 09/04/2013 01:33 PM, Andy Lutomirski wrote:
> On Wed, Sep 4, 2013 at 12:20 PM, John Stultz wrote:
>> On Wed, Sep 4, 2013 at 11:51 AM, Andy Lutomirski wrote:
>>> I think that most of the hangup was a lack of agreement on how the API
>>> should work wrt leap seconds.
>> I don't recall this objection. The interface uses existing clockids,
>> so it probably should keep the existing leap-second behavior of those
>> clockids.
>>
>>> I've always thought that the Right Way to represent a UTC time is
>>> nanoseconds since some epoch, where every potential leap second
>>> counts.
>> Check out the CLOCK_TAI clockid merged in 3.10.
>>
> I never really liked that -- CLOCK_TAI doesn't tell what time it is in
> any format that normal people understand.
>
> I'd advocate for going whole hog and returning, atomically:
>
> - TAI (nanoseconds from epoch)
> - UTC - TAI (seconds or nanoseconds) *
> - TAI - CLOCK_MONOTONIC (nanoseconds)
> - a leap second flag.
>
> * There are various ways to define this. My fancy UTC - TAI wouldn't
> actually need the leap-second flag, since the UTC time would indicate
> leap seconds directly. With the conventional approach, someone would
> have to decide whether the leap second count increments at the
> beginning or the end of the leap second.

Well, adjtimex() gives you UTC & tai offset & leapsecond flag in one go.

thanks
-john
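
A minimal sketch of the adjtimex() query mentioned above, assuming
Linux with glibc's <sys/timex.h>. The struct utc_reading container and
the function name are hypothetical. Passing modes == 0 makes the call a
read-only query; the return value is the clock state, and TIME_OOP
means an inserted leap second is in progress. The time field carries
nanoseconds in tv_usec when STA_NANO is set in status, microseconds
otherwise.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/timex.h>

/* Hypothetical container for the values adjtimex() hands back in one call. */
struct utc_reading {
    int64_t utc_sec;          /* UTC seconds since the POSIX epoch */
    int64_t utc_nsec;         /* fractional part, normalized to nanoseconds */
    int     tai_offset;       /* TAI - UTC in seconds, as the kernel knows it */
    int     leap_in_progress; /* 1 while the clock state is TIME_OOP */
};

/* Returns 0 on success, -1 if the adjtimex() query fails. */
static int read_utc_with_adjtimex(struct utc_reading *out)
{
    struct timex tx = { .modes = 0 }; /* modes == 0: query only, change nothing */
    int state = adjtimex(&tx);

    if (state == -1)
        return -1;

    out->utc_sec = tx.time.tv_sec;
    /* With STA_NANO the tv_usec field already holds nanoseconds. */
    out->utc_nsec = (tx.status & STA_NANO) ? tx.time.tv_usec
                                           : tx.time.tv_usec * 1000L;
    out->tai_offset = tx.tai;
    out->leap_in_progress = (state == TIME_OOP);
    return 0;
}
```

Note that a state of TIME_ERROR (clock not synchronized) is still a
successful query; only -1 indicates that the call itself failed.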

2013-09-04 22:29:19

by H. Peter Anvin

Subject: Re: clock_gettime_ns

On 09/04/2013 01:54 PM, John Stultz wrote:
>>
>> I'd advocate for going whole hog and returning, atomically:
>>
>> - TAI (nanoseconds from epoch)
>> - UTC - TAI (seconds or nanoseconds) *
>> - TAI - CLOCK_MONOTONIC (nanoseconds)
>> - a leap second flag.
>>
>> * There are various ways to define this. My fancy UTC - TAI wouldn't
>> actually need the leap-second flag, since the UTC time would indicate
>> leap seconds directly.

Not so (see below).

> With the conventional approach, someone would
>> have to decide whether the leap second count increments at the
>> beginning or the end of the leap second.
>
> Well, adjtimex() gives you UTC & tai offset & leapsecond flag in one go.
>

But not fractional-second information, right? I believe it would be
desirable if we can create a small structure (<= 16 bytes) for this.

UTC - TAI is always an integral number of seconds, possibly negative
(unlikely, but...)

Something like:

struct time_ns {
        u64 tai_s;
        u32 tai_ns;
        s16 utcdelta;   /* TAI - UTC */
        u8  leap;       /* Positive leap second in progress */
        u8  pad;        /* Something useful here maybe? */
};

Why the leap second flag? It is necessary to represent the 61st second
in a minute during a positive leap second. Consider the below
(artificial) cases:

(leap second)
TAI     31536000  31536001  31536002  31536003
Delta          2         2         ?         3
UTC     23:59:58  23:59:59  23:59:60  00:00:00

(no leap second)
TAI     31536000  31536001  31536002  31536003
Delta          2         2         2         2
UTC     23:59:58  23:59:59  00:00:00  00:00:01

(no leap second)
TAI     31536000  31536001  31536002  31536003
Delta          3         3         3         3
UTC     23:59:57  23:59:58  23:59:59  00:00:00

There simply is no sufficiently meaningful value that can be put on the
delta during a positive leap second. Both 2 and 3 would be wrong in the
above example, giving UTC of either 00:00:00 or 23:59:59.
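
To make the role of the flag concrete, here is a minimal decoding
sketch using userspace fixed-width types in place of u64/u32/s16/u8.
It rests on one assumption that the proposal itself does not settle:
while leap is set, utcdelta already holds the post-leap value (3 in the
example), so tai_s - utcdelta lands on 23:59:59 and the flag turns the
label into 23:59:60. The function name format_utc_tod() is made up.

```c
#include <stdint.h>
#include <stdio.h>

struct time_ns {
    uint64_t tai_s;
    uint32_t tai_ns;
    int16_t  utcdelta; /* TAI - UTC */
    uint8_t  leap;     /* positive leap second in progress */
    uint8_t  pad;
};

/*
 * Writes the UTC time of day as HH:MM:SS into buf.
 * Assumption (not from the proposal): during the leap second, utcdelta
 * already has its post-leap value, so the subtraction yields 23:59:59
 * and the leap flag bumps the seconds field to 60.
 */
static void format_utc_tod(const struct time_ns *t, char *buf, size_t len)
{
    uint64_t utc_s = t->tai_s - (uint64_t)(int64_t)t->utcdelta;
    unsigned tod = (unsigned)(utc_s % 86400);

    snprintf(buf, len, "%02u:%02u:%02u",
             tod / 3600, tod % 3600 / 60, tod % 60 + (t->leap ? 1u : 0u));
}
```

With the values from the first table, TAI 31536002 with utcdelta 3 and
leap set formats as 23:59:60, and TAI 31536003 with leap clear formats
as 00:00:00. The struct is 16 bytes on common ABIs, matching the size
goal stated earlier.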

There is a way to do without the leap second flag by making UTC the main
time; this does have the advantage of higher compatibility with time_t,
struct timespec, etc:

struct timespecx {
        time_t tx_sec;          /* POSIX UTC seconds */
        u32    tx_ns;           /* Nanoseconds */
        s32    tx_taidelta;     /* TAI - UTC */
};

The trick here is that tx_ns can grow all the way up to 1,999,999,999
during a positive leap second.
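
A minimal sketch of how a consumer might decode this layout, using
int64_t in place of time_t so the example does not depend on the
platform. It assumes, which the proposal does not spell out, that
tx_sec stays on the 23:59:59 value and tx_taidelta keeps its pre-leap
value for the whole leap second, so that TAI remains a plain sum of the
three fields. The helper names are hypothetical, and tx_sec is assumed
to be non-negative.

```c
#include <stdint.h>
#include <stdio.h>

struct timespecx {
    int64_t  tx_sec;      /* POSIX UTC seconds (time_t in the proposal) */
    uint32_t tx_ns;       /* nanoseconds, up to 1,999,999,999 in a leap second */
    int32_t  tx_taidelta; /* TAI - UTC */
};

#define NSEC_PER_SEC 1000000000LL

/* TAI in nanoseconds: continuous across the leap under the stated assumption. */
static int64_t timespecx_to_tai_ns(const struct timespecx *t)
{
    return (t->tx_sec + t->tx_taidelta) * NSEC_PER_SEC + t->tx_ns;
}

/* Writes HH:MM:SS.nnnnnnnnn; overflow of tx_ns past one second becomes :60. */
static void timespecx_format_tod(const struct timespecx *t, char *buf, size_t len)
{
    unsigned tod   = (unsigned)(t->tx_sec % 86400);
    unsigned extra = t->tx_ns / 1000000000u; /* whole leap seconds in progress */
    unsigned frac  = t->tx_ns % 1000000000u;

    snprintf(buf, len, "%02u:%02u:%02u.%09u",
             tod / 3600, tod % 3600 / 60, tod % 60 + extra, frac);
}
```

Code that only looks at tx_sec sees ordinary POSIX time, which is the
compatibility advantage mentioned above; only code that cares about
the leap second has to look at the overflow in tx_ns.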

(While planning these sorts of things, it is worth noting that it is
at least theoretically possible that another shift in the rotation
of the Earth could one day mean needing multiple leap seconds, so at
least allowing for them would be a good idea. Both proposals above
would handle that -- up to 255 leap seconds for the former and 4 leap
seconds for the latter, either of which should be way more than necessary.)

-hpa

2013-09-04 22:59:39

by John Stultz

Subject: Re: clock_gettime_ns

On 09/04/2013 03:29 PM, H. Peter Anvin wrote:
> On 09/04/2013 01:54 PM, John Stultz wrote:
>>> I'd advocate for going whole hog and returning, atomically:
>>>
>>> - TAI (nanoseconds from epoch)
>>> - UTC - TAI (seconds or nanoseconds) *
>>> - TAI - CLOCK_MONOTONIC (nanoseconds)
>>> - a leap second flag.
>>>
>>> * There are various ways to define this. My fancy UTC - TAI wouldn't
>>> actually need the leap-second flag, since the UTC time would indicate
>>> leap seconds directly.
> Not so (see below).
>
>> With the conventional approach, someone would
>>> have to decide whether the leap second count increments at the
>>> beginning or the end of the leap second.
>> Well, adjtimex() gives you UTC & tai offset & leapsecond flag in one go.
>>
> But not fractional-second information, right? I believe it would be
> desirable if we can create a small structure (<= 16 bytes) for this.

Well, depending on if STA_NANO is set, adjtimex returns either nsec or
usec precision via the timex.time field.

> UTC - TAI is always an integral number of seconds, possibly negative
> (unlikely, but...)
>
> Something like:
>
> struct time_ns {
> u64 tai_s;
> u32 tai_ns;
> s16 utcdelta; /* TAI - UTC */
> u8 leap; /* Positive leap second in progress */
> u8 pad; /* Something useful here maybe? */
> };
>
> Why the leap second flag? It is necessary to represent the 61st second
> in a minute during a positive leap second. Consider the below
> (artificial) cases:
>
> (leap second)
> TAI 31536000 31536001 31536002 31536003
> Delta 2 2 ? 3
> UTC 23:59:58 23:59:59 23:59:60 00:00:00
>
> (no leap second)
> TAI 31536000 31536001 31536002 31536003
> Delta 2 2 2 2
> UTC 23:59:58 23:59:59 00:00:00 00:00:01
>
> (no leap second)
> TAI 31536000 31536001 31536002 31536003
> Delta 3 3 3 3
> UTC 23:59:57 23:59:58 23:59:59 00:00:00
>
> There simply is no sufficiently meaningful value that can be put on the
> delta during a positive leap second. Both 2 and 3 would be wrong in the
> above example, giving UTC of either 00:00:00 or 23:59:59.
>
> There is a way to do without the leap second flag by making UTC the main
> time; this does have the advantage of higher compatibility with time_t,
> struct timespec, etc:
>
> struct timespecx {
> time_t tx_sec; /* POSIX UTC seconds */
> u32 tx_ns; /* Nanoseconds */
> s32 tx_taidelta; /* TAI - UTC */
> };


And again, most of the detail above is already there w/ adjtimex (though
admittedly not in a very tight format).

My concern with adding these details to a timespec-like structure is
that, for most clockids, I'm not sure taidelta would make sense.

Also, there's been talk of a slewed-leap-second clockid, basically UTC
but around the leapsecond it slows down to absorb the extra second. This
means that clockid would have a subsecond offset from TAI.

thanks
-john

2013-09-04 23:04:30

by H. Peter Anvin

Subject: Re: clock_gettime_ns

On 09/04/2013 03:59 PM, John Stultz wrote:
>
> Also, there's been talk of a slewed-leap-second clockid, basically UTC
> but around the leapsecond it slows down to absorb the extra second. This
> means that clockid would have a subsecond offset from TAI.
>

Most of what I have heard seems to center around abolishing leap seconds
entirely. Now, I know that some users do slewed leap seconds as an
unofficial policy to avoid rare events.

-hpa

2013-09-04 23:20:58

by John Stultz

Subject: Re: clock_gettime_ns

On 09/04/2013 04:04 PM, H. Peter Anvin wrote:
> On 09/04/2013 03:59 PM, John Stultz wrote:
>> Also, there's been talk of a slewed-leap-second clockid, basically UTC
>> but around the leapsecond it slows down to absorb the extra second. This
>> means that clockid would have a subsecond offset from TAI.
>>
> Most of what I have heard seems to center around abolishing leap seconds
> entirely. Now, I know that some users do slewed leap seconds as an
> unofficial policy to avoid rare events.
Well, Google does their own slewed leap-seconds internally (using a
modified ntp server to slow CLOCK_REALTIME on clients), and I believe
AIX also provides similar behavior w/ their CLOCK_REALTIME clockid (they
also provide CLOCK_UTC for those who have the need for UTC/leapseconds).
And there's also some occasional talk of trying to standardize a
leap-second-free UTC.

I suspect we have to have an all-of-the-above policy with the kernel. So
we now (as of 3.10) support CLOCK_TAI, as well as the UTC-based
CLOCK_REALTIME. If we can get some agreement on what the
slewed-leapsecond adjustment should look like (have to decide what the
slewing rate/range is: do we absorb the second over the last-hour,
half-hour, 15-minutes before and after?), then we can add such a clockid
(CLOCK_UTC_SLS?) to the kernel as well.

thanks
-john
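
The open question above is the shape of the slew. As an illustration
only, the sketch below assumes the simplest choice: a linear slew in
which the clock runs slow at a constant rate over a window that ends
when the leap second ends. Neither the linear shape nor the window
length is something the thread agreed on, and slewed_elapsed_ns() is a
hypothetical helper, not a kernel interface. A window that covers
window_s seconds on the slewed clock takes window_s + 1 real seconds.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * elapsed_real_ns: real (TAI-rate) nanoseconds since the slew window began.
 * window_s:        length of the window as seen on the slewed clock, in seconds.
 * Returns the nanoseconds the slewed clock has advanced since the window began.
 */
static int64_t slewed_elapsed_ns(int64_t elapsed_real_ns, int64_t window_s)
{
    int64_t real_window_ns = (window_s + 1) * NSEC_PER_SEC;

    if (elapsed_real_ns <= 0)
        return elapsed_real_ns;                /* before the window: no correction */
    if (elapsed_real_ns >= real_window_ns)
        return elapsed_real_ns - NSEC_PER_SEC; /* after it: the second is absorbed */

    /* Inside the window the clock runs at window_s / (window_s + 1) of real rate. */
    return elapsed_real_ns - elapsed_real_ns / (window_s + 1);
}
```

With a one-hour window, halfway through (1800.5 real seconds) the
slewed clock has advanced exactly 1800 seconds, so the offset from TAI
is a fraction of a second at that point, which is the subsecond offset
mentioned earlier in the thread.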


2013-09-04 23:39:07

by Andy Lutomirski

Subject: Re: clock_gettime_ns

On Wed, Sep 4, 2013 at 3:29 PM, H. Peter Anvin wrote:
> On 09/04/2013 01:54 PM, John Stultz wrote:
>>>
>>> I'd advocate for going whole hog and returning, atomically:
>>>
>>> - TAI (nanoseconds from epoch)
>>> - UTC - TAI (seconds or nanoseconds) *
>>> - TAI - CLOCK_MONOTONIC (nanoseconds)
>>> - a leap second flag.
>>>
>>> * There are various ways to define this. My fancy UTC - TAI wouldn't
>>> actually need the leap-second flag, since the UTC time would indicate
>>> leap seconds directly.
>
> Not so (see below).
>
>> With the conventional approach, someone would
>>> have to decide whether the leap second count increments at the
>>> beginning or the end of the leap second.
>>
>> Well, adjtimex() gives you UTC & tai offset & leapsecond flag in one go.
>>
>
> But not fractional-second information, right? I believe it would be
> desirable if we can create a small structure (<= 16 bytes) for this.
>
> UTC - TAI is always an integral number of seconds, possibly negative
> (unlikely, but...)
>
> Something like:
>
> struct time_ns {
> u64 tai_s;
> u32 tai_ns;
> s16 utcdelta; /* TAI - UTC */
> u8 leap; /* Positive leap second in progress */
> u8 pad; /* Something useful here maybe? */
> };
>
> Why the leap second flag? It is necessary to represent the 61st second
> in a minute during a positive leap second. Consider the below
> (artificial) cases:
>
> (leap second)
> TAI 31536000 31536001 31536002 31536003
> Delta 2 2 ? 3
> UTC 23:59:58 23:59:59 23:59:60 00:00:00
>
> (no leap second)
> TAI 31536000 31536001 31536002 31536003
> Delta 2 2 2 2
> UTC 23:59:58 23:59:59 00:00:00 00:00:01
>
> (no leap second)
> TAI 31536000 31536001 31536002 31536003
> Delta 3 3 3 3
> UTC 23:59:57 23:59:58 23:59:59 00:00:00
>
> There simply is no sufficiently meaningful value that can be put on the
> delta during a positive leap second. Both 2 and 3 would be wrong in the
> above example, giving UTC of either 00:00:00 or 23:59:59.
>
> There is a way to do without the leap second flag by making UTC the main
> time; this does have the advantage of higher compatibility with time_t,
> struct timespec, etc:
>
> struct timespecx {
> time_t tx_sec; /* POSIX UTC seconds */
> u32 tx_ns; /* Nanoseconds */
> s32 tx_taidelta; /* TAI - UTC */
> };
>
> The trick here is that tx_ns can grow all the way up to 1,999,999,999
> during a positive leap second.
>
> (Note that while planning these sorts of things it is worth noting that
> it is at least theoretically possible that another shift in the rotation
> of the Earth could one day mean needing multiple leap seconds, so at
> least allowing for them would be a good idea. Both proposals above
> would handle that -- up to 255 leap seconds for the former and 4 leap
> seconds for the latter, either of which should be way more than necessary.)

I suspect that nearly every program will screw this up -- leap seconds
are rare, and the amount of branchy logic needed here is large.

Let me clarify my proposal:

A UTC time is year,month,day,hour,minute,second,fractional seconds.
So 2013/12/31 23:59:60.100 is a valid UTC time, assuming that there's
a leap second then.

Suppose the epoch is 2013/12/31 00:00:00 UTC. Then time 86399.000 is
2013/12/31 23:59:59.000 UTC. Time 86400.000 is 2013/12/31
23:59:60.000 UTC, 86400.100 is 2013/12/31 23:59:60.100 UTC, and
86401.000 is 2014/01/01 00:00:00.000 UTC. This encoding happens
regardless of whether 2013/12/31 actually has a leap second.

So, for the purposes of the encoding, the last day of each month is
86401 seconds long. One of those seconds will most likely not occur.

The benefits are that every possible UTC time has a unique
representation as a single number. That number increases
monotonically with time. The special case happens *every month*, so
any program that screws it up will be obviously wrong.

The main downside I can see is that it's a little strange.

--Andy
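
A minimal sketch of decoding the encoding described above, assuming the
epoch from the example (2013/12/31 00:00:00 UTC) and whole seconds
only. struct utc_fields and decode_slot_seconds() are hypothetical
names. The decoder walks forward one day at a time, treating the last
day of every month as 86401 seconds long, and never consults a leap
second table. A real implementation would want closed-form calendar
arithmetic instead of the day-by-day loop.

```c
#include <stdint.h>

struct utc_fields {
    int year, month, day, hour, minute, second;
};

static int is_leap_year(int y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

static int days_in_month(int y, int m)
{
    static const int days[12] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
    return days[m - 1] + (m == 2 && is_leap_year(y));
}

/* Decodes whole seconds since 2013/12/31 00:00:00 UTC (the example's epoch). */
static struct utc_fields decode_slot_seconds(uint64_t s)
{
    struct utc_fields f = { 2013, 12, 31, 0, 0, 0 };

    for (;;) {
        int last = (f.day == days_in_month(f.year, f.month));
        uint64_t day_len = 86400u + (last ? 1u : 0u); /* month-end days get a slot */

        if (s < day_len)
            break;
        s -= day_len;
        if (last) {
            f.day = 1;
            if (++f.month > 12) {
                f.month = 1;
                f.year++;
            }
        } else {
            f.day++;
        }
    }

    if (s == 86400) { /* only reachable on the last day of a month */
        f.hour = 23;
        f.minute = 59;
        f.second = 60;
    } else {
        f.hour = (int)(s / 3600);
        f.minute = (int)(s % 3600 / 60);
        f.second = (int)(s % 60);
    }
    return f;
}
```

Because the 23:59:60 branch is exercised at every month boundary, a
decoder that gets it wrong drifts by one second per month, which is the
"obviously wrong" property argued for above.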

2013-09-05 01:22:49

by H. Peter Anvin

Subject: Re: clock_gettime_ns

I think it would be crazy encoding UTC with a non-POSIX scheme.

Andy Lutomirski wrote:
>I suspect that nearly every program will screw this up -- leap seconds
>are rare, and the amount of branchy logic needed here is large.
>
>Let me clarify my proposal:
>
>A UTC time is year,month,day,hour,minute,second,fractional seconds.
>So 2013/12/31 23:59:60.100 is a valid UTC time, assuming that there's
>a leap second then.
>
>Suppose the epoch is 2013/12/31 00:00:00 UTC. Then time 86399.000 is
>2013/12/31 23:59:59.000 UTC. Time 86400.000 is 2013/12/31
>23:59:60.000 UTC, 86400.100 is 2013/12/31 23:59:60.100 UTC, and
>86401.000 is 2014/01/01 00:00:00.000 UTC. This encoding happens
>regardless of whether 2013/12/31 actually has a leap second.
>
>So, for the purposes of the encoding, the last day of each month is
>86401 seconds long. One of those seconds will most likely not occur.
>
>The benefits are that every possible UTC time has a unique
>representation as a single number. That number increases
>monotonically with time. The special case happens *every month*, so
>any program that screws it up will be obviously wrong.
>
>The main downside I can see is that it's a little strange.
>
>--Andy

--
Sent from my mobile phone. Please pardon brevity and lack of formatting.

2013-09-05 05:12:10

by Arun Sharma

Subject: Re: clock_gettime_ns

On 9/5/13 12:47 AM, John Stultz wrote:
> If we're going to add a new interface that uses something other than a
> timespec, we likely need to put some serious thought into that new
> type, and see how it could be used across a number of syscalls. Some
> of the discussion around dealing with the 2038 issue touched on this.

[ I know you're not asking for perf data, but it may be useful for new
readers ]

Here's the benchmarking I did in 2011:

http://thread.gmane.org/gmane.linux.kernel/1233758/focus=1233781

Switching from timespec to s64 was worth 21%. My experience over the
years is that this performance delta causes userspace guys to implement
their own TSC based timers, against the advice from kernel developers.

http://code.ohloh.net/search?s=wall%20now%20tsc%20hz&pp=0&fl=C&fl=C%2B%2B&ff=1&mp=1&ml=1&me=1&md=1&filterChecked=true

I worry that trying to solve other clock problems will cause the kernel
to continue to pass the time in memory instead of registers, giving the
userspace TSC based implementations a reason to exist.

-Arun
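
For new readers, a minimal sketch of the interface shape under
discussion: a single signed 64-bit nanosecond count instead of a
struct timespec. The name clock_gettime_ns_emulated() is hypothetical
and this is a userspace stand-in only. It still goes through
clock_gettime() and a timespec in memory, so it shows the calling
convention and the conversion cost, not the speedup measured with the
patched vDSO.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>

/* The multiply and add every caller pays when it wants a flat nanosecond count. */
static inline int64_t timespec_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Userspace stand-in for an s64-returning clock read.
 * Returns nanoseconds on success, -1 if the clock cannot be read.
 */
static int64_t clock_gettime_ns_emulated(clockid_t id)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return -1;
    return timespec_to_ns(&ts);
}
```

A signed 64-bit nanosecond count covers roughly 292 years on either
side of the epoch, so the narrower type is not a range problem for
profiling use.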

2013-09-09 17:47:23

by Andy Lutomirski

Subject: Re: clock_gettime_ns

On Wed, Sep 4, 2013 at 6:22 PM, H. Peter Anvin wrote:
> I think it would be crazy encoding UTC with a non-POSIX scheme.

The whole point is to find a good way to return the time that solves
the problems with the POSIX scheme. Some of the problems, as I see
them, are:

- Performance: seconds + nanoseconds is expensive to compute and
expensive to use.
- Leap seconds, part 1: Times like 23:59:60.1 are not representable.
- Leap seconds, part 2: The limited leap-second support that already
exists (via the NTP APIs) is so obscure that it's frequently broken.
- Offsets between clocks can't be read without using complicated
calls like adjtimex.

I think that coming up with something that's both non-POSIX and
half-arsed is a bad idea, but doing something that's non-POSIX and
well thought-through could be valuable.

--Andy
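
As a small illustration of the "expensive to use" point in the list
above, compare subtracting two timestamps in each representation. The
helper names are hypothetical. The timespec version needs a branch to
borrow from the seconds field; the flat nanosecond version is a single
subtraction.

```c
#include <stdint.h>
#include <time.h>

/* a - b for seconds + nanoseconds: two subtractions plus a borrow fix-up. */
static struct timespec timespec_sub(struct timespec a, struct timespec b)
{
    struct timespec d;

    d.tv_sec = a.tv_sec - b.tv_sec;
    d.tv_nsec = a.tv_nsec - b.tv_nsec;
    if (d.tv_nsec < 0) { /* borrow one second so tv_nsec stays in [0, 1e9) */
        d.tv_sec -= 1;
        d.tv_nsec += 1000000000L;
    }
    return d;
}

/* a - b for flat nanosecond counts: one subtraction, no normalization. */
static int64_t ns_sub(int64_t a, int64_t b)
{
    return a - b;
}
```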

2013-09-11 18:50:29

by Richard Cochran

Subject: Re: clock_gettime_ns

On Mon, Sep 09, 2013 at 10:47:01AM -0700, Andy Lutomirski wrote:
>
> I think that coming up with something that's both non-POSIX and
> half-arsed is a bad idea, but doing something that's non-POSIX and
> well thought-through could be valuable.

I know Harlan Stenn of the Network Time Foundation is working on a new
timestamp API and presented a paper at the conference:

Requirements for UTC and Civil Timekeeping on Earth
A Colloquium Addressing a Continuous Time Standard
University of Virginia, Charlottesville, May 29-31, 2013.

http://www.cacr.caltech.edu/futureofutc/

The slides are on that site, and I would bet that the paper could be
made available. In any case, I think any kind of new time API idea
would benefit from review and acceptance from the NTP and BSD people.

Thanks,
Richard