Computer dates and times

From Helpful
Revision as of 11:30, 17 June 2024 by Helpful (talk | contribs) (→‎Timezones)

Some fragmented programming-related notes, not meant as introduction or tutorial



Broadcasted time synchronization

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


  • GPS - the signal is just there anyway, and basically worldwide
you still need a receiver, and while you can get ~EUR15 modules, that's not the cheapest option to just get time synch.
you can't guarantee it works indoors, inside metal cases, etc.


Network time (NTP)

In electronics, keeping time in the "I can keep counting very regularly" sense (keeping multiple components in sync is another matter) is often done by using an oscillator made to run at a specific frequency.


Such crystal oscillators can be pretty good for pretty cheap, but even if they are off by only one count in a hundred thousand counts, that still amounts to the order of a second per day, a few minutes per year. And making oscillators better certainly helps, but it's a diminishing returns thing, so it doesn't really solve the problem.
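The arithmetic behind that estimate can be sketched as follows. This is only an illustration: it assumes a constant frequency error, and the function name is made up for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def drift_seconds(error_ppm, elapsed_seconds):
    """Accumulated clock error, assuming a constant frequency error in parts per million."""
    return error_ppm * 1e-6 * elapsed_seconds

# "one count in a hundred thousand" is 10 parts per million (ppm)
per_day = drift_seconds(10, SECONDS_PER_DAY)          # about 0.86 seconds
per_year = drift_seconds(10, 365 * SECONDS_PER_DAY)   # about 315 seconds, roughly 5 minutes
```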

Minutes per year is good enough for a lot of things, like your PC's internal clock. Yet there are certainly cases where that much drift matters.


NTP (Network Time Protocol) allows synchronization of clocks over a network, by pointing it at a reference server.

The accuracy of NTP (how close it gets to that reference) varies most with the network connection: network latency (RTT) and the variation in it (mostly due to congestion). RTT may be large relative to the accuracy we want, yet it can be measured and largely compensated for, so it turns out that mainly the variation in latency matters, because that part is much less predictable and so can't be modeled well. We can detect that it's bad by seeing that these numbers vary, but can't really correct for it any better than a good statistical guess.
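As a rough illustration of why a steady RTT can be compensated for while asymmetry cannot, here is the usual four-timestamp calculation as a minimal sketch. The function name and the example numbers are made up; real implementations filter many such samples.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """t1: client send, t2: server receive, t3: server send, t4: client receive.

    t1 and t4 are read from the client clock, t2 and t3 from the server clock.
    Returns (estimated offset of the server relative to the client, round-trip delay).
    """
    delay = (t4 - t1) - (t3 - t2)
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return offset, delay

# Hypothetical exchange: client clock is 0.250 s behind, 10 ms each way, 1 ms server processing
symmetric = ntp_offset_and_delay(100.000, 100.260, 100.261, 100.021)

# Same true offset, but 5 ms out and 15 ms back: the estimate is off by half the asymmetry
asymmetric = ntp_offset_and_delay(100.000, 100.255, 100.256, 100.021)
```

In the symmetric case the estimate is the true 0.250 s. In the asymmetric case it comes out as 0.245 s, even though the measured delay is the same 20 ms.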

Over the modern internet, you can usually expect it to correct to within a few milliseconds of its reference, which is more than good enough for "make my computer time not drift", "make all our computers agree on the time quite well" and a lot of other uses.



There is also SNTP, which is essentially a simplified client. The protocol is the same, they often connect to the same NTP servers(verify), but the client is much simpler to implement because it ignores some aspects (e.g. NTP's drift(verify)), and often has a cruder method of clock adjustment.

It's useful where it is mostly interesting to have clocks say basically the same thing, and/or around embedded devices that have networking but no RTC, and where it is perfectly fine if the result is dozens or even hundreds of milliseconds off.
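To give an idea of how little a crude client needs, here is a sketch of building a request and reading the server's transmit timestamp from a reply. The helper names are made up for the example, and it deliberately ignores round-trip delay, which is exactly the kind of shortcut that costs accuracy.

```python
import struct

NTP_TO_UNIX = 2208988800  # seconds between the NTP epoch (1900-01-01) and the Unix epoch

def build_sntp_request():
    """48-byte client request: leap indicator 0, version 3, mode 3 (client), rest zero."""
    return bytes([(0 << 6) | (3 << 3) | 3]) + bytes(47)

def parse_transmit_time(packet):
    """Return the server's transmit timestamp (bytes 40..47) as Unix time in seconds."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_TO_UNIX + fraction / 2**32
```

For real use you would send the request over UDP to port 123 of a server and hand the reply to parse_transmit_time.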



Some terminology

  • Stratum means "how many hops to a very good time reference", because if there are too many servers doing a few millisecond errors to others, that adds up.
Stratum 0 are clock devices, which reference from an atomic clock, GPS, CDMA, WWV, DCF77, or such,
you could get and use such devices yourself
and are not themselves networked
Stratum 1 are the networked servers that are directly connected to stratum 0 devices (implicitly: with very low and stable latency). They are the best networked reference you can get.
Stratum 2 are servers that connect to stratum 1 servers
Stratum 3 are servers that connect to stratum 2 servers
...and so on, up to 15.
NTP is effectively a hierarchical system, where each server that synchronizes to another has a stratum one higher than that source
(though same-stratum servers can also connect, for other reasons)


So we want to find a low-stratum server.
It's gotten simpler and cheaper to run stratum 1 servers, so there are now more of them around.
This also means there are now enough stratum 2 servers that you get pretty precise time without thinking much. Things like the NTP pool project are great for this, because they effectively send you a shortlist of probably-decent choices that are probably near you. That means you can get that ~millisecond sync without much thought (down from maybe a dozen milliseconds from a thoughtless choice without such help).



  • delay - networking round-trip time (a.k.a. RTT, ping time) to an NTP server
this tends to be at least 7..10ms, just because of broadband latency
not itself the most important metric of a server, but higher latency tends to mean further away and correlate with higher jitter


  • offset (sometimes 'phase')
the difference between the reference time and your system clock(verify)
...which can only be estimated, as it varies with your networking performance
offset involves measurement and calculation, and is a relatively instantaneous figure - if you were to plot it it would be jittery, not smooth.
This is why you don't want to indiscriminately adjust to every offset you find. It's relevant when considering some behaviour and implementation details, and a major reason behind the slow adjustment. Read the NTP documentation if you want to know all the hairy details.
  • jitter (sometimes 'dispersion') - variation in received/calculated offset (verify)
which is related to network jitter but also NTP's cleverness



What happens to your computer's clock



Since a lot of programs do timing of some sort, correcting time disrupts the accuracy.

For example, stepping refers to just changing the time. But that means it is likely to make measured times larger than sensible, or negative. For things like timeouts, this can cause some odd behaviour.

Stepping may be sensible when your current time is very inaccurate - including hardware that has no RTC, which can still have accurate time by running NTP (and stepping to correct time as part of boot).


Slewing means intentionally tweaking the clock speed a little so that the time will crawl towards the intended reference time within reasonable time.

This is the least disruptive for programs that do timing, as you can guarantee the time is monotonic (never changes to a past value), and the length of intervals measured while slewing is inaccurate by only a tiny amount.

As such, slewing is preferred whenever it is possible, and ideally ntpd will only ever need to slew (once it is actively correcting).
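On the program side, code that measures intervals can sidestep steps entirely by not using wall-clock time for that. A minimal sketch in Python, where time.monotonic() is documented to never go backwards and to be unaffected by system clock updates (the timed helper is made up for the example):

```python
import time

def timed(func, *args):
    """Run func and return (result, elapsed seconds), measured on a clock that never steps."""
    start = time.monotonic()
    result = func(*args)
    return result, time.monotonic() - start

result, elapsed = timed(sum, range(1000))
```

Using time.time() for the same purpose could report a negative or hugely inflated duration if the clock is stepped in between.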



NTP will by default refuse to work when you have a time, but it is more than roughly fifteen minutes off (ntpd's default panic threshold is 1000 seconds). This is because it's assumed to be a strange situation that an admin needs to look at.

Servers with huge offsets will never be selected, and if a selected server's offset becomes huge, ntpd will quit.


When looking at servers with small-to-moderate offset, it will take some time to estimate the quality of the time source.

Once (and while) a server is selected, NTP seems to keep a local-to-server offset in mind for a while (rejecting this time) before doing anything (why?(verify)).

Once it wants to correct, ntpd will either step to it (for offsets >128ms), or slew to it (if smaller).

Ideally, once NTP is running, the offset will never become high enough for a step to be necessary, but it can happen.
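The decision described above can be summarized in a few lines. This is a simplified sketch, not ntpd's actual logic (which keeps more state): the function and parameter names are made up, the 1000 second panic threshold is ntpd's documented default, and the 500 ppm slew cap is an assumption about typical kernel limits.

```python
STEP_THRESHOLD = 0.128   # seconds; larger offsets are stepped
PANIC_THRESHOLD = 1000   # seconds; larger offsets are refused unless explicitly allowed
MAX_SLEW_PPM = 500       # assumed cap on how much the clock rate may be adjusted

def plan_correction(offset_seconds, allow_large_first_step=False):
    """Return 'refuse', 'step' or 'slew' for a given offset (hypothetical helper)."""
    magnitude = abs(offset_seconds)
    if magnitude > PANIC_THRESHOLD and not allow_large_first_step:
        return "refuse"
    if magnitude > STEP_THRESHOLD:
        return "step"
    return "slew"

def slew_duration(offset_seconds):
    """Seconds needed to remove an offset when slewing at the maximum rate."""
    return abs(offset_seconds) / (MAX_SLEW_PPM * 1e-6)
```

For example, slew_duration(0.1) is about 200 seconds, which shows why offsets much larger than the step threshold would take impractically long to slew away.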

You can continuously monitor offset, to get an idea of the accuracy you're getting.


NTP accuracy


Stratum-1 hosts can themselves be within ~10 microseconds of UTC.


Each stratum above, being a network connection, may add on the order of perhaps a millisecond.

Exactly how much depends on a bunch of factors, that you cannot know precisely.


Since stratum 1 servers are relatively scarce, and each server can only serve so many hosts, it's much easier to find stratum 2 and 3 servers.


You can expect your clock to become correct to within a millisecond on average.

For a real example, a server on my home broadband has a ~15ms RTT to its NTP server, but the jitter in that RTT averages approximately 1ms in general (less than half of that when the dozen people it serves aren't watching video), which allows for synchronization to within ~0.3ms or so(verify).



Note that it may take hours to days for the offset to the reference to become as low as it can become, not because we don't know the right time, but because NTP typically tries to not jump the clock back and forth in time (which might confuse code that does short-term timeouts), instead opting to spread the adjustment to many tiny ones if it can.

Basic ntpd setup (unices)

(Optional) Make sure your system's timezone is correct.

Not strictly a requirement, but probably useful to you.


(Optional) Configure the time servers you want to use.

Optional in that the default pools are usually fine choices.
See #Servers below.


Set your system clock to within a few minutes of accurate time OR tell the NTP client the first correction is allowed to be huge

The reason is that, when time difference is more than about fifteen minutes, NTP clients may refuse outright, figuring it's a problem that a person needs to look at.
Platforms without a battery backed clock (e.g. raspberry pi) will typically need to say "the first correction is allowed to be a very large jump" on every boot.
You can set date and time manually, but since you've probably just set up NTP to work, ntpd -g -q is probably simpler (-g allows big changes, -q will set the clock and quit instead of continuing to run as a daemon).
Platforms with a battery backed clock tend to only be off by seconds or minutes, and the slow and steady correction is preferred.
(if you just powered it on for the first time in years, you may still want to correct with a big jump, once ever. It should never drift enough to require that ever again)


Enable and start the ntpd service,

so that, after you finish this list, you can feel free to forget about it.


Check that the ntpd service works

you can query localhost about its peers, using ntpq, which will show its current status with various peers
e.g. with watch -n 1 ntpq -pn localhost
It may take a few minutes to show useful statistics for all peers, and to start synchronizing with one (which will be indicated with * as the first character on the line)

If ntpq gives you:

  • localhost: timed out, nothing received, then you probably have an overly strict firewall keeping you from connecting to your ntpd (e.g. not trusting localhost, and dropping the packets), though it can also indicate rejection by ntpd itself via its configuration.(verify)
  • ntpq: read: Connection refused, this may mean ntpd isn't running (possibly because it quit because you started it with a clock more than 15 minutes off), or that it is not configured to allow you.
  • stratum 16 means servers we haven't synced with (...including servers we cannot contact at all)
  • it will take a while for the offset to become as low as it is going to be.
because in regular operation, ntpd corrects time very slowly: slewing is limited to about 500 parts per million, so expect removing even one second of difference this way to take (order of magnitude) half an hour. Differences of minutes are stepped rather than slewed.

Servers


The exchange of time can be done as

  • a client to a listed server (probably the most common)
  • within a network via broadcasts or multicasts
  • between a set of peers that will synchronize with each other (useful for fallback/redundancy)


Assuming you want to pick a server to synchronize with:

You can hand-pick low-stratum servers near you. Or you can be lazy and use the ntp.org pool servers for almost the same effect - ntp.org uses DNS tricks to resolve the names to geographically close hosts. Because they keep track of actual hosts, this means less time worrying about NTP servers that went offline.

In the case of ntpd, having the following is a simple start:

server 0.pool.ntp.org
server 1.pool.ntp.org
server 2.pool.ntp.org
server 3.pool.ntp.org

You can also use specific subdomains to narrow down the options to things that are on the same continent, or in the same country. This may help it find better options sooner.

See pages like http://support.ntp.org/bin/view/Servers/NTPPoolServers

Note that the numbers in these hostnames (0 through 3) are not strata.

See also


Standards:

  • RFC 5905, "Network Time Protocol Version 4: Protocol and Algorithms Specification" (2010) (current)
  • RFC 1305, "Network Time Protocol (Version 3) - Specification, Implementation and Analysis" (1992) (now obsoleted)
  • RFC 1119, "Network Time Protocol (Version 2) - Specification and Implementation" (1989) (now obsoleted)
  • RFC 1059, "Network Time Protocol (Version 1) - Specification and Implementation" (1988) (now obsoleted)
  • RFC 958, "Network Time Protocol (NTP)" (1985) (now obsoleted)

...well, it's a little more interesting



Common date formatting

mm/dd versus dd/mm

The tendency to write things like 02/06/2009 is internationally problematic.

A lot of the world would read it as June 2nd, 2009 (dd/mm/yyyy), yet the US, Canada and a handful of other nations may easily read it as February 6th, 2009 (mm/dd/yyyy).

This is only unambiguous when the day is the 13th or later, so it's ambiguous for ~40% of all dates (worse if the year is also 2-digit) when you don't know the nationality of the person writing it. So please don't do this.
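That percentage is easy to check by brute force. A small sketch (the function name is made up), which counts a date as ambiguous when the day could also be a month number and swapping the two gives a different date:

```python
from datetime import date, timedelta

def ambiguous_fraction(year):
    """Fraction of days in a year whose dd/mm and mm/dd readings are both valid and differ."""
    day = date(year, 1, 1)
    total = ambiguous = 0
    while day.year == year:
        total += 1
        if day.day <= 12 and day.day != day.month:
            ambiguous += 1
        day += timedelta(days=1)
    return ambiguous / total

fraction_2009 = ambiguous_fraction(2009)   # 132 of 365 days, about 36%
```

If you also count dates like 03/03, where you cannot tell the format even though both readings agree, it is 144 of 365, which is where a figure near 40% comes from.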

ISO 8601

ISO 8601, often referred to as ISO date format, is one solution to this last problem. It uses an order previously mostly unused, and requires a four-digit year. These ISO dates are always formatted YYYY-MM-DD, so identifiable as this format and non-ambiguous.

It is fairly generally accepted. For example, Canada uses it on official documents, and modern programming languages (e.g. .NET date functions) support it out of the box, and various people seem to like its date formatting clarity.


ISO 8601 allows varied levels of detail:

# dates: years, months, days
YYYY
YYYY-MM
YYYY-MM-DD
#date&time, with time, to minute, to second, or using second with fraction:
YYYY-MM-DDThh:mmTZD      (e.g. 1997-07-16T19:20+01:00)
YYYY-MM-DDThh:mm:ssTZD   (e.g. 1997-07-16T19:20:30+01:00)
YYYY-MM-DDThh:mm:ss.sTZD (e.g. 1997-07-16T19:20:30.45+01:00)

...and, in fact, more, including a compact form like 19940203T141529Z


Notes:

  • The T is a literal T, which signals that a time follows.
    • When displaying these dates, the T is sometimes a space. (Can also be done in storage/communication, if "partners in information interchange" mutually agree)


  • TZD refers to being a Time Zone Designator, and should be one of:
    • Z ('Zulu') meaning UTC
    • +hh:mm
    • -hh:mm


Note that lexical sorting of ISO 8601 strings (of the same form) is also chronological sorting

...except when the timezone offsets differ
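As an illustration in Python, whose datetime.isoformat() and datetime.fromisoformat() produce and read this form, and of the sorting caveat (the example timestamps are made up):

```python
from datetime import datetime, timedelta, timezone

stamp = datetime(1997, 7, 16, 19, 20, 30, tzinfo=timezone(timedelta(hours=1)))
text = stamp.isoformat()                 # '1997-07-16T19:20:30+01:00'
parsed = datetime.fromisoformat(text)    # round-trips, offset included

# Lexical order matches chronological order only once everything is in one zone
events = [
    datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc),
    datetime(2024, 3, 1, 12, 30, tzinfo=timezone(timedelta(hours=2))),  # 10:30 UTC
]
as_written = sorted(e.isoformat() for e in events)
as_utc = sorted(e.astimezone(timezone.utc).isoformat() for e in events)
```

Sorted as written, the 12:00 UTC event comes first even though it happened later. After converting to UTC, the order is correct.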


RFC3339

A variation of ISO8601, close enough that you could consider it a restricted profile of ISO8601.


The idea seems to be simplicity: because it has fewer forms, it's easier to conform to RFC3339 than to all of ISO8601 (rather than just its usual form).

And that's useful when sending complete date-times through APIs and such.


The largest difference is that it doesn't have the shortened forms. Only the fractional seconds are optional.

There are some other subtleties, though, like how it allows a negative sign on a timezone offset of zero, which ISO8601 defines must be a +

W3C Date and Time Format, a.k.a. W3C Datetime / W3CDTF

A W3 note/overview/profile of the basic parts and uses of ISO 8601, omitting some of the complex details and focusing on just the date-and-time part.

Since it allows most of the shorter forms, this is technically less strict than RFC3339.

https://www.w3.org/TR/NOTE-datetime

RFC 822/1123

For example:

Thu, 11 Oct 07 12:38:29 GMT 
Thu, 11 Oct 2007 12:38:29 GMT 

RFC 822 first specified the format, with a 2-digit year.

RFC 1123 updated this to allow 4-digit years, and strongly recommends using them.


You can leave off the weekday.

Timezones can be specified in a number of ways:

  • 4-digit offset: +0330, -0100 (preferred format)
  • pre-defined zones: UT (refers to UTC), GMT, EST, EDT, CST, CDT, MST, MDT, PST, PDT
  • Military: Z for 0, and A-Z except for J


In terms of strftime(), assuming you've already converted to GMT:

"%a, %d %b %Y %H:%M:%S GMT"

(Or, if you're insane enough to like two-digit years, "%a, %d %b %y %H:%M:%S GMT")
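One caveat with strftime here is that %a and %b follow the current locale, while this format wants English names. In Python, a sketch that avoids this uses the standard library's email.utils, which always writes English names:

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

moment = datetime(2007, 10, 11, 12, 38, 29, tzinfo=timezone.utc)
header_value = format_datetime(moment, usegmt=True)   # 'Thu, 11 Oct 2007 12:38:29 GMT'
round_trip = parsedate_to_datetime(header_value)      # aware datetime, same instant
```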

RFC 2822

Looks equivalent to RFC1123. (verify)

Mostly a documentation thing because 2822 updates 822, but picks up the definition from 1123.

RFC 850/1036

Looks like (weekday optional):

Sunday, 06 Nov 94 08:49:37 GMT

Defined by RFC 1036 (which obsoletes RFC 850 where the format was originally defined).

While the format sees relatively little use in standards, it is not uncommon to see real-world dates that should be formatted in the 822 way that look rather like this instead -- in part because this format is valid 822 format as well.

In strftime (assuming you've converted to GMT) (verify)

%A, %d %h %y %H:%M:%S GMT


Observed variation:

  • dashes in the date
  • four-digit year

asctime/ctime

The C library's asctime() and ctime() output: (verify)

Sat May 20 15:21:51 2000
Thu Feb  3 17:03:55 GMT 1994


Common logfile format

Common Log Format is used by various webservers (e.g. Apache) and includes a date like:

03/Feb/1994:17:03:55 -0700

In strftime terms:

%d/%b/%Y:%H:%M:%S %z
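The same format string works for parsing, for example in Python. This sketch assumes the default C/English locale, because %b is locale-dependent:

```python
from datetime import datetime, timezone

CLF_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

stamp = datetime.strptime("03/Feb/1994:17:03:55 -0700", CLF_FORMAT)   # aware, UTC-07:00
in_utc = stamp.astimezone(timezone.utc)                               # 1994-02-04 00:03:55+00:00
back_to_text = stamp.strftime(CLF_FORMAT)
```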

More notes

Timezones

Suggestion: work in UTC

GMT / UTC, daylight savings

Many date formats do not allow for specification of daylight savings details.


If you care about showing times correctly to different people (rather than just logging, where 'what the system saw at the time' is often enough), then you probably in general want to consider storing times after converting to UTC, as daylight savings does not apply to it. GMT itself has no daylight savings either and is for most practical purposes the same as UTC, but it is easily confused with UK local time, which switches to BST in summer.

You can often count on a particular date library to then correctly convert to any specific format on the fly. This may also make it easier to keep up to date with country-specific changes in date/time, because that happens.


The military reference to Zulu time refers to UTC.


(Technically speaking, Universal Time (UT) contains a number of reference definitions. The most interesting is UT1 and its practical approximation, UTC.)

Used by...

📃 These are primarily notes, intended to be a collection of useful fragments, that will probably never be complete in any sense.


HTTP

HTTP has historically allowed three formats:

RFC 822/1123 style
RFC 850/1036 style
and asctime style.

It seems HTTP1.1 restricted that to 822/1123 (it seems to add two details, though), but be prepared to parse/accept 850/1036.
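Being liberal in what you accept can be as simple as trying each format in turn. A sketch in Python, where parse_http_date is a made-up helper, the default C/English locale is assumed, and all three styles are taken to mean GMT:

```python
from datetime import datetime, timezone

HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",   # RFC 822/1123 style
    "%A, %d-%b-%y %H:%M:%S GMT",   # RFC 850/1036 style, with dashes
    "%A, %d %b %y %H:%M:%S GMT",   # RFC 850/1036 style, without dashes
    "%a %b %d %H:%M:%S %Y",        # asctime style
)

def parse_http_date(text):
    """Try the known styles in order and return an aware UTC datetime."""
    for fmt in HTTP_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognised HTTP date: {text!r}")
```

Note that the two-digit year in the 850/1036 style is interpreted by strptime's own pivot rule (69 to 99 become 19xx), which may or may not be what you want.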


MIME is also primarily 822/1123, though a stroll through spam will reveal dozens of types of abuse.


.NET seems to use the longest form of ISO8601

Some mentioned standards


  • RFC 3339 'Date and Time on the Internet: Timestamps'


  • RFC 822, 'Standard For The Format Of ARPA Internet Text Messages'
  • RFC 1123, 'Requirements for Internet Hosts - Application and Support'


  • RFC 850, 'Standard for Interchange of USENET Messages'
  • RFC 1036, 'Standard for Interchange of USENET Messages'


Date serialization/storage formats

Text

Human-readable, some are also easily computer-parsed, and some are unambiguous timezonewise (ISO8601 is probably best if you want all of that).

See e.g. #Common date formatting.

Unix time

  • Counting seconds elapsed since the defined epoch, namely 00:00:00 on January 1, 1970 (UTC), not counting leap seconds
    • Initially a signed 32-bit number, which will overflow in 2038 (and extends back to late 1901)
    • modern systems are moving to 64-bit, which is epoch ± 292 billion years
    • http://en.wikipedia.org/wiki/Unix_time
    • note that using float means resolution varies with actual time. The resolution becomes coarser than a second long before the value range runs out (read up on storing integers in floats). If you do this, do it in 64-bit floats (order of a hundred million years before resolution is coarser than a second(verify))
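The float resolution point can be checked directly. A sketch in Python (math.ulp needs Python 3.9 or later; as_float32 is a made-up helper that round-trips a value through single precision):

```python
import math
import struct

def as_float32(value):
    """Round-trip a value through IEEE 754 single precision."""
    return struct.unpack("f", struct.pack("f", value))[0]

now_ish = 1_700_000_001                  # a Unix time in late 2023
step64 = math.ulp(float(now_ish))        # spacing of 64-bit floats here: 2**-22 s, about 0.24 microseconds
lost = as_float32(now_ish)               # 1700000000.0, single precision moves in 128 s steps here
limit_years = 2**53 / (365.25 * 86400)   # about 285 million years until 64-bit steps exceed 1 s
```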


FILETIME

Windows FILETIME is a 64-bit int representing 100-nanosecond steps since 1601-01-01T00:00:00Z

So basically

filetime = (unixtime * 10000000) + 116444736000000000
unixtime = filetime/10000000. - 11644473600.


Where

those constants are the number of 100-nanosecond intervals (first line) and the number of seconds (second line) between 1601-01-01 and 1970-01-01
you want to consider the details of floats and int64
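One way to sidestep the float question is to stay in integers throughout, which is easy in Python because its ints do not overflow. A sketch with made-up function names:

```python
TICKS_PER_SECOND = 10_000_000         # FILETIME counts 100 ns ticks
EPOCH_DIFF_SECONDS = 11_644_473_600   # seconds from 1601-01-01 to 1970-01-01

def unix_to_filetime(unix_seconds):
    """Whole Unix seconds to FILETIME ticks."""
    return (unix_seconds + EPOCH_DIFF_SECONDS) * TICKS_PER_SECOND

def filetime_to_unix(filetime):
    """Return (whole Unix seconds, remaining 100 ns ticks) without going through floats."""
    seconds, ticks = divmod(filetime, TICKS_PER_SECOND)
    return seconds - EPOCH_DIFF_SECONDS, ticks
```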

Semi-sorted

  • Microsoft FILETIME
    • a 64-bit value counting 100-nanosecond intervals since January 1, 1601, UTC.


  • time value in UUIDs
    • a 60-bit time value counting 100-nanosecond intervals since 15 October 1582, midnight, UTC (the date of Gregorian reform)
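For the UUID case, Python's uuid module exposes that 60-bit value as the .time attribute, so converting it is a matter of adding it to the 1582 epoch. A sketch, where uuid1_datetime is a made-up helper and the example UUID is hand-built so that the result is known:

```python
import uuid
from datetime import datetime, timedelta, timezone

UUID_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)

def uuid1_datetime(u):
    """Timestamp embedded in a version 1 UUID, as an aware UTC datetime."""
    if u.version != 1:
        raise ValueError("only version 1 UUIDs carry this timestamp")
    # u.time counts 100 ns intervals; ten of them make one microsecond
    return UUID_EPOCH + timedelta(microseconds=u.time // 10)

example = uuid.UUID("13814000-1dd2-11b2-8000-000000000000")
when = uuid1_datetime(example)   # the Unix epoch, 1970-01-01 00:00:00 UTC
```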


strftime


strftime is a C function that takes

  • an output buffer and its size
  • a format string
  • a const struct tm * (which contains date and time)

and writes that time into the buffer, formatted in the specified way, e.g. "%Y-%m-%d" or "%H:%M:%S", choosing from:

Code  Meaning                                                                 Example
%a    Abbreviated weekday name                                                Sun
%A    Full weekday name                                                       Sunday
%b    Abbreviated month name                                                  Mar
%B    Full month name                                                         March
%c    Date and time representation                                            Sun Aug 19 02:56:02 2012
%d    Day of the month (01-31)                                                19
%H    Hour in 24h format (00-23)                                              14
%I    Hour in 12h format (01-12)                                              05
%j    Day of the year (001-366)                                               231
%m    Month as a decimal number (01-12)                                       08
%M    Minute (00-59)                                                          55
%p    AM or PM designation                                                    PM
%S    Second (00-61)                                                          02
%U    Week number with the first Sunday as the first day of week one (00-53)  33
%w    Weekday as a decimal number with Sunday as 0 (0-6)                      4
%W    Week number with the first Monday as the first day of week one (00-53)  34
%x    Date representation                                                     08/19/12
%X    Time representation                                                     02:50:06
%y    Year, last two digits (00-99)                                           01
%Y    Year                                                                    2012
%Z    Timezone name or abbreviation                                           CDT
%%    A % sign                                                                %
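A few of these in use, via Python's datetime.strftime, which accepts the same codes. Only locale-independent codes are shown, so the results do not depend on system settings:

```python
from datetime import datetime

moment = datetime(2012, 8, 19, 14, 55, 2)

iso_date = moment.strftime("%Y-%m-%d")     # '2012-08-19'
clock = moment.strftime("%H:%M:%S")        # '14:55:02'
day_of_year = moment.strftime("%j")        # '232' (2012 is a leap year)
weekday_number = moment.strftime("%w")     # '0', a Sunday
percent = moment.strftime("100%%")         # '100%'
```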


Other languages have the same-named function, but

  • take a different sort of value
  • have additional format defined


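e.g. Python has a similar function that takes a datetime. The %-x variants below come from the platform's C library (a glibc extension, so not portable, e.g. not on Windows), while %f is Python's own addition: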

%-d	Day of the month as a decimal number (not zero padded like %d)
%-j	Day of the year as a decimal number.        1..366
%-m	Month as a decimal number                   1..12
%-y	Year without century as a decimal number    0..99
%-H	Hour (24-hour clock) as a decimal number    0..23
%-I	Hour (12-hour clock) as a decimal number    1..12
%-M	Minute as a decimal number                  0..59
%-S	Second as a decimal number                  0..59

%f	Microsecond as a decimal number, zero-padded on the left.	000000..999999
%z	UTC offset in the form +HHMM or -HHMM.	 


while PHP has a similar function (though deprecated in favour of date()) that takes a timestamp and adds:

%e Day of the month, with a space preceding single digits. Not implemented as described on Windows. See below for more information. 	1 to 31
%u ISO-8601 numeric representation of the day of the week	
   1 (for Monday) through 7 (for Sunday)
%V ISO-8601:1988 week number of the given year, 
   starting with the first week of the year with at least 4 weekdays, 
   with Monday being the start of the week	01 through 53 
   (where 53 accounts for an overlapping week)
%h Abbreviated month name, based on the locale (an alias of %b)
   Jan through Dec
%C Two digit representation of the century (year divided by 100, truncated to an integer)	
   19 for the 20th Century
%g Two digit representation of the year going by ISO-8601:1988 standards (see %V)	
   e.g. 09 for the week of January 6, 2009
%G The full four-digit version of %g	
   e.g. 2009 for January 3, 2010
%k Hour in 24-hour format, with a space preceding single digits	
   0 through 23
%l (lower-case 'L')	Hour in 12-hour format, with a space preceding single digits	
   1 through 12
%P lower-case 'am' or 'pm' based on the given time	
   e.g. am for 00:31, pm for 22:23. Not supported by all Operating Systems.
%r Same as "%I:%M:%S %p"	Example: 09:34:17 PM for 21:34:17
%R Same as "%H:%M"	Example: 00:35 for 12:35 AM, 16:44 for 4:44 PM
%T Same as "%H:%M:%S"	Example: 21:34:17 for 09:34:17 PM
%z The time zone offset. Not implemented as described on Windows. 
   Example: -0500 for US Eastern Time
%D Same as "%m/%d/%y"	Example: 02/05/09 for February 5, 2009
%F Same as "%Y-%m-%d" (commonly used in database datestamps)	
   e.g. 2009-02-05 for February 5, 2009
%s Unix Epoch Time timestamp (same as the time() function)	
   e.g. 305815200 for September 10, 1979 08:40:00 AM
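The ISO week-numbering year (%G, %g) and week (%V) are the ones most easily gotten wrong, because around New Year they can differ from the calendar year. A quick way to check specific dates, sketched in Python with date.isocalendar(), which returns (ISO year, ISO week, ISO weekday):

```python
from datetime import date

early = tuple(date(2009, 1, 3).isocalendar())      # (2009, 1, 6): Saturday of ISO week 1 of 2009
late = tuple(date(2010, 1, 3).isocalendar())       # (2009, 53, 7): still ISO year 2009
previous = tuple(date(2008, 12, 29).isocalendar()) # (2009, 1, 1): already ISO year 2009
```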

Unsorted

In some variants of strftime, %Z outputs your current timezone. This may not be what you want.

If you want to output time with a timezone (e.g. for short-term cookies), it's often easiest/laziest to find a "current time in GMT" function and hardcode 'GMT' into the string.

Incorrect assumptions about time

https://infiniteundo.com/post/25326999628/falsehoods-programmers-believe-about-time

https://infiniteundo.com/post/25509354022/more-falsehoods-programmers-believe-about-time

http://www.creativedeletion.com/2015/01/28/falsehoods-programmers-date-time-zones.html