Python notes - date and time: Difference between revisions

From Helpful
{{#addbodyclass:tag_tech}}
{{#addbodyclass:tag_prog}}
{{Pythonstuff}}


=Timezones=
<!--
To start with the warning: timezones are a mess. Minimize your exposure.

It is a fairly sane approach to
* store times in UTC
* when presenting them, not do any conversion until the last moment.

This won't save you from legal changes making a mess - nothing can.
But when confusion happens, it should help minimize it, because UTC itself is not subject to legal changes
and daylight saving does not apply to it. Leap seconds are still a thing, but there's talk of getting rid of them.

Timezones are often indicated by their offset from UTC.
But that's deceptive when you have daylight saving as well.
Say, just-offset timezones will fail to calculate ''now() - 180 days'' correctly.

Timezones are also a mess because they (and daylight saving) are changed by enacting laws.
This happens more often than you might think,
and such a law only makes it into computer localization later.

Python functions will generally let you ask "what is local time?" and "what is UTC time?", assuming you set your timezone correctly.{{verify}}

A datetime object can exist
* without timezone information, a.k.a. timezone-naive
* with timezone information, a.k.a. timezone-aware

The two are incompatible at a ''concept'' level, so they are also incompatible at a python object level,
unless you force an "um I guess treat that as... UTC? Or whatever timezone my computer has been set to?"

A timezone-aware datetime can also be converted to a timezone-aware datetime in another timezone.
-->
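
To make the naive/aware split concrete, here is a minimal sketch using only the standard library. The dates are made-up example values, and the "treat that as UTC" step is an assumption you make explicitly, not something python guesses for you.

```python
from datetime import datetime, timezone

naive = datetime(2024, 5, 5, 23, 5, 28)                       # no tzinfo
aware = datetime(2024, 5, 5, 21, 5, 28, tzinfo=timezone.utc)  # explicit UTC

print(naive.tzinfo)   # None
print(aware.tzinfo)   # UTC

# Ordering comparisons and subtraction between naive and aware raise TypeError
try:
    naive < aware
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

# The explicit "treat that as UTC" step: attach a tzinfo without shifting the clock fields
assumed_utc = naive.replace(tzinfo=timezone.utc)
difference = assumed_utc - aware   # both aware now, so this works
print(difference)                  # 2:00:00
```

Note that replace(tzinfo=...) only labels the value. It does not convert anything, so it is only correct if the naive value really was in that timezone.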
==python's timezone model==
<!--
A basic-python timezone-aware datetime amounts to having an offset from UTC.
datetime.tzinfo is the abstract class, datetime.timezone implements it (as a fixed offset).

In Python's time zone model, a .tzinfo is effectively a set of rules to evaluate the time zone.
This is a pretty solid choice, in the sense that if you model it as time zone ''offsets'',
then "datetime plus half a year" cannot be expected to work, because of daylight saving.

By default, you work without timezone information:
datetime.datetime.now().tzinfo is None

For example,

from datetime import datetime, timedelta
from dateutil import tz
amsterdam = tz.gettz('Europe/Amsterdam')
print( datetime.now() )
print( datetime.now( amsterdam ) )

might print

2024-05-05 23:05:28.237711
2024-05-05 23:05:28.238091+02:00

The first is naive, and local to the computer you're working on.
When you add a timezone to that, you're just asking for the same moment expressed in that timezone, with the offset attached.

However, as e.g. https://blog.ganssle.io/articles/2018/03/pytz-fastest-footgun.html
points out, pytz has its own, similar-but-different model,
which makes it ''really'' easy to make mistakes.
-->
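
As a rough illustration of why a bare offset is not a full timezone, the following sketch uses datetime.timezone with a fixed +02:00 offset. The name "CEST" is just a label chosen for the example, and the date is made up.

```python
from datetime import datetime, timedelta, timezone

cest = timezone(timedelta(hours=2), "CEST")   # a fixed +02:00 offset, nothing more
dt = datetime(2024, 5, 5, 23, 5, 28, tzinfo=cest)

as_utc = dt.astimezone(timezone.utc)          # same moment, expressed in UTC
print(dt.isoformat())       # 2024-05-05T23:05:28+02:00
print(as_utc.isoformat())   # 2024-05-05T21:05:28+00:00

# A fixed offset has no daylight saving rules: 180 days later it still claims +02:00,
# while clocks in e.g. Amsterdam would be at +01:00 by then.
half_year_later = dt + timedelta(days=180)
print(half_year_later.utcoffset())   # 2:00:00
```

A rule-based tzinfo (such as the ones dateutil's tz.gettz gives you) is what you would want when that difference matters.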


=Unsorted=

Latest revision as of 23:58, 5 May 2024



Conversions

Some code I've copy-pasted more than once:


from seconds-since-unix-epoch

# to '''datetime object'''
datetime.datetime.fromtimestamp( value ) # int or float
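
Without further arguments, fromtimestamp gives a naive datetime in the computer's local timezone. A small sketch of the difference with asking for UTC explicitly (the timestamp is an arbitrary example value):

```python
from datetime import datetime, timezone

value = 1385731280   # seconds since the unix epoch; arbitrary example value

local_naive = datetime.fromtimestamp(value)                 # naive, in the computer's local timezone
utc_aware = datetime.fromtimestamp(value, tz=timezone.utc)  # aware, in UTC

print(local_naive.tzinfo)      # None
print(utc_aware.isoformat())   # 2013-11-29T13:21:20+00:00
```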

from elapsed seconds

# to timedelta
td = datetime.timedelta(seconds=int_or_float)

from timedelta

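### to time difference in seconds
# py>=2.7 added total_seconds(), called on a timedelta instance (td as above)
seconds = td.total_seconds()
# py2.6 and earlier (dividing by the float 1e6 avoids py2 integer division dropping the fraction):
seconds = (td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 1e6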
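
A quick check that the two agree, and a reminder that td.seconds is only one field of the timedelta, not the total (example values are made up):

```python
from datetime import timedelta

td = timedelta(days=1, seconds=30, microseconds=500000)

seconds = td.total_seconds()
manual = (td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 1e6

print(seconds)      # 86430.5
print(manual)       # 86430.5
print(td.seconds)   # 30, the seconds field only (the day is in td.days)
```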

from datetime

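WARNING: strftime's %z only outputs an offset for timezone-aware datetimes. For a naive datetime it is an empty string, so the result carries no timezone.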

# to ISO8601-like string
isostr = dtval.strftime('%Y-%m-%dT%H:%M:%S%z')

### to unix timestamp (float)
# some variations. If you care about microseconds:
# (timetuple() drops tzinfo and mktime() assumes local time, so this is for naive local datetimes)
time.mktime(dtval.timetuple()) + (1e-6)*dtval.microsecond
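
Two other variations, as a sketch: on py>=3.3 a datetime has a timestamp() method, and for a naive datetime that you know holds UTC, calendar.timegm is the UTC counterpart of time.mktime. The values are made up for the example.

```python
import calendar
from datetime import datetime, timezone

# aware datetime: timestamp() uses its tzinfo
aware = datetime(2013, 11, 29, 13, 21, 20, 250000, tzinfo=timezone.utc)
ts = aware.timestamp()
print(ts)   # 1385731280.25

# naive datetime that is known to be in UTC (an assumption the code cannot check)
naive_utc = datetime(2013, 11, 29, 13, 21, 20, 250000)
ts2 = calendar.timegm(naive_utc.timetuple()) + 1e-6 * naive_utc.microsecond
```

Calling timestamp() on a naive datetime treats it as local time, much like the mktime variant.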

from seconds-since-epoch (a.k.a. unix time, a.k.a. "what time.time() gives")

# to datetime
dt = datetime.datetime.fromtimestamp(int_or_float)


# to ISO8601 style string
isostr = datetime.datetime.fromtimestamp(int_or_float).strftime('%Y-%m-%dT%H:%M:%S%z')
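
If you want the timezone offset in the output, one option is to ask for an aware datetime and use isoformat(). A sketch, assuming py>=3.6 for the timespec argument and an arbitrary example timestamp:

```python
from datetime import datetime, timezone

int_or_float = 1385731280.5   # arbitrary example value

dt = datetime.fromtimestamp(int_or_float, tz=timezone.utc)

iso_full = dt.isoformat()
iso_seconds = dt.isoformat(timespec='seconds')
via_strftime = dt.strftime('%Y-%m-%dT%H:%M:%S%z')

print(iso_full)       # 2013-11-29T13:21:20.500000+00:00
print(iso_seconds)    # 2013-11-29T13:21:20+00:00
print(via_strftime)   # 2013-11-29T13:21:20+0000
```

Note that %z writes the offset without a colon, while isoformat() includes one.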


From ISO8601 string

### to datetime
# easier and better than the below, but you have to install/include it
dateutil.parser.parse(s)

# quick and dirty independent hack
import datetime

def iso8601_dt_hack(s):
    ''' Massages a string for consumption by strptime,
        which is a "parse this string precisely according to this format string" thing.

        We strip fractional seconds and timezone (if present) to make it easier to deal with,
        so the result is only sensible within the same timezone.

        If you want to deal with timezones correctly,
        or want to deal with the compact format at all,
        then you probably want to use dateutil. '''
    d, t = s.split('T', 1)
    # each cut removes a timezone suffix from the time part, if that form is present
    if '-' in t:
        t = t[:t.index('-')]
    if 'Z' in t:
        t = t[:t.index('Z')]
    if '+' in t:
        t = t[:t.index('+')]
    if '.' in t: # the fraction precedes the timezone, so this cut alone would also remove it. Separate because I could add fractional-sec handling later
        t = t[:t.index('.')]
    return datetime.datetime.strptime('%sT%s' % (d, t),
                                      "%Y-%m-%dT%H:%M:%S")

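On py>=3.7 there is also a standard-library option, datetime.fromisoformat, which keeps the offset. A sketch with made-up input strings; before py3.11 it only accepts roughly what isoformat() itself writes, so not the Z suffix and not the compact format.

```python
from datetime import datetime

aware = datetime.fromisoformat('2015-06-26T23:00:41+02:00')
naive = datetime.fromisoformat('2015-06-26T23:00:41')

print(repr(naive))         # datetime.datetime(2015, 6, 26, 23, 0, 41)
print(aware.utcoffset())   # 2:00:00
print(naive.tzinfo)        # None
```
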
from standardish strings, and less standard things

If a specific set of data has dates formatted in a different but completely consistent way, then standard functions like strptime can help (it comes from C, but is also exposed in python and other higher-level languages because it's quite useful).

Consider:

dt = datetime.datetime.strptime( dt_txt, "%Y-%m-%d-%H-%M-%S")
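
A slightly fuller sketch of the same call, with a made-up input string, showing that strptime is strict about the format and that strftime with the same format string reverses it:

```python
from datetime import datetime

fmt = "%Y-%m-%d-%H-%M-%S"
dt_txt = "2015-06-26-23-00-41"   # made-up example in that format

dt = datetime.strptime(dt_txt, fmt)
print(dt)   # 2015-06-26 23:00:41

# anything that does not match the format exactly raises ValueError
try:
    datetime.strptime("2015-06-26 23:00:41", fmt)
    strict = False
except ValueError:
    strict = True

# formatting with the same format string gives back the original text
round_trip = dt.strftime(fmt)
```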


dateutil

When you may get structured-but-not-necessarily-standard strings, or varied free-form strings, then dateutil is nice, as a fallback or in general (it also has various date-based logic you may not want to write yourself).

You probably want the dateutil.parser.parse() function.

For example:

>>> dateutil.parser.parse('2015-06-26 23:00:41')
datetime.datetime(2015, 6, 26, 23, 0, 41)

>>> dateutil.parser.parse("Thu Sep 25 2003")
datetime.datetime(2003, 9, 25, 0, 0)

It seems to match on known patterns, so it will work on many commonish/standardish things.

It doesn't like seeing parts it doesn't understand, apparently being cautious. It will be more permissive with fuzzy=True (e.g. it groks apache's date format only with fuzziness, apparently because of the unexpected : between date and time).

On ambiguous dates like 02-04-2012 you may have to guide it (e.g. with the dayfirst argument), see e.g. [1]

from apache log time

Apache uses date-time-with-timezone format like:

29/Nov/2013:14:21:20 +0100

dateutil.parser.parse with fuzzy=True will deal with this.

You can also do this with a dozen lines of string manipulation, which may be slightly faster(verify) (TODO: add that)
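
Not the string manipulation mentioned above, but a standard-library alternative that may be enough: strptime with %z (py>=3.2) can parse this format directly. A sketch, assuming the default C/English locale, since %b matches month abbreviations in the current locale. I have not checked how its speed compares.

```python
from datetime import datetime

apache_time = "29/Nov/2013:14:21:20 +0100"

# %b is the abbreviated month name, %z the +HHMM offset
dt = datetime.strptime(apache_time, "%d/%b/%Y:%H:%M:%S %z")

print(dt.isoformat())   # 2013-11-29T14:21:20+01:00
print(dt.utcoffset())   # 1:00:00
```

The result is timezone-aware, so it can be compared with datetimes from logs written in other timezones.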


Timezones

python's timezone model

Unsorted

See also: