“Eppur si muove!”* Or Working with Time Zones in Python
On Earth, different places can have different times of day at the same moment. This is a consequence of the fact that our world is a rotating geoid rather than a flat disk, and that our solar system has only one star, the Sun. We all learned about time zones at school, and we have all met them in real life (“Moscow time is 15:00, in Petropavlovsk-Kamchatsky it is midnight”, jet lag after long-haul flights, and so on). Unfortunately, time zones are only partially based on the physical features of our world, and in computations we have to take other, sometimes unexpected, nuances into account.
* “And yet it moves!” is a phrase that Galileo Galilei allegedly uttered as he left his trial before the Inquisition, after recanting his belief that the Earth revolves around the Sun. In our case, alas, this rotation is what leads to all these “wonderful” problems with time zones.
What does this article have in common with Galileo? In general, nothing. I am afraid that even if our world were the center of the universe, we would still have to deal with time zones. Let us treat the title as a misstep of mine that I can no longer fix (although I could).
What is a “Time Zone”?
What is your time zone? If you answer “UTC+3”, that is correct only for the present moment; in general the statement is wrong. If you look at the time zone database you will see, for example, that Berlin and Vienna have different time zones (Europe/Berlin and Europe/Vienna) despite sharing the UTC+1 offset. Why? Because their summer time (DST) rules differed at various points in history. Even if the two countries and the two cities follow the same DST rules today, that was not the case a hundred years ago. For example, Austria and Germany dropped summer time in different years: Austria in 1920 and Germany in 1918. During the Second World War both countries had the same DST rules (which is not surprising), but after it ended they fell out of sync again. Germany abolished daylight saving time in 1949 and reintroduced it in 1979; Austria abolished it in 1948 and reintroduced it in 1980. Worst of all, they did not even agree on the same changeover dates.
And this happens all over the world. For computations, daylight saving time is a huge problem, because we assume that time moves continuously and monotonically. With the switch to and from summer time, every year has one hour that is repeated twice and one hour that is simply skipped. If you write local time to a log, sorting may scramble the order of the log lines.
A quote from the pytz documentation:
So, for example, in the US/Eastern time zone in 2002, when DST ended on October 27, the time 01:30 occurred twice, and when DST started on April 7, the time 02:30 never occurred, because at 02:00 the clocks were moved forward an hour.
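The repeated hour can be illustrated with a small sketch that uses only the standard library. The two fixed offsets (UTC-4 for daylight time, UTC-5 for standard time) are assumptions standing in for US/Eastern around 27 October 2002; a real program would take them from the time zone database.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets standing in for US/Eastern on 27 October 2002.
EDT = timezone(timedelta(hours=-4), "EDT")  # daylight time, before the switch
EST = timezone(timedelta(hours=-5), "EST")  # standard time, after the switch

# The same wall-clock reading, 01:30, happens twice that night.
first = datetime(2002, 10, 27, 1, 30, tzinfo=EDT)
second = datetime(2002, 10, 27, 1, 30, tzinfo=EST)

# Converted to UTC, the two readings are one hour apart.
first_utc = first.astimezone(timezone.utc)
second_utc = second.astimezone(timezone.utc)
print(first_utc.isoformat())   # 2002-10-27T05:30:00+00:00
print(second_utc.isoformat())  # 2002-10-27T06:30:00+00:00

# A log written in local time cannot be sorted back into the true order:
# 01:50 EDT happened before 01:10 EST, but the local readings say otherwise.
events = [
    ("a", datetime(2002, 10, 27, 1, 50, tzinfo=EDT)),
    ("b", datetime(2002, 10, 27, 1, 10, tzinfo=EST)),
]
by_local = [name for name, dt in sorted(events, key=lambda e: e[1].replace(tzinfo=None))]
by_utc = [name for name, dt in sorted(events, key=lambda e: e[1])]
print(by_local)  # ['b', 'a']
print(by_utc)    # ['a', 'b']
```

Sorting by the aware values works because Python compares aware datetimes by their UTC instant, not by their wall-clock fields.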
But a time zone stores more than the daylight saving rules. Some countries change their time zone, sometimes without touching DST at all. In 1915, for example, Warsaw switched to Central European Time. As a result, at midnight on August 5, 1915, the clocks were set back 24 minutes (while summer time was in effect in Warsaw). In general, even worse things go on with time zones. There was at least one country whose offset changed from day to day, because 0:00 was synchronized with the time of sunrise.
Where is common sense?
Common sense does exist, and it is called Coordinated Universal Time (UTC). UTC is a time zone without daylight saving time and without any changes in the past. However, because our Earth is a rotating geoid and there are things in the world that we cannot control, there is the problem of leap seconds. Whether UTC will keep accounting for leap seconds (which are irregular and therefore hard to handle in calculations) or not (in which case every time zone will differ from UTC by a few seconds) has, as far as I know, not been decided yet.
Despite this, UTC is the safest option right now. From UTC you can convert to local time for any time zone. The reverse conversion, as we saw above, is not always possible.
So, here is the main rule of thumb that will never let you down:
Always store and process time in UTC. If you need to keep the original data, store it separately. Never store local time plus a time zone!
What is the problem?
In principle, the article could end here. Unfortunately, there are a couple of things to keep in mind when you program in Python. They are a legacy of design decisions from ancient times, when nobody thought about how the language would be used in practice. Motivation mattered, common sense did not.
One day, the following decisions were made about the architecture of the datetime module in the Python standard library:
The datetime module should not store information about the timezone, because the timezone changes too often.
On the other hand, the datetime module should provide the ability to add information about the timezone (tzinfo).
The following objects must be implemented in the datetime module: date, time, date + time, timedelta.
Unfortunately, something went wrong. The main problem is that a datetime object with time zone information (tzinfo) attached does not interoperate with a datetime object without it:
>>> import pytz, datetime
>>> a = datetime.datetime.utcnow()
>>> b = datetime.datetime.utcnow().replace(tzinfo=pytz.utc)
>>> a < b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't compare offset-naive and offset-aware datetimes
Even if you turn a blind eye to the horrible API for attaching time zone information to a datetime object, problems remain. When you work with datetime objects in Python, sooner or later you end up adding or removing tzinfo all over your program.
Another problem is that Python gives you two ways to create a datetime object with the current time: datetime.datetime.utcnow() and datetime.datetime.now().
One returns the time in UTC, the other returns the local time. However, the datetime object will not tell you what “local time” is (because it carries no time zone information, at least before Python 3.3), and there is no way to find out which of these objects stores UTC.
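A minimal sketch of the two calls, assuming they are datetime.now() and datetime.utcnow(). Note that utcnow() has been deprecated since Python 3.12; the warning filter is there only so the sketch runs quietly on newer interpreters.

```python
import warnings
from datetime import datetime, timezone

local_now = datetime.now()  # local wall-clock time, no tzinfo

# utcnow() is deprecated since Python 3.12; the warning is silenced here
# only so the sketch runs cleanly on newer interpreters.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    utc_now = datetime.utcnow()  # UTC wall-clock time, also no tzinfo

# Both objects are naive: nothing records which one holds UTC.
print(local_now.tzinfo, utc_now.tzinfo)  # None None

# For comparison, an aware "now" carries its offset explicitly.
aware_now = datetime.now(timezone.utc)
print(aware_now.utcoffset())  # 0:00:00
```

Because the two naive values look identical in structure, the only way to tell them apart is a convention in your own code, which is what the rest of the article argues for.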
If you convert a UNIX timestamp to a datetime object, be careful to use the datetime.datetime.utcfromtimestamp method, because the other one, datetime.datetime.fromtimestamp, converts the timestamp to local time.
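As a sketch of the difference: utcfromtimestamp() has been deprecated since Python 3.12, so the example below gets the same naive UTC result through fromtimestamp() with an explicit tz argument. The helper name utc_from_timestamp is hypothetical.

```python
from datetime import datetime, timezone


def utc_from_timestamp(ts):
    """Return a naive datetime in UTC for a UNIX timestamp (hypothetical helper)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).replace(tzinfo=None)


print(utc_from_timestamp(0))              # 1970-01-01 00:00:00
print(utc_from_timestamp(1_000_000_000))  # 2001-09-09 01:46:40

# Without tz=..., fromtimestamp() converts to the machine's local time,
# so its result depends on where the code runs.
local_reading = datetime.fromtimestamp(1_000_000_000)
```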
The datetime module also provides date and time objects, to which it is absolutely pointless to attach tzinfo. A time object cannot be converted to another time zone, because that requires knowing the date. A date object only really makes sense in the local time zone, because “today” for me may be “yesterday” or “tomorrow” for you, thanks to the wonderful world of time zones.
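To make the “today is not the same day everywhere” point concrete, here is a small sketch. The two fixed offsets are assumptions standing in for two hypothetical users on opposite sides of the planet.

```python
from datetime import datetime, timedelta, timezone

# One instant, stored in UTC.
instant = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)

# Assumed fixed offsets for two hypothetical users.
west = timezone(timedelta(hours=-8))
east = timezone(timedelta(hours=12))

# The same instant falls on different calendar dates.
print(instant.astimezone(west).date())  # 2023-12-31
print(instant.astimezone(east).date())  # 2024-01-01
```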
So what are the recommendations of experts?
Now we know who is to blame. But what is to be done? If we ignore the theoretical problems that only show up when working with historical dates, here are a few recommendations. If you do have to work with historical dates, there is an alternative module, mxDateTime, which is reasonably well designed and even supports different calendars (Gregorian and Julian).
Use UTC inside the program
If you need the current time, always use datetime.datetime.utcnow(). If you get local time from a user, convert it to UTC right away. If the conversion is ambiguous, tell the user; do not try to guess blindly. My iPhone has failed several times to switch to summer time and back correctly. I know when the switch has to happen, because I have to reset my analogue clock by hand.
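One way to refuse to guess is sketched below, under stated assumptions. The function local_to_utc is a hypothetical helper that relies on the fold attribute (PEP 495, Python 3.6+): if the two possible readings of a wall-clock time have different UTC offsets, the time is either repeated or skipped. ToyEasternFall2002 is a hypothetical stand-in that models only the single switch on 27 October 2002, so the sketch runs without a time zone database.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class ToyEasternFall2002(tzinfo):
    """Hypothetical stand-in for US/Eastern that models only the switch
    back from daylight time on 27 October 2002 (01:00-02:00 occurs twice)."""

    OVERLAP_START = datetime(2002, 10, 27, 1, 0)
    OVERLAP_END = datetime(2002, 10, 27, 2, 0)

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if wall < self.OVERLAP_START:
            return timedelta(hours=-4)  # still daylight time
        if wall < self.OVERLAP_END:
            # fold=0 is the first 01:xx (daylight), fold=1 the second (standard)
            return timedelta(hours=-5 if dt.fold else -4)
        return timedelta(hours=-5)  # standard time

    def dst(self, dt):
        return self.utcoffset(dt) + timedelta(hours=5)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def local_to_utc(naive_local, tz):
    """Convert naive local time to naive UTC, refusing to guess."""
    first = naive_local.replace(tzinfo=tz, fold=0)
    second = naive_local.replace(tzinfo=tz, fold=1)
    if first.utcoffset() != second.utcoffset():
        raise ValueError(f"{naive_local} is ambiguous or does not exist in this zone")
    return first.astimezone(timezone.utc).replace(tzinfo=None)


tz = ToyEasternFall2002()
print(local_to_utc(datetime(2002, 10, 27, 0, 30), tz))  # 2002-10-27 04:30:00
try:
    local_to_utc(datetime(2002, 10, 27, 1, 30), tz)
except ValueError as exc:
    print("ask the user:", exc)
```

The same check should carry over to any tzinfo implementation that honours fold, such as zoneinfo.ZoneInfo in Python 3.9+. It is not suitable for pytz zone objects, which do not use fold; pytz instead offers tz.localize(dt, is_dst=None), which raises AmbiguousTimeError or NonExistentTimeError.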
Never use time with time zone
It may seem like a good idea to always attach time zone information to datetime objects, but in fact it is much better not to. A good solution is to use datetime objects without tzinfo that hold UTC time. Remember that you cannot compare a time with a time zone to a time without one, just as you cannot mix bytes and unicode in Python 3. Use this API flaw to your advantage.
Inside the program, always use datetime objects without tzinfo with a UTC time.
When you interact with a user, always convert their local time to UTC and back.
Why should you not attach tzinfo to a datetime object? First, because the overwhelming majority of libraries expect tzinfo to be None. Second, always working with tzinfo is a terrible idea, given how awkward its API is. The pytz library provides its own conversion functions, because the conversion API in the standard library is not flexible enough to handle most real-world time zones. If we do not use tzinfo objects, there is a chance that things will change for the better in the future.
Another reason not to use times with a time zone is that a tzinfo object is very specific and depends heavily on its implementation. There is no standard way to pass time zone information (except, perhaps, for UTC) to other languages, over HTTP, and so on. In addition, datetime objects with time zone information often become huge when serialized with the pickle module, or cannot be serialized at all (this depends on the tzinfo implementation).
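For the UTC case, one portable option is an ISO 8601 string with a trailing “Z”. The sketch below is one possible convention, not something the standard library prescribes; the names WIRE_FORMAT, dump_utc and load_utc are hypothetical, and this particular format drops microseconds.

```python
from datetime import datetime

# ISO 8601 with a literal "Z" marking UTC; microseconds are not kept.
WIRE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def dump_utc(naive_utc):
    """Serialize a naive datetime that is known to hold UTC."""
    return naive_utc.strftime(WIRE_FORMAT)


def load_utc(text):
    """Parse the string back into a naive datetime holding UTC."""
    return datetime.strptime(text, WIRE_FORMAT)


stamp = datetime(2002, 10, 27, 5, 30, 0)
wire = dump_utc(stamp)
print(wire)                     # 2002-10-27T05:30:00Z
print(load_utc(wire) == stamp)  # True
```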
Conversion for formatting
If you need to show a time in the user's time zone, take the datetime object holding UTC time, attach the UTC time zone to it, convert it to the user's local time, and format it. Do not use the tzinfo conversion methods, because they do not work correctly; use pytz. Then make the time “naive” again by dropping the time zone from the datetime object you created for formatting, and live happily ever after.
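The steps above can be sketched as follows. To stay runnable without third-party packages, the sketch uses a fixed-offset zone from the standard library as an assumed stand-in for the user's time zone; in a real program that object would come from pytz or another time zone database. The function name format_for_user is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def format_for_user(naive_utc, user_tz, fmt="%Y-%m-%d %H:%M"):
    """Format a naive UTC datetime in the user's time zone."""
    aware_utc = naive_utc.replace(tzinfo=timezone.utc)  # attach UTC
    local = aware_utc.astimezone(user_tz)               # convert to local time
    return local.strftime(fmt)                          # aware value is discarded


# Assumed user zone: a fixed UTC+3 offset.
user_tz = timezone(timedelta(hours=3))
print(format_for_user(datetime(2024, 5, 1, 12, 0), user_tz))  # 2024-05-01 15:00
```

The aware object lives only inside the function, so the rest of the program keeps working with naive UTC values.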