
How to stop thinking about time zones and start living

Does time play an important role in your system? Are your users or components spread across the globe, or at least across our vast country? Then you need time zones. It's actually simple: the hardest part is not getting confused. That is what we will talk about. First you need to learn to think about time correctly. Once you do, everything else will be either self-evident or fairly simple.

Let's start with the clock. We are all used to telling the time by looking at the clock on the wall. When working with time zones, this time is called wall clock time. There is nothing wrong with it in principle, except that at the same moment the clocks in different parts of the globe show different times. If you set your mind to it, you can come up with an algorithm for converting wall clock time in one time zone to wall clock time in another. Usually you just add or subtract the difference in hours between the time zones, except (attention!) around the switch to summer / winter time. Once the switch comes into play, the calculations become really complicated.

We need something simple and bulletproof, like ... an integer. Hence the notion of an instant in time (also called Unix time, POSIX time, or time since the (Unix) epoch): the number of seconds (in Java, milliseconds) elapsed since January 1, 1970, 00:00:00 GMT. An instant in time is the same across the globe: if you imagine that someone pressed “pause” and the flow of time stopped, the number corresponding to that instant would be the same everywhere on Earth, regardless of the time zone. If someone had pressed pause an hour after the new year of 1970 arrived in Greenwich, the instant in time all over the Earth would read 3,600 (3,600,000 in Java's milliseconds). And right now, for example, it is 1,280,720,431.859.
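
To make the numbers above tangible, here is a minimal sketch that turns seconds since the epoch into a UTC date-time. The helper name instant_to_utc is made up for this example, and the fractional part of the second number is dropped for simplicity.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch: January 1, 1970, 00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def instant_to_utc(seconds):
    """Turn seconds since the epoch into a UTC date-time (hypothetical helper)."""
    return EPOCH + timedelta(seconds=seconds)

one_hour_in = instant_to_utc(3600)        # an hour after New Year 1970 in Greenwich
written_at = instant_to_utc(1280720431)   # the number from the text, whole seconds only

print(one_hour_in.isoformat())   # 1970-01-01T01:00:00+00:00
print(written_at.isoformat())    # 2010-08-02T03:40:31+00:00
```
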
So, an instant in time is the universal convertible currency of time calculations. It depends only on, hmm, time; instants can be compared (to determine which event happened earlier and which later); and no nonsense related to geographic location, time zones, or clock changes is involved, which drastically increases the reliability of such calculations. This is in fact how time handling is implemented in Java (since version 1.1): java.util.Date is a wrapper around a long holding an instant in time (dates before 1970 correspond to negative values) and is Comparable, while all human-calendar conversions are moved out into the separate classes Calendar and DateFormat.
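
A small sketch of why comparing instants is safer than comparing wall clock readings. The two events and their fixed offsets (UTC+7 and UTC+3) are invented for illustration; only the standard datetime module is used.

```python
from datetime import datetime, timedelta, timezone

# Two hypothetical events, each recorded from a local wall clock.
tz_plus7 = timezone(timedelta(hours=7))
tz_plus3 = timezone(timedelta(hours=3))

event_a = datetime(2010, 8, 2, 10, 0, tzinfo=tz_plus7)  # 10:00 on a UTC+7 wall
event_b = datetime(2010, 8, 2, 8, 0, tzinfo=tz_plus3)   # 08:00 on a UTC+3 wall

# The wall clock readings alone suggest that B came first (08:00 < 10:00)...
wall_says_b_first = event_b.replace(tzinfo=None) < event_a.replace(tzinfo=None)

# ...but the instants disagree: A is 03:00 UTC, B is 05:00 UTC.
instant_a = event_a.timestamp()
instant_b = event_b.timestamp()
a_really_first = instant_a < instant_b

print(wall_says_b_first, a_really_first)  # True True
```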

About conversions. The number 1,280,720,431.859 means little to an ordinary person (although an inquisitive reader can work out when I wrote these lines), so you need to be able to convert an instant in time to wall clock time and, conversely, to parse wall clock time back into an instant in time. These conversions require knowing the time zone, and the calculations are far from trivial. The point is that different countries / territories / places not only have different offsets from GMT; the rules for these offsets have changed several times in the past and continue to change. Summer time is introduced and abolished, time zones are merged (perhaps you have heard about this initiative?). Or take my hometown of Novosibirsk, which was moved from GMT+7 to GMT+6 in the early nineties, and which at the beginning of the century had two time zones: the boundary ran along the Ob River, and the two banks were in different zones. In short, so that nobody goes crazy, all this historical information is carefully maintained in the Olson tz database, named after its founder Arthur David Olson, although its editor is Paul Eggert. In this database every large locality has a code (Novosibirsk, for example, is Asia/Novosibirsk) and a list of all its time zone adventures starting from 1970. The database is used in many (all?) Linux / Unix / BSD systems (I can't speak for Windows), in the Java Runtime Environment (which has, for example, had updates devoted exclusively to refreshing the tz database), and so on; see Wikipedia for details. We will not look at the algorithm for converting time with the help of this database; we will assume it is already available. And in fact it is available almost everywhere.
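
As an illustration of the offset rules changing over time, the sketch below asks the tz database for the Novosibirsk offset in two different years. It assumes Python 3.9+ with the standard-library zoneinfo module and tz data available on the system (or the tzdata package installed); the two dates are arbitrary picks on either side of the early-nineties change.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # standard library since Python 3.9

novosibirsk = ZoneInfo("Asia/Novosibirsk")

# The same wall clock reading in two different years...
before = datetime(1990, 1, 15, 12, 0, tzinfo=novosibirsk)
after = datetime(2000, 1, 15, 12, 0, tzinfo=novosibirsk)

# ...sits at a different distance from GMT, because the rules changed in between.
print(before.utcoffset(), after.utcoffset())
```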

So, we formulate the rules for dealing with time for programs operating in several time zones:
The last item, the one about databases, does not follow from anything said above, so let's look at it separately. When working with databases (I have experience only with Oracle and sqlite3), namely when saving to / reading from a database, a time zone is for some reason required. This means that data can be corrupted on saving / reading. Corrupted how? There is an unpleasant peculiarity of the switch from summer (+1) to winter time (02:00 - 03:00 on October 27, 2002, for example): during the switch, for one hour, every wall clock time corresponds to two instants in time (the clock shows 02:01, 02:02, 02:03, etc. twice, and these are different instants). That is, from the wall clock time 02:30 on October 27, 2002 we cannot unambiguously determine which instant it is, because we do not know whether summer or winter time was in effect. If we take some instant in time and convert it to 02:30 on October 27, 2002, we will definitely not be able to perform the inverse conversion.
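
The ambiguity can be reproduced in a few lines. This is a sketch using the standard-library zoneinfo module (Python 3.9+, tz data required) and the fold attribute of datetime, which selects the first or the second pass of a repeated wall clock hour. The choice of Asia/Novosibirsk as the zone is my assumption for the example.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

zone = ZoneInfo("Asia/Novosibirsk")  # assumed zone for this example

# 02:30 on October 27, 2002 appeared twice on this wall clock.
first_pass = datetime(2002, 10, 27, 2, 30, tzinfo=zone, fold=0)   # still summer time
second_pass = datetime(2002, 10, 27, 2, 30, tzinfo=zone, fold=1)  # already winter time

# Same wall clock reading, two different instants.
first_utc = first_pass.astimezone(timezone.utc)
second_utc = second_pass.astimezone(timezone.utc)
gap = second_utc - first_utc

print(first_utc, second_utc, gap)
```

Without the fold flag (or some other hint) there is no way to tell which of the two instants a stored 02:30 refers to.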

Various solutions to this problem are possible, such as keeping a summer / winter flag in a separate column or storing instants in time in a NUMBER column, but storing date / time in UTC seems the simplest and least radical. The UTC time zone has no daylight saving time, so conversions between wall clock time and instant in time are always unambiguous. Besides letting you safely store every instant in the database, including those that fall within a clock change, this approach also:
  1. imposes discipline (if you forget to specify a time zone somewhere in the conversions, you will immediately see that something is wrong, at least unless you live in UTC);
  2. keeps you from getting confused about dates / times when the data in the database comes from different time zones: in the database, time is always in UTC;
  3. simplifies the code, because when converting time to / from the database you need not think about the time zone: it is always the same.
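
A minimal sketch of the "UTC in the database" approach with sqlite3. The table events, its columns, and the helpers to_db and from_db are hypothetical names made up for this example; the date-time is stored as a naive UTC string and marked as UTC again on the way out.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

def to_db(aware_dt):
    """Convert an aware datetime to a naive UTC string for storage."""
    return aware_dt.astimezone(timezone.utc).replace(tzinfo=None).isoformat(sep=" ")

def from_db(text):
    """Parse the stored naive UTC string and mark it as UTC."""
    return datetime.fromisoformat(text).replace(tzinfo=timezone.utc)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (name TEXT, happened_at TEXT)")  # hypothetical table

# A reading taken from a UTC+7 wall clock.
local = datetime(2010, 8, 2, 10, 40, 31, tzinfo=timezone(timedelta(hours=7)))
conn.execute("INSERT INTO events VALUES (?, ?)", ("demo", to_db(local)))

stored = conn.execute("SELECT happened_at FROM events").fetchone()[0]
restored = from_db(stored)

print(stored)             # 2010-08-02 03:40:31
print(restored == local)  # True: the same instant, whatever wall it is shown on
```
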
Finally, a few words about working with time zones in Python. In Python, dates are handled by the datetime.datetime class, and time zones by the pytz module, which is based on the same Olson tz database. There are no instants in time as such; instead there are two concepts: timezone-aware datetime and naive datetime. The first is, naturally, a date-time with a time zone attached; the second is a bare date-time, wall clock time in its pure form without any indication of where that wall clock hangs. A datetime is stored as a “year, month, day, hour, minute, second, microsecond” structure, plus a tzinfo object for a tz-aware datetime. An instant in time is obtained mainly through time.time(): it is a float and may be limited to something like [1970, 2038], which can easily be too little for some calculations. That is (as far as I understand; perhaps someone will correct me?), datetime implements something like that very algorithm of converting directly from one time zone to another, without instants in time, but in theory it understands dates from year 1 to year 9999.
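
For what it's worth, a tz-aware datetime can be turned back into an instant with standard-library tools. The sketch below uses a fixed UTC+7 offset as a stand-in for a real pytz zone (an assumption made to keep the example dependency-free) and derives the instant via calendar.timegm.

```python
import calendar
from datetime import datetime, timedelta, timezone

# A bare wall clock reading: nothing says where this clock hangs.
naive = datetime(2010, 8, 2, 10, 40, 31)

# The same reading, now known to come from a UTC+7 wall.
aware = naive.replace(tzinfo=timezone(timedelta(hours=7)))

print(naive.tzinfo is None, aware.tzinfo is not None)  # True True

# Whole seconds since the epoch, computed from the UTC fields of the aware value.
instant = calendar.timegm(aware.utctimetuple())
print(instant)  # 1280720431
```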

A naive datetime is converted to a tz-aware one with the localize method:

tzaware_datetime = some_timezone.localize(some_naive_datetime, is_dst=True)
(note the second parameter: it is needed precisely because of the ambiguous conversion), and a tz-aware datetime is moved to a different time zone with astimezone:

another_tzaware_datetime = tzaware_datetime.astimezone(another_tz)

Since all of this is implemented through the same datetime.datetime class and the only difference is the presence of the tzinfo property, you need to be damn careful not to confuse dates that have a time zone with dates that don't. Here Python is worse than Java: in Java, when you print a date, you have to create a DateFormat and specify a time zone whether you like it or not, while in Python many operations, including printing, can be performed on naive dates. Clearly, in any reasonably complex application it is best to make sure that all dates carry a time zone, because if somewhere in the application a date turns out to have none, good luck figuring out what it was supposed to be. With a time zone, dates are compared correctly, printed correctly, and so on. In addition, since only the naive part (year, month, day, hour, minute, second, microsecond) is preserved when writing to / reading from the database, the only sensible way to work with this is to keep a naive representation in UTC in the database.
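
A short sketch of how easily the two kinds of dates coexist, and of the one place where Python does complain. Both values are invented for the example; the flag mixed_comparison_failed is just a name used here to record the outcome.

```python
from datetime import datetime, timezone

naive = datetime(2010, 8, 2, 3, 40, 31)
aware = datetime(2010, 8, 2, 3, 40, 31, tzinfo=timezone.utc)

# Printing works for both; only the aware value shows its offset.
print(str(naive))  # 2010-08-02 03:40:31
print(str(aware))  # 2010-08-02 03:40:31+00:00

try:
    naive < aware
    mixed_comparison_failed = False
except TypeError:
    # Python refuses to order a naive datetime against an aware one.
    mixed_comparison_failed = True

print(mixed_comparison_failed)  # True
```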

Bonuses


Rules for anyone working with calendar dates. Remember that:
  1. not every year has 365 days;
  2. not every day has 24 hours;
  3. fortunately, every hour has 60 minutes;
  4. not every minute has 60 seconds (it may have 59 or 61. The 61st is called a leap second; it is added on either June 30 or December 31, at which point a clock in UTC should show 23:59:60. Adding a 61st second is needed because the Earth's rotation is slowing down. The option of removing a second is provided in case the Earth starts rotating faster, but it has never been needed).
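
The first two rules are easy to check in code. This is a sketch that assumes Python 3.9+ with zoneinfo and tz data available; the zone Asia/Novosibirsk and the date of the clock change are picked for illustration.

```python
import calendar
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Rule 1: leap years have 366 days.
days_in_2000 = 366 if calendar.isleap(2000) else 365
days_in_2002 = 366 if calendar.isleap(2002) else 365

# Rule 2: a day on which the clocks go back lasts 25 hours.
zone = ZoneInfo("Asia/Novosibirsk")
day_start = datetime(2002, 10, 27, tzinfo=zone)
day_end = datetime(2002, 10, 28, tzinfo=zone)

# Convert to UTC first: subtracting two datetimes that share a tzinfo
# compares only their wall clock fields and would report 24 hours.
elapsed = day_end.astimezone(timezone.utc) - day_start.astimezone(timezone.utc)
hours = elapsed.total_seconds() / 3600

print(days_in_2000, days_in_2002, hours)
```
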
GMT is calculated not from the moment the sun actually crosses the meridian, but from a certain mean time of that event. The actual crossing of the meridian may differ from it by up to 16 minutes because of the ellipticity of the Earth's orbit and the tilt of its axis.

Although UTC and GMT are very similar, they still differ slightly. While GMT is defined by solar time at the Royal Observatory in Greenwich, UTC is measured by atomic clocks (a weighted average of about two hundred atomic clocks in seventy laboratories around the world, synchronized via satellites). The difference between GMT and UTC must not exceed 0.9 seconds, and it is precisely leap seconds that compensate for it.

Storing time in a signed 32-bit int on UNIX systems is expected to lead to the year 2038 problem, when 31 bits overflow and subsequent instants in time start to correspond to negative numbers, which will break all comparisons. New 64-bit systems and programs already use 64 bits to store time, but will they manage to completely replace the 32-bit ones by 2038?
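
The overflow is easy to simulate. In this sketch wrap_int32 is a made-up helper that mimics storing a value in a signed 32-bit integer; Python's own integers do not overflow.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1

def wrap_int32(value):
    """Simulate storing a value in a signed 32-bit integer (hypothetical helper)."""
    return (value + 2**31) % 2**32 - 2**31

last_good = EPOCH + timedelta(seconds=INT32_MAX)
next_second = wrap_int32(INT32_MAX + 1)   # wraps around to a negative number
after_overflow = EPOCH + timedelta(seconds=next_second)

print(last_good.isoformat())       # 2038-01-19T03:14:07+00:00
print(next_second)                 # -2147483648
print(after_overflow.isoformat())  # 1901-12-13T20:45:52+00:00
```

One second after the last representable instant, the counter compares as earlier than everything recorded before it.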

Source: https://habr.com/ru/post/100741/

