
Accuracy through inaccuracy: Improving Time Objects

When creating a value object for storing time, I recommend deciding together with the experts in your subject area what accuracy it should be stored with.

When modeling numbers, it is considered good form to specify the accuracy. It does not matter whether the number represents money, size, or weight: round to the agreed decimal place. Rounding makes the data more predictable to process and store, even if the number is only ever displayed to the user.


Unfortunately, this is rarely done for time, and sooner or later the problem makes itself felt. Consider the following code:


    $estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');

    // Assume that today is also 2017-06-21
    $now = new DateTimeImmutable('now');

    if ($now > $estimatedDeliveryDate) {
        echo 'Package is late!';
    } else {
        echo 'Package is on the way.';
    }

You would expect this code to print "Package is on the way." on June 21, because the day is not over yet: the package might, for example, be delivered in the evening.


But that is not what the code does. Since no time part is specified, PHP helpfully fills in zeros, so $estimatedDeliveryDate becomes 2017-06-21 00:00:00.
$now, on the other hand, is calculated as... now. It includes the current time of day, which is most likely not midnight, so it will be something like 2017-06-21 15:33:34, which is later than 2017-06-21 00:00:00.
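
To make the mismatch visible, here is a minimal sketch that prints both values. It assumes a fixed afternoon timestamp in place of 'now' so that the result is reproducible; that timestamp and the $isLate variable are illustrative only.

```php
<?php
// A date-only string gets its time part padded with zeros.
$estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');

// Stand-in for 'now': an assumed afternoon moment on the same day.
$now = new DateTimeImmutable('2017-06-21 15:33:34');

echo $estimatedDeliveryDate->format('Y-m-d H:i:s'), PHP_EOL; // 2017-06-21 00:00:00
echo $now->format('Y-m-d H:i:s'), PHP_EOL;                   // 2017-06-21 15:33:34

// Same calendar day, yet the comparison already reports the package as late.
$isLate = $now > $estimatedDeliveryDate;
var_dump($isLate); // bool(true)
```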


Solution 1


“Oh, that's easy to fix,” many will say, and update the relevant part of the code.


    $estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');
    $estimatedDeliveryDate = $estimatedDeliveryDate->setTime(23, 59);

Cool, we moved the time to just before midnight. But now the time is padded to 23:59:00, so if you run the code during the last 59 seconds of the day, you get the same problem.


“Ugh, fine,” comes the reply.


    $estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');
    $estimatedDeliveryDate = $estimatedDeliveryDate->setTime(23, 59, 59);

Great, now it's fixed.


... until you upgrade to PHP 7.1, which adds microseconds to DateTime objects. Now the problem occurs during the last second of the day. Perhaps working on high-traffic systems has made me biased, but some user or process will definitely hit this. Good luck tracking down that bug. :-/
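
A small sketch of that last-second window, assuming PHP 7.1 or later (where comparisons take microseconds into account) and a hand-picked timestamp half a second before midnight:

```php
<?php
$estimatedDeliveryDate = (new DateTimeImmutable('2017-06-21'))->setTime(23, 59, 59);

// Assumed moment: still June 21, half a second before midnight.
$now = new DateTimeImmutable('2017-06-21 23:59:59.500000');

// 23:59:59.500000 is later than 23:59:59.000000, so the package counts as
// late even though the day is not over yet.
$isLate = $now > $estimatedDeliveryDate;
var_dump($isLate); // bool(true) on PHP 7.1+
```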


OK, let's add microseconds.


    $estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');
    $estimatedDeliveryDate = $estimatedDeliveryDate->modify('23:59:59.999999');

And now it works.


Until we get nanoseconds.


In PHP 7.2.


Okay, okay, we CAN keep shrinking the margin of error until the chance of hitting it becomes unrealistic. But at this point it is clear that the approach itself is wrong: we are chasing an infinitely divisible value, getting closer and closer to a point we can never reach. Let's try a different approach.


Solution 2


Instead of calculating the last moment before our boundary, let's compare against the boundary itself.


    $estimatedDeliveryDate = new DateTimeImmutable('2017-06-21');

    // Instead of the last on-time moment, compute the first moment
    // at which the package counts as late
    $startOfWhenPackageIsLate = $estimatedDeliveryDate->modify('+1 day');

    $now = new DateTimeImmutable('now');

    // The operator changed from > to >=
    if ($now >= $startOfWhenPackageIsLate) {
        echo 'Package is late!';
    } else {
        echo 'Package is on the way';
    }

This option works, and it keeps working throughout the whole day. But the code now looks more complicated. If you do not encapsulate this logic in a value object or something similar, you will definitely miss it somewhere in your application.


Even if you do encapsulate it, only one kind of operation (>=) will be logical and consistent; it does not work for the others. If we add support for equality checks, we will need another data type and then have to juggle the two to get correct results. Heh.


Finally (and maybe this is just me), this solution has the unpleasant smell of a missing domain concept. "Is there a LatePeriodRange? A DeliveryDeadline?" you might ask. "The package was late and... then something happens? The expert never mentioned a deadline, and there does not seem to be one. How is this different from EstimatedDeliveryDate? What then?" The package is not going anywhere. This is just a strange quirk of the logic we built, and now it is stuck in our heads.


This solution is better in that it gives the right answer... but it is still not a very good solution. Let's see what else we can do.


Solution 3


Our goal is to compare two days. If we picture a DateTime object as a set of numbers (year, month, day, hour, minute, second, and so on), then the part up to and including the day works fine. The problems start with the additional components: hour, minute, second. We can argue about clever ways to work around this, but the fact remains that the time component is hurting our checks.


If only the day part matters to us, then why put up with these additional values? Extra hours or minutes should not change the business logic if the only thing that matters is the transition to the next day.


Just throw the extra stuff away.


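    // Reduce both values to day accuracy before comparing
    $estimatedDeliveryDate = day(new DateTimeImmutable('2017-06-21'));
    $now = day(new DateTimeImmutable('now'));

    // The comparison itself is simple again
    if ($now > $estimatedDeliveryDate) {
        echo 'Package is late!';
    } else {
        echo 'Package is on the way.';
    }

    // Crude, but it works; PHP offers no particularly elegant way to do this.
    // The leading "!" zeroes the time fields; without it they would be
    // filled in from the current system time.
    function day(DateTimeImmutable $date)
    {
        return DateTimeImmutable::createFromFormat(
            '!Y-m-d',
            $date->format('Y-m-d')
        );
    }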

This gives us the simple comparison of solution 1 with the accuracy of solution 2. But... it is the ugliest option of the lot, and with this implementation it is very easy to forget to call day().
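
One detail worth checking in any helper like day(): as PHP's documentation describes createFromFormat(), fields missing from the format string are taken from the current system time unless the format starts with "!". The sketch below shows two ways of dropping the time part. truncateToDay() is a hypothetical name used only for this illustration, and the input timestamp is an assumed value.

```php
<?php
$afternoon = new DateTimeImmutable('2017-06-21 15:33:34.123456');

// Option 1: the leading "!" resets every field not named in the format,
// so hours, minutes, seconds and microseconds all become zero.
$viaFormat = DateTimeImmutable::createFromFormat('!Y-m-d', $afternoon->format('Y-m-d'));

// Option 2 (hypothetical helper): zero the time part directly.
// On PHP 7.1+ setTime() also resets microseconds to zero by default.
function truncateToDay(DateTimeImmutable $date): DateTimeImmutable
{
    return $date->setTime(0, 0, 0);
}

echo $viaFormat->format('Y-m-d H:i:s.u'), PHP_EOL;
echo truncateToDay($afternoon)->format('Y-m-d H:i:s.u'), PHP_EOL;
// Both lines print 2017-06-21 00:00:00.000000
```

Either way, the point is that the truncation has to clear every component below the day, not just the ones that are visible when the value is printed.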


This code is easy to turn into an abstraction. And now that the picture of the subject area has cleared up, it is evident that when we talk about delivery, we are talking about a day, not a time. Both of these points make the code a good candidate for encapsulation inside a type.


Encapsulation


In other words, let's make this a value object.


    $estimatedDeliveryDate = EstimatedDeliveryDate::fromString('2017-06-21');
    $today = EstimatedDeliveryDate::today();

    if ($estimatedDeliveryDate->wasBefore($today)) {
        echo 'Package is late!';
    } else {
        echo 'Package is on the way.';
    }

See how nicely that reads. Now let's implement the value object:


    class EstimatedDeliveryDate
    {
        private $day;

        private function __construct(DateTimeInterface $date)
        {
            // The leading "!" zeroes the time fields; without it PHP
            // fills them in from the current system time.
            $this->day = DateTimeImmutable::createFromFormat(
                '!Y-m-d',
                $date->format('Y-m-d')
            );
        }

        public static function fromString(string $date): self
        {
            // Validate the YYYY-MM-DD format here, etc.
            return new static(new DateTimeImmutable($date));
        }

        public static function today(): self
        {
            return new static(new DateTimeImmutable('now'));
        }

        public function wasBefore(EstimatedDeliveryDate $otherDate): bool
        {
            return $this->day < $otherDate->day;
        }
    }

With the class in place, we automatically get a useful constraint: an EstimatedDeliveryDate can only be compared with another EstimatedDeliveryDate, so the accuracy on both sides always matches.


The handling of accuracy lives in one place. Consuming code does not have to deal with it.


It is easy to test.


And you have a great place to centralize time zone handling (not discussed in this article, but important).


One tip: I used the today() method to show that you can have several named constructors. In practice, I would recommend adding a system clock class and getting instances of now from there. That makes unit tests much easier to write. The "real" version would look like this:


    $today = EstimatedDeliveryDate::fromDateTime($this->clock->now());
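
The clock idea could be sketched roughly as follows. The Clock interface, SystemClock and FixedClock are hypothetical names, and the body of fromDateTime() is an assumption, since the text only shows how it is called. The class is trimmed to what the example needs.

```php
<?php
// Hypothetical clock abstraction; all three names are assumptions.
interface Clock
{
    public function now(): DateTimeImmutable;
}

final class SystemClock implements Clock
{
    public function now(): DateTimeImmutable
    {
        return new DateTimeImmutable('now');
    }
}

// Test double that always returns the moment it was constructed with.
final class FixedClock implements Clock
{
    private $moment;

    public function __construct(DateTimeImmutable $moment)
    {
        $this->moment = $moment;
    }

    public function now(): DateTimeImmutable
    {
        return $this->moment;
    }
}

final class EstimatedDeliveryDate
{
    private $day;

    private function __construct(DateTimeInterface $date)
    {
        // "!" zeroes every field that is not part of the format.
        $this->day = DateTimeImmutable::createFromFormat('!Y-m-d', $date->format('Y-m-d'));
    }

    public static function fromString(string $date): self
    {
        return new self(new DateTimeImmutable($date));
    }

    // Assumed implementation of the constructor used in the text.
    public static function fromDateTime(DateTimeInterface $date): self
    {
        return new self($date);
    }

    public function wasBefore(EstimatedDeliveryDate $otherDate): bool
    {
        return $this->day < $otherDate->day;
    }
}

// In a test, pin "now" to the very end of the estimated day.
$clock = new FixedClock(new DateTimeImmutable('2017-06-21 23:59:59.999999'));
$today = EstimatedDeliveryDate::fromDateTime($clock->now());
$estimated = EstimatedDeliveryDate::fromString('2017-06-21');

var_dump($estimated->wasBefore($today)); // bool(false): not late on the day itself
```

Production code would receive a SystemClock, while tests pass a FixedClock, so the boundary cases around midnight can be checked without waiting for midnight.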

Accuracy through inaccuracy


It is important to understand that we eliminated several different kinds of errors by reducing the accuracy of the DateTime we were working with. Had we not done this, we would have had to handle every edge case ourselves and, most likely, something would have gone wrong somewhere.


Reducing the quality of the data to get the right result may seem counterintuitive, but it is actually a more realistic view of the system we are trying to model. Our computers may work in picoseconds, but our business (most likely) does not. Plus, the computer is probably lying anyway.


Perhaps, as developers, we feel it is better to be flexible and future-proof by retaining all possible information. After all, who are you to decide what information to throw away? The truth is that while the information might be worth money in the future, it definitely costs money to maintain until that possible future arrives. This is not only the cost of disk space; it is the cost of problems, people, time and, if something goes wrong, reputation. Sometimes working with the data in its most complete form will pay off, but blindly saving everything you can, just because you can, is often not worth it.


To be clear: I am not recommending that you thoughtlessly throw away the time information you have.


What I do recommend: explicitly choose the accuracy of your timestamps together with subject matter experts. If you get more accuracy than you expect, it can lead to errors and additional complexity. If you get less than the required accuracy, it will also cause problems and break business logic. The important thing is that we define the expected and necessary level of accuracy.


Also, choose the accuracy separately for each use case. Rounding is usually implemented inside the value object, not at the level of the system clock. Some places need nanosecond accuracy, while others may need only the year. Getting the accuracy right will make your code clearer.


It is everywhere


It is worth noting that we have only talked about one specific type of error: a mismatch in the accuracy required for checks. But this advice applies to a much wider range of errors. I will not go into all of them, but I do want to mention my favorite, the “residual” error.


    // Assume today is the 21st, so this gives us the 28th
    $oneWeekFromNow = new DateTimeImmutable('+7 days');

    // Also the 28th, but created explicitly
    $explicitDate = new DateTimeImmutable('2017-06-28');

    // So, are they equal?
    var_dump($oneWeekFromNow == $explicitDate);

No, they are not equal, because $oneWeekFromNow also carries the current time of day, while $explicitDate is set to 00:00:00. Delightful.
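
A reproducible sketch of the same effect, assuming a fixed afternoon timestamp in place of the real current time. The $sameInstant and $sameDay variables are illustrative names.

```php
<?php
// Assumed fixed "now" so the result does not depend on when this runs.
$now = new DateTimeImmutable('2017-06-21 15:33:34');

$oneWeekFromNow = $now->modify('+7 days');           // 2017-06-28 15:33:34
$explicitDate = new DateTimeImmutable('2017-06-28'); // 2017-06-28 00:00:00

// Comparing the raw objects also compares the leftover time of day.
$sameInstant = $oneWeekFromNow == $explicitDate;

// Comparing at day accuracy: drop the time part on both sides first.
$sameDay = $oneWeekFromNow->setTime(0, 0, 0) == $explicitDate->setTime(0, 0, 0);

var_dump($sameInstant); // bool(false)
var_dump($sameDay);     // bool(true)
```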


The examples above were mostly about accuracy when comparing a time against a date, but modeling accuracy applies to any unit of time. Imagine how many scheduling applications need only times, and how many financial applications need to work at the level of quarters.
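
As an illustration of a coarser unit, here is a minimal sketch of a value object that is accurate to the quarter. The Quarter class and its methods are hypothetical, and it assumes calendar quarters (January to March is Q1); a fiscal calendar would need a different mapping.

```php
<?php
// Hypothetical value object accurate to a calendar quarter.
final class Quarter
{
    private $year;
    private $number; // 1 through 4

    private function __construct(int $year, int $number)
    {
        $this->year = $year;
        $this->number = $number;
    }

    public static function fromDateTime(DateTimeInterface $date): self
    {
        $month = (int) $date->format('n'); // 1..12, no leading zero

        // Months 1-3 map to quarter 1, 4-6 to quarter 2, and so on.
        return new self((int) $date->format('Y'), intdiv($month - 1, 3) + 1);
    }

    public function equals(Quarter $other): bool
    {
        return $this->year === $other->year && $this->number === $other->number;
    }

    public function __toString(): string
    {
        return sprintf('%d-Q%d', $this->year, $this->number);
    }
}

$start = Quarter::fromDateTime(new DateTimeImmutable('2017-04-01 00:00:00'));
$end = Quarter::fromDateTime(new DateTimeImmutable('2017-06-30 23:59:59'));

echo $start, PHP_EOL;            // 2017-Q2
var_dump($start->equals($end));  // bool(true)
```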


Once you start looking for this problem, you realize how many time-related bugs can be explained by poorly defined accuracy. They may look like incorrect checks or badly designed boundaries, but when you dig in, a pattern starts to emerge.


In my experience, this class of errors is often missed in testing. Injecting a system clock object is not (yet) common practice, so testing code that uses the current time is a bit harder. Test data often does not come in the same format as the data the system actually produces, which lets these errors slip through.


This is not a problem specific to PHP's DateTime library, either. When I wrote about this last week, Anthony Ferrara mentioned that the accuracy of time in Ruby varies depending on the operating system, while the database library uses a fixed level. Fun to debug.


Working with time is hard. Comparing times is doubly so.


Accuracy level selection


We have said that choosing the accuracy level is important, but not how to choose the right one. As a rule of thumb, use a sufficiently high accuracy for technical timestamps and an explicitly agreed level of accuracy for domain objects.


For logs, event markers, and metrics, choose whatever granularity you like. This data is primarily for technical staff, who often need the extra accuracy when debugging. High accuracy is also likely to be required for system or sequential data.


For business concerns, talk to subject matter experts about the accuracy they need. They can help you balance what is used now against what will be needed in the future. Business logic is often an area where you operate on borrowed knowledge, so reducing complexity is a smart move. Remember, you are not building a true-to-life model, you are building a useful one.


In practice, this leads to different levels of accuracy, even within the same class. Consider this application class:


    class OrderShipped
    {
        // Domain value, accurate to the day
        private $estimatedDeliveryDate;

        // Domain value, accurate to the minute
        private $shippedAt;

        // Event sourcing timestamp, accurate to the microsecond
        private $eventRecordedAt;
    }

If having several levels of accuracy seems strange, remember that these timestamps are used differently. Even if $shippedAt and $eventRecordedAt refer to the same "moment", they belong to completely different parts of the code.


You may come across a business that works with blocks of time you do not expect: quarters, financial calendars, shifts, or a split into morning, afternoon, and evening. There is a lot of interesting modeling to be done with these additional units.


Changing requirements


Another benefit of having this discussion is that if business rules change later and more accuracy is needed than initially agreed, it will be a joint decision, and it will be clearer what to do with the data already accumulated. Ideally, this keeps the change from turning into a purely technical problem.


It might be simple: "Originally we only needed the registration date, but now we need the time as well, to see whether the registration happened before the office closed." A simple solution would be to set the time on existing records to the next business day; a small number of accounts might end up incorrect, but the result would be acceptable for most. Or just use zeros. Or perhaps the company has an additional business rule that after 18:00 the subscription end date is set to tomorrow + 1 year instead of today + 1 year. Discuss this with them. People are more engaged and more supportive of change if they are included in the discussion from the very beginning.
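
The 18:00 rule mentioned above could be sketched as follows. This is only an illustration: subscriptionEndDate() is a hypothetical function, and the choice to treat 18:00 exactly as "after closing" is an assumption that would need to be confirmed with the business.

```php
<?php
// Hypothetical rule: sign-ups at or after 18:00 count as the next day.
function subscriptionEndDate(DateTimeImmutable $signedUpAt): DateTimeImmutable
{
    // Work at day accuracy; the time of day is only used to pick the start day.
    $startDay = $signedUpAt->setTime(0, 0, 0);

    // 'G' is the 24-hour format of the hour without leading zeros.
    if ((int) $signedUpAt->format('G') >= 18) {
        $startDay = $startDay->modify('+1 day');
    }

    return $startDay->modify('+1 year');
}

$beforeClosing = subscriptionEndDate(new DateTimeImmutable('2017-06-21 17:59:00'));
$afterClosing = subscriptionEndDate(new DateTimeImmutable('2017-06-21 18:00:00'));

echo $beforeClosing->format('Y-m-d'), PHP_EOL; // 2018-06-21
echo $afterClosing->format('Y-m-d'), PHP_EOL;  // 2018-06-22
```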


In more complex cases, you may be able to reconstruct the missing values from other data in the system. Perhaps the registration time is stored in logs or metrics. In some cases this will be impossible, and you will have to write new logic to migrate the legacy cases. But you cannot plan for everything and, most likely, you do not know what will change. That's life.


My conclusion about the accuracy of time: use what you need, no more.


Appendix: The Perfect Solution


Looking ahead, I feel there is practical benefit in choosing a fixed accuracy and expressing it with classes. My ideal PHP library for working with time would look like this: a set of abstract classes denoting accuracy, which I extend in my value objects and use when comparing.


    class ExpectedDeliveryDate extends PointPreciseToDate
    {
    }

    class OrderShippedAt extends PointPreciseToMinute
    {
    }

    class EventGenerationTime extends PointPreciseToMicrosecond
    {
    }

By moving the question of accuracy into the class, we take ownership of the decision. You can restrict methods such as setTime() to the required accuracy, round DateInterval values, and do whatever else makes sense when working with time. We encapsulate most of the methods inside the value objects and expose only the ones the domain needs. On top of that, this would encourage people to create value objects themselves. Lots of them. Lots. Of value objects. Yesss.
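
To give a feel for what such a base class might contain, here is a rough sketch of one of them. No such library is described in the text, so everything except the class names is assumed: the constructor, isBefore(), equals(), and the rule that only objects of the same concrete type can be compared.

```php
<?php
// Hypothetical base class; the implementation is an assumption.
abstract class PointPreciseToDate
{
    private $point;

    final public function __construct(DateTimeInterface $moment)
    {
        // "!" zeroes every field that is not part of the format.
        $this->point = DateTimeImmutable::createFromFormat('!Y-m-d', $moment->format('Y-m-d'));
    }

    final public function isBefore(self $other): bool
    {
        $this->assertSameType($other);

        return $this->point < $other->point;
    }

    final public function equals(self $other): bool
    {
        $this->assertSameType($other);

        return $this->point == $other->point;
    }

    // Two domain concepts with the same accuracy are still different types.
    private function assertSameType(self $other): void
    {
        if (get_class($other) !== static::class) {
            throw new InvalidArgumentException('Cannot compare different time types.');
        }
    }
}

final class ExpectedDeliveryDate extends PointPreciseToDate
{
}

$expected = new ExpectedDeliveryDate(new DateTimeImmutable('2017-06-21'));
$today = new ExpectedDeliveryDate(new DateTimeImmutable('2017-06-21 15:33:34'));

var_dump($expected->equals($today));   // bool(true)
var_dump($expected->isBefore($today)); // bool(false)
```

Unlike the >= workaround from solution 2, equality and ordering both behave consistently here, because every instance is truncated to the same accuracy on construction.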


The bonus will be if the library allows you to easily define custom time units.


Has anyone done this already? Does anyone have the time?



Source: https://habr.com/ru/post/335494/

