What should you consider when designing a system so that it does not hurt later?

This article describes problems in database design, and to some extent in the design of the whole application, that become harder and harder to solve as a project grows. These are the points worth considering at the design stage rather than afterwards. Or, if you do think about them afterwards, let it be over a cup of tea with the words “Remember how we decided to do this from the very start? Look how much time we saved!” and not with the feeling of a toothache and a painful wince at every memory. As the system and the number of users grow, the database design becomes harder to change, and each change becomes larger in scope and more time-consuming.

Many successful projects today grew out of small startups that later achieved commercial success and became large international companies. This path to growth has opened up over the last 20 years, mainly thanks to the Internet and its effect of “erasing borders”: web and mobile applications can now be used in any country. Previously, an application meant to be international was usually designed with that requirement from the outset. Of course, you can take an evolutionary approach and add the necessary features and scaling as the project grows. But to make later changes easier, some basic functions that are hard to change afterwards should be designed with scale in mind from the start.

I have worked on two startup projects that took off and grew from small regional projects into large companies with millions of users, and both are now high-load systems. To my surprise, they shared many problems, even though the applications were written by different teams for different users. The databases showed the same startup legacy: growing pains which reveal that the project was originally planned to be small.

There is no need to build an international-scale project right away, but it is important to lay down the basic functionality that will make it easier, should the need arise, to turn the project into an international high-load system.

So, the main mistakes:

Lack of localization

Our world became global long ago, and software applications no longer have borders. When a project becomes popular not just in one country but around the world, the ability to localize it is very important. When laying the foundation for localization during development, what matters most is choosing the right data types for information that varies from country to country. Without such a foundation, localizing an application that is already in active use will be difficult. There are four main points to consider when localizing.
- Time Zones
  
  Store dates and times either in UTC or together with the client's time zone. Remember as well that every server-side date must be converted to the client's time zone when displayed. This sounds obvious to the point of being funny, but on one of the projects, when we suggested to the owner that we build in support for other regions, he said he did not plan that kind of growth. Then the growth came, and with it the hair-pulling over the fact that the support had not been built in from the start.
- Different languages
  
  For text fields, use a Unicode type such as NVARCHAR; this will make later work on the application easier. Pay attention to the sorting and comparison rules in your database: do not simply accept the default collation, but choose one that correctly compares accented letters, ideographic characters, and characters of non-standard width, given that end users can enter anything. You may say your application will only ever serve the USA or only Europe. Then ask yourself whether there is any data the end user can type in, even just a comment field or a note to themselves. If there is, store strings in Unicode. The world has long been mixed together, and you have no guarantee that a Japanese-speaking user will not enter something into your database. The ultimate goal is for user data to be stored correctly and displayed correctly afterwards.
- Currency
  
  If your database stores monetary amounts, decide which currency they are stored in and set up a mechanism that lets you add another currency without changing the application. This does not mean implementing full functionality such as exchange rates, their loading, and other tools that are excessive at the first stage. It means storing the currency alongside each amount in the table, so that when a user's country changes, amounts in different currencies can still be told apart. Remember too that some countries allow transactions in several currencies.
- Natural keys
  
  Virtually every canonical database textbook describes how inefficient artificial (surrogate) keys are and how elegant natural keys are. But the world has not yet reached a level of globalization at which natural keys can be used everywhere: a natural key requires a single classification of the object it identifies, one that stays uniform across the whole world.
  
  For example, on one of the projects we considered using the TIN (taxpayer identification number) as a natural key for counterparties, but abandoned the idea. Words cannot describe how glad we were later, when the company began to grow and work with legal entities outside Russia.
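
The time zone and currency points above can be illustrated with a minimal Python sketch. The `Payment` record, its field names, and the sample values are hypothetical and not taken from any of the projects described; the sketch only shows the idea of storing an instant in UTC, converting it at display time, and keeping the currency code next to the amount.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone, tzinfo
from decimal import Decimal


@dataclass(frozen=True)
class Payment:
    """Hypothetical stored row: instant in UTC, amount tagged with its currency."""
    created_at_utc: datetime
    amount: Decimal
    currency: str  # ISO 4217 code, e.g. "RUB" or "JPY"


def to_client_time(moment_utc: datetime, client_zone: tzinfo) -> datetime:
    """Convert a stored UTC instant to the client's time zone for display."""
    if moment_utc.tzinfo is None:
        raise ValueError("store timezone-aware UTC values, not naive datetimes")
    return moment_utc.astimezone(client_zone)


payment = Payment(
    created_at_utc=datetime(2017, 5, 20, 21, 30, tzinfo=timezone.utc),
    amount=Decimal("1500.00"),
    currency="RUB",
)

# A fixed UTC+9 offset stands in for the client's zone to keep the sketch
# self-contained; a real application would more likely use
# zoneinfo.ZoneInfo("Asia/Tokyo") so that daylight saving rules are handled.
tokyo = timezone(timedelta(hours=9))
local = to_client_time(payment.created_at_utc, tokyo)
print(local.isoformat())  # 2017-05-21T06:30:00+09:00
print(payment.amount, payment.currency)  # 1500.00 RUB
```

Because the currency travels with every amount, adding a second currency later means adding rows with a different code rather than changing the schema.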

Data types

Choosing the right data types is the key to a successfully running application and, consequently, to the developer's health and sound sleep. When choosing the size of a data type, plan for the largest plausible number of users and operations.

For example, for a table that lists only the employees of your own company, a primary key of type INT is acceptable, since the chance of the company growing to 2 billion people is negligible. However, if the table stores the employees of your customers' companies, use a larger type such as BIGINT: four clients with 500 thousand employees each already produce 2 million records, and with a large number of clients such a table may well grow past the roughly 2 billion values (2,147,483,647 for a signed 32-bit INT) that the type can hold. The same rule applies to tables that store user logs and other business operations, whose volume can grow a millionfold as the application grows.
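
The sizing rule can be checked with simple arithmetic. The sketch below is illustrative only: the function name and the tenfold growth margin are assumptions, while the limits are those of signed 32-bit and 64-bit integers.

```python
INT_MAX = 2**31 - 1      # 2,147,483,647: upper bound of a signed 32-bit INT
BIGINT_MAX = 2**63 - 1   # upper bound of a signed 64-bit BIGINT


def smallest_key_type(expected_rows: int, growth_factor: int = 10) -> str:
    """Pick the smallest key type that still fits after the assumed growth.

    growth_factor is an assumed safety margin, not a rule from the article.
    """
    projected = expected_rows * growth_factor
    if projected <= INT_MAX:
        return "INT"
    if projected <= BIGINT_MAX:
        return "BIGINT"
    raise ValueError("projected row count exceeds the BIGINT range")


print(smallest_key_type(5_000))        # INT: employees of one company
print(smallest_key_type(500_000_000))  # BIGINT: employees of many client companies
```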

Synchronous response to user actions

When designing a system, consider which user actions will modify or delete a large number of objects once the system has grown. For example, if a user has 2000 contacts and sends all of them an invitation, will the system send the invitations immediately, or will it tell the user that the request is being processed and update the status to completed once the work is done?

If processing is immediate, server performance problems may appear as the system grows. In addition, when every request is processed synchronously, the system's load depends directly on what users happen to do. Suppose that under normal load the system uses 10% of its resources to handle all requests. When a request arrives to invite 500 thousand users at once, the load can jump to 100%, and the system will no longer process requests from other users properly until it has finished this huge one. The example is somewhat synthetic, but I think the point is clear.

Think about which user actions can be large-scale, and make them asynchronous. In the invitation example, the system does not start sending immediately but creates a new task, “send invitations to users”. Once the task is created, the system sends the invitations in the background and informs the user when the task is complete. For the user's convenience, notifications about what an asynchronous system is doing are also quite important.

Building this capability into the system protects it from overload during large-scale user actions, which matters for corporate systems, where such actions come from your key customers.
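
The invitation example might be sketched as follows. This is a simplified in-process illustration using Python's standard `queue` and `threading` modules; the task layout, the status names, and the `send_invitation` stand-in are assumptions, and a production system would more likely use a persistent job queue.

```python
import queue
import threading

tasks = queue.Queue()          # pending background tasks
statuses: dict[int, str] = {}  # task id -> "queued" / "processing" / "completed"
sent: list[str] = []           # stand-in for the real delivery channel


def send_invitation(contact: str) -> None:
    """Hypothetical sender; here it only records the recipient."""
    sent.append(contact)


def worker() -> None:
    """Process tasks one by one in the background until a None sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:
            tasks.task_done()
            break
        statuses[task["id"]] = "processing"
        for contact in task["contacts"]:
            send_invitation(contact)
        statuses[task["id"]] = "completed"
        tasks.task_done()


def request_invitations(task_id: int, contacts: list[str]) -> str:
    """Create the task and return immediately instead of sending inline."""
    statuses[task_id] = "queued"
    tasks.put({"id": task_id, "contacts": contacts})
    return "Your request is being processed"


thread = threading.Thread(target=worker, daemon=True)
thread.start()

message = request_invitations(1, [f"user{i}@example.com" for i in range(2000)])
print(message)                 # Your request is being processed

tasks.join()                   # the demo waits; a real client would poll the status
print(statuses[1], len(sent))  # completed 2000

tasks.put(None)                # stop the worker
thread.join()
```

The request handler only records the task, so its response time does not depend on how many contacts the user has.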

Accumulation of unnecessary data

At the start of a project, developers usually do not think about how much disk space the system consumes, since disk space is quite affordable. Nobody thinks about saving disk space until the system reaches its first million users. Until then, you can afford the luxury of keeping a log of every user action forever.

For example, it may be important to keep logs of a user's actions for 1-2 weeks in case they contact technical support. After that, the information can be deleted. So it is important to design and put in place a process for deleting old data right away, before too much of it accumulates. If you do not set this process up from the start, you are more likely to end up with a huge stream of new data pouring into the database while the system cannot keep up with deleting the unnecessary data. You need an efficient deletion mechanism that not only outpaces the stream of new data but can also clear the huge backlog from previous years within a reasonable time.

The rule of deleting old unnecessary data should apply to all new features.
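
One possible shape for such a cleanup process is deletion in small batches, so that each transaction stays short. The sketch below uses an in-memory SQLite database from Python's standard library; the `user_log` table, its columns, the 14-day retention period, and the batch size are all assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=14)  # assumed retention period (two weeks)
BATCH_SIZE = 1000               # assumed batch size; tune it for the real workload


def purge_old_logs(conn: sqlite3.Connection, now: datetime,
                   batch_size: int = BATCH_SIZE) -> int:
    """Delete log rows older than the retention period, one batch per transaction."""
    cutoff = (now - RETENTION).isoformat()
    deleted = 0
    while True:
        cur = conn.execute(
            "DELETE FROM user_log WHERE id IN ("
            "SELECT id FROM user_log WHERE created_at < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()
        if cur.rowcount == 0:
            break
        deleted += cur.rowcount
    return deleted


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_log (id INTEGER PRIMARY KEY, created_at TEXT, action TEXT)"
)
conn.execute("CREATE INDEX idx_user_log_created_at ON user_log (created_at)")

now = datetime(2017, 6, 1, tzinfo=timezone.utc)
old = (now - timedelta(days=30)).isoformat()
recent = (now - timedelta(days=1)).isoformat()
conn.executemany(
    "INSERT INTO user_log (created_at, action) VALUES (?, ?)",
    [(old, "login")] * 2500 + [(recent, "login")] * 10,
)
conn.commit()

deleted = purge_old_logs(conn, now)
remaining = conn.execute("SELECT COUNT(*) FROM user_log").fetchone()[0]
print(deleted, remaining)  # 2500 10
```

Timestamps are stored here as ISO 8601 strings in UTC with one consistent format, which is what makes the plain string comparison against the cutoff valid.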

Lack of tests and documentation

Startup projects are usually in a hurry to launch. They are also usually built by a relatively small development team that knows the application's entire business logic by heart and sees no reason to spend time on documentation, or sometimes even on testing. The work follows the principle “let's launch now and write it up later”. But as soon as the system goes live, a huge number of new tasks appear, and no time is left for documentation.

If the project develops successfully, the company starts growing very quickly and hiring new employees. Bringing new developers onto the team without documentation is very difficult and time-consuming. There may come a point when the members of the original team spend all their working time explaining how the system works to newcomers.

We strongly recommend writing documentation from the start. It will pay for itself in full later.

As for tests: as soon as the development team grows, you need to make sure that new features and changes to existing ones do not break the rest of the system's functionality. Tests are the simplest and most effective way to verify that.

Scaling

As the load on the system grows, the system has to be scaled. So during design, think about how the system would scale if the load increased, say, tenfold. A design that was thought through with scaling in mind will make later changes easier as the growing system meets new realities.

It is worth asking yourself and the developers the following questions:
- How much will system performance improve if the hardware capacity is increased?
- Is it possible to run a copy of the system on another server? If not, what must change in the design to make that possible?
- How large a change would be needed to split the system into several parts and rebuild the architecture along SOA or microservice principles?

This does not mean the system must be designed at full scale from the start, but it is important to lay the foundation for scaling it later. Spread a thin layer of straw where you might fall; do not stack sheaves everywhere right away.

So, in the early stages of system design it is important to keep a balance between working for the future and working for the present. The present matters more, because it has concrete requirements and tasks that need to be delivered, usually on a tight schedule. The future may never come, or may turn out quite different from what was expected. If you were asked to build a city bus, there is no need to make it convertible into a spacecraft from day one, because that future may never happen to the city bus. The ideal, therefore, is to lay the foundation for dividing the system later, a foundation that can then be used to split it into services and modules.

Source: https://habr.com/ru/post/329076/
