Software architecture and systems design: a big picture and resource guide

Hello colleagues.

Today we offer you a translation of an article by Tugberk Ugurlu, who set out to describe the principles of designing modern software systems in a relatively short text. Here is what the author says about himself in brief:

It is absolutely impossible to fully cover a topic as colossal as architectural patterns plus design patterns as they stand in 2019, so we recommend not only Mr. Ugurlu's text but also the numerous links that he kindly placed in it. If you like it, we will publish a more specialized text on the design of distributed systems.

Photo by Isaac Smith on Unsplash

If you have never had to design a software system from scratch, it is sometimes not even clear where to begin. I believe that you first need to outline the boundaries, so that you can more or less confidently picture what exactly you are going to design, and then roll up your sleeves and work within those boundaries. As a starting point, you can take a product or service (ideally one that you really like) and figure out how it is implemented. You might be amazed at how simple the product looks and how much complexity it actually hides. Do not forget: simple is usually difficult, and this is normal.

I think the best advice I can give to anyone starting to design a system is this: do not make any assumptions! From the very beginning, you need to pin down the known facts about the system and the expectations placed on it. Here are some good questions that can help you get started:

What is the problem we are trying to solve?
What is the peak number of users who will interact with our system?
What data read and write patterns will we have?
What are the expected failures, and how are we going to deal with them?
What are the expectations for consistency and system availability?
Are there any external audit and regulatory requirements that we have to take into account?
What types of confidential data are we going to store?

These are just a few of the questions that have served both me and the teams I have worked with over the years. Once you know the answers to these questions (and to any others relevant to your context), you can gradually delve into the technical details of the task.

Set the baseline

What do I mean here by “baseline”? These days, most problems in the software industry can be "solved" with existing methods and technologies. Accordingly, knowing your way around this landscape gives you a head start when you face problems that someone has already had to solve before you. Do not forget that programs are written to solve the problems of businesses and users, so we strive to solve each problem in the most straightforward and simple way (from the user's point of view). Why is this worth remembering? Perhaps you are inclined to look for a unique solution to every problem, thinking, “what kind of programmer am I if I follow patterns everywhere?” In fact, the art here lies in deciding where to do what. Of course, each of us has to face unique problems from time to time, and each of them is a real challenge. However, if our baseline is clearly defined, we know what to spend our energy on: looking for ready-made solutions to the problem in front of us, or studying it further to understand it more deeply.

I hope I have managed to convince you that a solid understanding of how some of the great software systems out there are architected is indispensable for mastering the craft of an architect and building a firm foundation in this field.

OK, so where do you start? Donne Martin has a GitHub repository called system-design-primer, from which you can learn how to design large-scale systems and also prepare for interviews on this topic. The repository has a section with examples of real-world architectures, which examines, among other things, how well-known companies such as Twitter and Uber approach the design of their systems.

However, before moving on to this material, let's take a closer look at the most important architectural challenges we face in practice. This matters because a stubborn, multifaceted problem has MANY aspects that you need to clarify and then solve within the constraints of the given system. Jackson Gabbard, a former Facebook employee, recorded a 50-minute video about system design interviews, in which he shares his experience of interviewing hundreds of candidates. Although the video focuses on the design of large systems and on the success criteria that matter when evaluating a candidate for such a position, it is nevertheless a comprehensive resource on what matters most when designing systems. I also offer a summary of this video.

Gain knowledge about data storage and retrieval

As a rule, your decision about how to store and serve your data in the long term critically affects system performance. Therefore, you must first understand the expected write and read characteristics of your system. Then you need to be able to evaluate these characteristics and make a choice based on that evaluation. However, you can do this effectively only if you understand the existing data storage patterns. In essence, this means having solid knowledge about choosing a database.

Databases can be viewed as data structures that are exceptionally scalable and durable. Therefore, knowledge of data structures should be very useful when you choose a particular database. For example, Redis is a data structure server that supports various kinds of values. It lets you work with data structures such as lists and sets and read data using well-known algorithms such as LRU, all in a durable and highly available fashion.
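To make the LRU idea mentioned above concrete, here is a minimal in-memory sketch of a least-recently-used cache. It is a generic illustration and does not reflect how Redis implements eviction internally; the class name `LRUCache` and its methods are hypothetical names chosen for this example.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny in-memory LRU cache (illustrative sketch, not Redis internals)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # OrderedDict keeps insertion order; we treat the end as "most recent".
        self._items = OrderedDict()

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._items:
            return None
        # A read counts as a use, so move the key to the "recent" end.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        """Insert or update a value, evicting the least recently used key."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # popitem(last=False) removes the oldest (least recently used) entry.
            self._items.popitem(last=False)
```

With a capacity of two, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "a" was used more recently.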

Photo by Samuel Zeller on Unsplash

Once you have a good understanding of the various data storage patterns, move on to data consistency and availability. First of all, you need to learn the CAP theorem, at least in general terms, and then refine this knowledge by studying the established consistency and availability patterns in more detail. This will broaden your horizons in this area and help you see that reading and writing data are actually two very different problems, each with its own challenges. Armed with a few consistency and availability patterns, you can significantly increase system performance while ensuring that data keeps flowing to your applications without interruption.

Finally, to conclude the discussion of data storage, caching should also be mentioned. Should it run on both the client and the server? What data will be in your cache, and why? How will you organize cache invalidation? Will it happen at regular intervals? If so, how often? I recommend starting on these topics with the corresponding section of the aforementioned system-design-primer.
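One simple answer to the invalidation questions above is time-based expiry. The sketch below assumes a single-process, in-memory cache where every entry lives for a fixed number of seconds; `TTLCache`, its methods, and the injectable `clock` parameter are hypothetical names introduced for this illustration only.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after a fixed time to live."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock  # injectable so that tests can control time
        self._items = {}     # key -> (value, expires_at)

    def set(self, key, value):
        """Store a value and stamp it with its expiry time."""
        self._items[key] = (value, self._clock() + self.ttl)

    def get(self, key):
        """Return the value if present and still fresh, otherwise None."""
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Lazy invalidation: stale entries are dropped when they are read.
            del self._items[key]
            return None
        return value

    def invalidate(self, key):
        """Explicit invalidation, e.g. right after the underlying data changes."""
        self._items.pop(key, None)
```

The choice of `ttl_seconds` is exactly the "how often?" trade-off: a short TTL keeps data fresher at the cost of more trips to the underlying store.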

Communication patterns

Systems consist of various components; these can be different processes running on the same physical node, or different machines operating in different parts of your network. Some of these resources may be private to your network, while others must be public and open to consumers accessing them from outside.

These resources need to communicate with each other, and the system as a whole needs to exchange information with the outside world. In the context of system design, this again confronts us with a new set of unique challenges. It helps to understand how asynchronous workflows can be useful and what communication patterns are available.
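As a small illustration of an asynchronous workflow, the sketch below decouples a producer from several workers through a queue, using Python's standard asyncio library within a single process. In a real system the queue would typically be a separate message broker; the function names and the "double the number" job are assumptions made purely for the example.

```python
import asyncio


async def producer(queue, jobs):
    """Enqueue jobs without waiting for them to be processed."""
    for job in jobs:
        await queue.put(job)


async def worker(name, queue, results):
    """Pull jobs off the queue forever; cancelled once all work is done."""
    while True:
        job = await queue.get()
        try:
            await asyncio.sleep(0)  # stand-in for real I/O (network, disk)
            results.append((name, job * 2))
        finally:
            queue.task_done()


async def run_pipeline(jobs, worker_count=2):
    """Run the producer and workers, returning (worker name, result) pairs."""
    queue = asyncio.Queue()
    results = []
    workers = [
        asyncio.create_task(worker(f"worker-{i}", queue, results))
        for i in range(worker_count)
    ]
    await producer(queue, jobs)
    await queue.join()  # wait until every enqueued job has been processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results
```

Calling `asyncio.run(run_pipeline([1, 2, 3, 4]))` returns the doubled values paired with the worker that handled each one; the order of results is not guaranteed, which is typical of asynchronous processing.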

Photo by Tony Stoddard on Unsplash

When organizing communication with the outside world, security is always very important; it must be taken seriously and actively pursued.

Connection Distribution

I’m not sure that everyone will find it justified to give this topic a section of its own. Nevertheless, I will lay out the concept here, and I believe that the term “connection distribution” describes the material in this section most accurately.

Systems are formed by correctly connecting many components, whose communication with each other is often based on established protocols such as TCP and UDP. However, these protocols alone are often not enough to meet all the needs of modern systems, which frequently operate under high load and depend heavily on user demand. It is often necessary to find ways to distribute connections in order to cope with such high loads.

This distribution starts with the well-known Domain Name System (DNS). DNS translates a domain name into an IP address, and some DNS services also support routing methods, for example, weighted round robin and latency-based routing, which help to distribute the load.
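To give a feel for weighted round robin, here is a minimal sketch of its "smooth" variant, which spreads picks for heavier targets evenly instead of sending them in bursts. It is not taken from any particular DNS service; the function names are hypothetical, the weights are assumed to be positive integers, and the addresses come from the range reserved for documentation.

```python
from itertools import islice


def weighted_round_robin(weights):
    """Yield targets forever, in proportion to their positive integer weights."""
    if not weights or any(weight <= 0 for weight in weights.values()):
        raise ValueError("weights must be a non-empty mapping of positive numbers")
    total = sum(weights.values())
    current = {target: 0 for target in weights}
    while True:
        # Every target earns credit equal to its weight on each round...
        for target, weight in weights.items():
            current[target] += weight
        # ...and the target with the most credit is chosen and pays it back.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        yield chosen


def first_picks(weights, count):
    """Convenience helper: the first `count` targets chosen for `weights`."""
    return list(islice(weighted_round_robin(weights), count))


# Hypothetical targets: the first should receive three of every four requests.
example_weights = {"192.0.2.10": 3, "192.0.2.11": 1}
```

With these weights, `first_picks(example_weights, 8)` selects the first address six times and the second address twice.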

Load balancing is fundamentally important, and almost every large system on the Internet that we deal with today sits behind one or more load balancers. Load balancers help you distribute client requests across the available instances. They can be either hardware or software; in practice, you will mostly deal with software ones, for example, HAProxy and ELB. Reverse proxies are conceptually very similar to load balancers, although there are a number of distinct differences between the two. These differences must be taken into account when you design the system for your needs.

You should also be aware of content delivery networks (CDNs). A CDN is a globally distributed network of proxy servers that serves content from the nodes located geographically closest to a given user. CDNs are the preferred option for static files such as JavaScript, CSS, and HTML. In addition, cloud services that act as traffic managers, for example, Azure Traffic Manager, are popular today; they give you global distribution and reduced latency when working with dynamic content. However, such services are usually useful when you work with stateless web services.

Let's talk about business logic: structuring business logic, workflows, and components

So, we have discussed various infrastructure aspects of the system. Most likely, the user never even thinks about these elements of your system and, frankly, does not care about them at all. The user cares about how to interact with your system, what can be achieved by doing so, how the system executes user commands, and what it does with user data.

As the title of this article implies, I intended to talk about software architecture and system design. Accordingly, I did not plan to cover software design patterns, which describe how software components are built. However, the more I think about it, the more it seems to me that the line between software design patterns and architectural patterns is very blurry and that the two concepts are closely related. Take, for example, event sourcing. If you adopt this architectural pattern, it will affect almost every aspect of your system: long-term data storage, the level of consistency adopted in your system, the shape of its components, and so on. So I decided to mention some architectural patterns directly related to business logic. Even though this article has to limit itself to a simple list, I recommend that you familiarize yourself with it and think about the ideas behind these patterns. Here they are:
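Event sourcing, the example used above, can be illustrated with a minimal sketch: instead of storing the current state, the system stores an append-only log of events and derives the state by replaying them. The bank account domain and all names below (`Deposited`, `Withdrawn`, `Account`) are assumptions made for the illustration; a real system would also persist the log durably.

```python
import functools
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def apply_event(balance, event):
    """Pure function: fold one event into the current balance."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise TypeError(f"unknown event: {event!r}")


class Account:
    def __init__(self):
        # The append-only event log is the source of truth, not a balance field.
        self.events = []

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.events.append(Deposited(amount))

    def withdraw(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        if amount > self.balance():
            raise ValueError("insufficient funds")
        self.events.append(Withdrawn(amount))

    def balance(self):
        """Current state is derived by replaying every event from the start."""
        return functools.reduce(apply_event, self.events, 0)
```

Even in this toy form the effect on the rest of the design is visible: storage becomes a log rather than a mutable record, and every read either replays events or relies on a separately maintained view.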

Collaborative approaches

It is extremely unlikely that you will be the only participant on a project responsible for the system design process. On the contrary, you will most likely have to interact with colleagues working both within your area and beyond it. In this case, you may need to evaluate the chosen technical solutions together with your colleagues, identify the business needs, and work out how best to parallelize the tasks.

Photo by Kaleidico on Unsplash

First of all, you need to develop an accurate and universally shared understanding of the business goal you are trying to achieve and of the moving parts you will have to deal with. Group modeling techniques, in particular event storming, can significantly speed up this process and increase your chances of success. You can do this work before or after you outline the boundaries of your services, and then deepen it as the product matures. Based on the alignment reached here, you can also formulate a ubiquitous language for the bounded context in which you work. When you need to communicate the architecture of your system, you can use the C4 model proposed by Simon Brown, especially when you need to decide how much detail to go into when visualizing the things you want to convey.

There is probably another mature technique in this area that is no less useful than domain-driven design. However, one way or another, we come back to understanding the domain, so knowledge and experience in domain-driven design should be useful to you.

Source: https://habr.com/ru/post/461283/
