
The MegaFon Card: technical details

In previous posts we have already discussed the MegaFon Card as a financial product and talked about what it offers the end user. Behind a project like this, of course, lies a huge amount of work by a team of professionals. This time we will tell you more about the technical side of the project and about how the software was designed.



Given the scale of the project, we cannot cover every detail in a single post, so we will start with the backend of the RBS (remote banking service) system. The backend's job is to tie all of the specialized systems together under a single logic, to run a variety of background processes and, most importantly, to give users a convenient and functional personal account.

MegaFon chose TYME as the technical partner for the project. Throughout the work, specialists from both companies collaborated closely with the vendors of banking software, billing, payment systems and other suppliers, assembling the individual functional pieces of the future product into a single whole.
“We had tight deadlines and a lot of work. There were no spare days for long discussions, and no right to make a mistake when choosing technologies and approaches. It's great that together with TYME we managed to deliver such a complex and innovative project.”
Yan Kukhalsky, CEO of MegaLabs

In 2013 we launched the “Terminals” project, and since then we have been continuously improving the solution, integrating it with our services and new service providers and adding new features for customers.

TYME has a lot of experience in the fintech industry; they successfully overcame all the difficulties, and together with our technology partners we launched the MegaFon Card. From here on we hand the MegaFon blog over to the TYME team and let them tell the story of their part of the project.

Details of a large project


After completing each large-scale project, we look back and evaluate the work done. In a very short time we managed to launch an extremely complex system that sits at the very heart of a federal-scale financial product.

A few numbers, by way of example:


The true scope of the work can only be assessed once development ends, because the volume of tasks keeps growing while the work is under way. Looking at it now, it is even a little frightening: how did we ever commit to such a volume of work within such a time frame?



November 2015. The project is at the concept stage. In plain language: all we have is a firmly fixed launch date and a rough statement of requirements from the business.

Carving the marble




The opportunity to study the customer without pestering them with constant questions came from well-built relationships and several years of working together.

Here are some of the principles that really helped us:


Of course, this approach carries an inevitable risk for the developer: substantial resources have to be spent on analysis, and on the developer's own initiative. It is suitable only for projects where you are completely confident in your relationship with the customer and well-versed in the industry itself.

Agile and long-term planning


Much has already been said about the merits of Agile methodologies. We will not repeat it here and will instead concentrate on the points that customers usually find harder to accept.


Since we worked with Scrum, the problem was a standard one: the customer wanted a waterfall-style project plan for the coming year, while detailed planning existed only for the next few sprints, for which the team had committed to the tasks described.

You can often find the following recommendation: if you want a long-term plan while working with Agile, create each sprint or iteration as a task in MS Project. The output will look roughly like this:



This recommendation is so generic that it did not take root with us. A plan of this kind does not show the product milestones that matter to the customer and cannot really be used to discuss the long-term project schedule.

A silver bullet was proposed back in 1986 by the American software engineer Barry Boehm: the spiral model of development. Many IT professionals know what it is, but in practice you usually see one of two extremes: either Agile without any long-term plans, or a waterfall with deadlines and budgets that keep changing.

Artemy Lebedev put it well on this subject in “Kovodstvo”.

Thanks to the spiral model, we solved two problems at once:


The project work was organized like this:






As a result, we arrived at the following mapping between the tasks in the MS Project plan for the customer and the tasks in Jira for the development team.



Jira handled the workflow itself (though any convenient Agile tool would do), while the MS Project plan was used to track the overall status of the work and visualize it for the customer.
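
The article does not show how the two plans were kept in sync, so here is only a minimal sketch of how such a mapping could be automated with the jira Python client. The server URL, credentials and sprint names below are placeholders we invented, not the project's real setup.

```python
# A sketch (not the authors' actual tooling) of pulling per-sprint progress out
# of Jira so it can be copied into a high-level MS Project plan.
from jira import JIRA

jira = JIRA(server="https://jira.example.com", basic_auth=("bot", "secret"))

def sprint_summary(sprint_name: str) -> dict:
    """Return done/total issue counts for one sprint."""
    issues = jira.search_issues(f'sprint = "{sprint_name}"', maxResults=500)
    done = sum(1 for i in issues if i.fields.status.name in ("Done", "Closed"))
    return {"sprint": sprint_name, "total": len(issues), "done": done}

if __name__ == "__main__":
    for name in ("RBS Sprint 14", "RBS Sprint 15"):  # hypothetical sprint names
        s = sprint_summary(name)
        # Each line corresponds to one summary task in the MS Project plan.
        print(f'{s["sprint"]}: {s["done"]}/{s["total"]} issues done')
```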

Macro effect from microservices


The project turned into a set of closely related subprojects. We could afford that because from the very beginning we had chosen an approach that allows this kind of decomposition.

Microservices have been a trend in the development world for quite some time, and the topic generates plenty of discussion at the relevant conferences. Some people fundamentally deny the benefits of building systems this way; others take the exact opposite position and migrate all of their complex systems to a microservice architecture.

Microservice architecture is an approach to development in which, instead of building one large application split into layers (GUI, business logic, database), you create many small, isolated components called microservices. Without going into theoretical details that are easy to find on Habr, I would like to dwell on how this approach turned out to be useful in our project.

Here are the main advantages of this approach (in our opinion):

  1. Each service solves a specific set of tasks and exposes an API through which it is accessed, so responsibility is cleanly isolated within a single service (a minimal sketch of such a service is shown after this list).

  2. A single microservice can easily be replaced with a new version or quickly and safely refactored.

  3. A microservice can be scaled horizontally when it becomes a performance bottleneck. This is a killer feature for systems that have to run 24/7. Scaling goes hand in hand with monitoring the response time of each service: we collect these statistics and decide when to launch additional instances.

  4. Corporate networks require us to operate inside a closed security perimeter, yet part of our platform needs Internet access, while other services are isolated and talk to dozens of external systems within separate subnets. We defined segments that face the public Internet, internal services, and integration services that live in a special zone with the most restricted access. With a monolith we would have had to combine several networks on a single server, something the information-security staff rarely appreciate.
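
To illustrate point 1, here is a minimal sketch of an isolated service with its own narrow API and a health endpoint. The framework (Flask) and the endpoint names are our own assumptions for illustration, not the actual stack of the MegaFon Card backend.

```python
# Minimal illustration of an isolated service with a narrow API.
from flask import Flask, jsonify

app = Flask("card-balance-service")

# In a real system this would query the service's own datastore.
_BALANCES = {"79261234567": 1500.00}

@app.get("/api/v1/balance/<msisdn>")
def balance(msisdn: str):
    """The only business operation this service is responsible for."""
    if msisdn not in _BALANCES:
        return jsonify(error="unknown subscriber"), 404
    return jsonify(msisdn=msisdn, balance=_BALANCES[msisdn])

@app.get("/health")
def health():
    """Used by monitoring and by the rolling-update procedure."""
    return jsonify(status="ok")

if __name__ == "__main__":
    app.run(port=8081)
```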

Of course, some difficulties could not be avoided:

  1. The hardest decisions concern the boundaries of a microservice: you constantly have to answer, without mistakes, which of the existing microservices should own a given task. At the start we had to duplicate some logic across several microservices in order to keep them isolated. For a developer this feels a little unusual, since code reuse is normally the goal.

  2. A radically different approach to updating the application. Without automation the administrator's life quickly becomes complicated, because delivering an update to the production environment would require several times more operations within the maintenance window.

Splitting the system into a set of microservices + a small team + Scrum development: that is the recipe that helped us reduce dependencies, make the most of our capabilities and competencies, and develop in several directions at once while minimizing the impact of each service on the rest of the system.

Be in the middle of the action


Our system not only implements the business logic that presents banking data to the client, it is also an integration bus connecting the front end with all the external systems involved.

If our backend fails, the user automatically sees a failure. If any other system fails, we have a chance to mitigate the problem and keep most of the functionality working.
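
The article does not describe the mitigation mechanism itself, so here is only a sketch of the idea: if an adjacent system is down or slow, the response degrades gracefully instead of failing outright. The bonus-points service and its URL are invented names, not real project components.

```python
# Our own illustration of graceful degradation: the card balance still renders
# even if a hypothetical bonus-points service times out.
import requests

BONUS_URL = "https://bonus.internal.example/points"  # hypothetical endpoint

def get_dashboard(msisdn: str, balance: float) -> dict:
    dashboard = {"msisdn": msisdn, "balance": balance, "bonus_points": None}
    try:
        resp = requests.get(BONUS_URL, params={"msisdn": msisdn}, timeout=0.5)
        resp.raise_for_status()
        dashboard["bonus_points"] = resp.json()["points"]
    except requests.RequestException:
        # The external system is unavailable: mark it and return a partial
        # dashboard instead of failing the whole request.
        dashboard["bonus_unavailable"] = True
    return dashboard
```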

This means our system must have reliability and performance characteristics that exceed those of any adjacent platform:


We understood from the outset that our system would become the main hub from which incident diagnosis starts, so we had to have comprehensive information available for any technical investigation.

These inputs dictated additional requirements for the development:


Refresh in 90 seconds


When working with a platform built on microservices, it is important to understand:


At the same time, we could not afford to introduce maintenance windows for updates: in our case even a small amount of downtime is extremely undesirable.

By combining the efforts of the operations team and the developers, we arrived at a solution with all the necessary properties:





There are still points in the upgrade procedure that can be improved, for example updating each service's database and keeping the old and new versions of a service backward compatible at the same time.
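
Since the article does not show the procedure itself, here is only a rough sketch of a zero-downtime rolling update, assuming each service runs as several instances behind a load balancer with a /health endpoint. The hosts, the lb-ctl command and the deploy script are invented for illustration.

```python
# Our own sketch: update instances one at a time, checking health before
# returning each one to the load-balancer rotation.
import subprocess
import time
import requests

INSTANCES = ["card-api-1:8081", "card-api-2:8081"]  # hypothetical hosts

def wait_healthy(instance: str, timeout: int = 90) -> bool:
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            if requests.get(f"http://{instance}/health", timeout=2).ok:
                return True
        except requests.RequestException:
            pass
        time.sleep(2)
    return False

for instance in INSTANCES:
    host = instance.split(":")[0]
    subprocess.run(["lb-ctl", "drain", instance], check=True)        # stop new traffic
    subprocess.run(["ssh", host, "deploy-new-version.sh"], check=True)
    if not wait_healthy(instance):
        raise SystemExit(f"{instance} did not become healthy, aborting rollout")
    subprocess.run(["lb-ctl", "enable", instance], check=True)       # back into rotation
```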

If there is interest, we will gladly publish a separate article about the approaches we use.

Quality philosophy


A service that must never stop and yet keeps evolving is always caught between two fires. Any downtime means enormous financial and reputational losses; add the extremely tight deadlines and the requirements that kept changing during development, and the complexity only grew. The quality department therefore faced an ambitious task: to organize product testing so that it was:





Python was chosen as the basis for the autotests, together with the py.test framework. Python is a fast (in terms of development speed) and powerful language, and py.test, with its excellent fixture and parametrization machinery, is a flexible tool that lets you reuse test code extensively.

Results are aggregated by a TeamCity build server with plugins installed to interpret the py.test results.

The tests themselves are written to be as isolated as possible: the outcome of one test does not depend on the results of the others. If a test needs data in the system database, a fixture puts that data there before the test runs. If a value in the cache can affect the test, another fixture resets the cache first. Building a systematic grid of fixtures took considerable time, but the investment quickly paid off in how fast new tests could be added and, most importantly, in how stable their results were. Full control over the system under test means a minimum of false positives.
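
Here is a minimal sketch of that fixture style. The db_session, cache_client and api_client fixtures, and the payments API they expose, are hypothetical helpers assumed to be defined elsewhere (for example in a conftest.py), not the project's real interfaces.

```python
# A sketch of isolated tests built on a grid of fixtures.
import pytest

@pytest.fixture
def clean_cache(cache_client):
    """Make sure no stale cached value can influence the test."""
    cache_client.flushall()
    yield cache_client

@pytest.fixture
def subscriber(db_session):
    """Seed the system database with the data this test depends on."""
    row = {"msisdn": "79261234567", "balance": 1000}
    db_session.execute(
        "INSERT INTO accounts (msisdn, balance) VALUES (:msisdn, :balance)", row
    )
    db_session.commit()
    yield row
    db_session.execute("DELETE FROM accounts WHERE msisdn = :msisdn", row)
    db_session.commit()

def test_payment_reduces_balance(api_client, subscriber, clean_cache):
    api_client.pay(msisdn=subscriber["msisdn"], amount=100)
    assert api_client.balance(subscriber["msisdn"]) == 900
```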

Integration with the TeamCity build server made it possible to check all the platform's processes with the press of a single button. No preparation is required, so any member of the team can do it, and the test report is displayed in a detailed and clear way in the build server's web interface.

We have not regretted completely abandoning specialized tools for API test automation. Yes, such tools offer a set of functionality right out of the box, but firstly they are not cheap, and secondly we needed more flexibility anyway.

For example, some of our API test cases require receiving the SMS confirmation code for an operation, feeding it back into the test and observing the system's behaviour. This is where coded tests show their strength, even if they are more expensive to develop than, say, assembling test steps in SoapUI.
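
A sketch of that scenario, continuing the fixture style above: the sms_gateway_stub fixture (which captures messages the platform sends in the test environment) and the transfer API calls are hypothetical names we introduce only for illustration.

```python
# Our own illustration of feeding an SMS confirmation code back into a test.
import re

def test_transfer_requires_sms_confirmation(api_client, subscriber, sms_gateway_stub):
    # Start an operation that the platform protects with an SMS code.
    operation = api_client.start_transfer(
        msisdn=subscriber["msisdn"], to_card="4111111111111111", amount=500
    )

    # Pull the code out of the message captured by the stubbed SMS gateway.
    sms_text = sms_gateway_stub.last_message(subscriber["msisdn"])
    code = re.search(r"\b(\d{4,6})\b", sms_text).group(1)

    # Feed the code back and check that the operation completes.
    result = api_client.confirm_transfer(operation_id=operation["id"], code=code)
    assert result["status"] == "completed"
```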

As a result, the process is now structured so that testers use Postman or SoapUI only at the initial verification stage. In the end every process must still be covered by Python autotests committed to the shared repository. That is the law.

No change to the system's functionality ships without testing in general and autotests in particular; a story is simply not considered complete until it is covered by autotests. This approach demands strong self-discipline from the team and real throughput from the testers, but the result is worth it: if the tests on the build machine are green, we are confident in the quality of our system.

The number of functional tests now exceeds 2,500 and continues to grow; a full run takes 25 minutes.

The right choice of tools and full autotest coverage from the early stages of development allowed us to keep implementing new functionality at a high pace throughout the project, stay flexible in the face of changing requirements, and avoid sacrificing product quality.

One more thing


Documentation is an aspect of IT projects that often remains in the shadow of stories about architecture, design and communication with the customer. Yet it is a very important detail: neglect it, and operating and evolving even the most remarkably well-organized system becomes difficult.

We chose an approach in which the documentation evolves along with the development cycle:


This approach gave several advantages at once:




Most importantly, at any moment we can provide the customer with documentation for both the current version of the platform and the planned changes, so that they are ready for the updates.

Thanks to a couple of days spent on general editing and on setting up an export from Confluence, the final document covering all components of the system can now be produced within an hour.
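
The article does not describe the export setup, so here is only a minimal sketch of how such an export could be scripted with the atlassian-python-api client. The Confluence URL, space key and page titles are placeholders, not the project's real documentation structure.

```python
# Our own sketch of assembling one deliverable document from Confluence pages.
from atlassian import Confluence

confluence = Confluence(url="https://confluence.example.com",
                        username="doc-bot", password="secret")

SPACE = "RBS"  # hypothetical space key
COMPONENT_PAGES = ["Card API", "Payments Service", "Integration Bus"]

with open("platform-docs.html", "w", encoding="utf-8") as out:
    for title in COMPONENT_PAGES:
        page = confluence.get_page_by_title(SPACE, title)
        full = confluence.get_page_by_id(page["id"], expand="body.storage")
        # Concatenate the storage-format bodies into a single document.
        out.write(f"<h1>{title}</h1>\n")
        out.write(full["body"]["storage"]["value"])
```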

Instead of a conclusion


We have tried to describe the general principles of how the two companies worked together and to give a look inside the development kitchen.

We would love to hear your tough questions; they could form the basis of new articles!



Thank you, guys, for such a detailed story! This is only the first part covering the technical side of the MegaFon Card; there are more stories to come. Stay tuned.

Source: https://habr.com/ru/post/317740/

