Badoo's Worldwide Billing Through the Eyes of QA

Hi, Habr! For more than four years I have been doing manual and automated testing of Badoo's billing systems. Badoo billing is one of the most developed (and most complex) in the world, and testing it is often an interesting and unusual task. Today I want to tell you why these systems are so interesting and powerful, what I have learned over the years and why testing billing is not (very) scary. Along the way I will share another batch of interesting stories (yes, I love this work very much). Most of this applies not only to our particular case but to any other complex payment system (and, to be honest, not only to payment systems).

What is our billing? It is the payment processing system of a social network with more than 330 million registered users. We accept payments in every country of the world, support over thirty active payment methods (about a hundred have been implemented over the years) and process about 1500 requests per second. Badoo billing is an independent, dedicated service that works with a dozen different clients (different platforms, different applications). Quite an interesting foundation for building up testing, isn't it?

Test object

So, to begin with, I will briefly describe what exactly we have to test. All our clients (web, mobile applications and some back-end services) communicate with billing through an API. Billing itself runs on a separate cluster in each of our data centers and communicates with various payment systems (it sends payment requests, receives notifications with the results of processing them, and so on). The cluster contains machines that handle requests from clients and payment systems, machines that run CLI scripts (for example, to renew expiring subscriptions), our own server for processing bank card payments, and a database.

Billing developers work on several kinds of tasks:

developing new functionality: new paid services, promotional campaigns, various features for subscribers;
developing new integrations with payment services (you can always find someone with lower commissions or a higher conversion rate);
keeping existing integrations up to date (our partners are evolving too);
fixing bugs (let's admit it, everybody has them!);
optimization and paying down technical debt (you can always make the service a little bit better);
handling technical support requests (we love the trickiest users most of all, the ones who manage to create dozens of subscriptions through different payment services in different countries and then cannot work out how to cancel the ones they do not need).

And all this “goodness” ultimately lands on our small team for testing. Besides tasks that come directly from the billing developers, we also receive tasks from other teams whenever they touch payments in any way: for example, changes and new features in the mobile clients or on the mobile application server.

What exactly are we testing? You can break it all into three categories:

user interfaces: various payment windows (we call them "wizards") on different platforms, settings windows, advertising banners, promotional windows and so on;
admin and configuration tools: price settings, promotional campaigns, experiments and tools for technical support (which we also actively use when testing);
the billing back end: payment processing, service queues and various deferred operations (the most difficult and “juicy” part).

I will try to tell you about all this in order.

User Interfaces

The main part here is the payment wizards. They all perform the same function (finding out from the user how much of a service they want to buy and how they want to pay for it), yet wizards look different on different platforms. That depends primarily on the features of the platform, but also on the various requirements of regulators and on the countless A/B tests that run in our applications.

What can be tested here? A whole sea of things! Each payment method should be displayed correctly for any chosen service option. The list of options should match the expected one, each option should show the price specified in the billing settings, and the price and currency format should follow the convention adopted in the country: for example, $6.49, 125.00 MXN or 17.64BYN.
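A check like this is easy to automate. Below is a minimal sketch, assuming a hypothetical table of per-currency display templates (PRICE_FORMATS) and hypothetical helper names; the real formats live in the billing settings and may differ.

```python
from decimal import Decimal

# Hypothetical display templates, for illustration only.
# In a real system these would come from the billing configuration.
PRICE_FORMATS = {
    "USD": "${amount}",
    "MXN": "{amount} MXN",
    "BYN": "{amount}BYN",
}


def format_price(amount: Decimal, currency: str) -> str:
    """Render a price with two decimals using the currency's template."""
    template = PRICE_FORMATS[currency]
    return template.format(amount=f"{amount:.2f}")


def check_wizard_price(displayed: str, amount: Decimal, currency: str) -> bool:
    """Return True if the text shown in the wizard matches the configured price."""
    return displayed.strip() == format_price(amount, currency)
```

Comparing the full rendered string, rather than only the number, is what catches a correct amount shown in the wrong format.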

Each payment window must be accompanied by detailed terms of service. Each promotional window should also contain everything required, or lead to a next step with a complete description of the conditions (this, by the way, is one of the most frequent problems, and one that is very easy to forget about).

Any user action in such windows should be accompanied by correct messages, not only about successful payments, but also about errors (you must be able to distinguish situations when the user canceled the payment on the partner’s side, and when he actually entered incorrect information).

It would seem that you could compile a set of basic checks for all this and stop there. No such luck! Each country has its own mandatory requirements that must be met in order to do business there. For example, when paying via SMS in Belgium, the short number for the payment must be drawn in large white digits inside a black rectangle (I'm not joking). In France, at one time, every page of the site had to have a button for unsubscribing from an existing subscription, and the unsubscription itself had to happen in one click, without any confirmation steps. In some countries you must state that the price includes taxes, and in others you must show the cost of the service and the additional taxes separately (and never add them up into a single amount).

How do you test such a "zoo" of payment methods? It is impossible to travel to every country in the world for the most honest testing (and I would so like to test payments in Brazil, sob), just as it is impossible to open accounts in every existing payment system. So we have to make do with sandboxes. Some partners, such as bank card aggregators or PayPal, provide their own very convenient sandboxes. Others are less functional: for one of our partners, the sandbox is a screenshot of their usual payment window with a “Pay” button superimposed on it.

In other cases we have to build sandboxes ourselves, emulating the various responses and notifications. But even that does not work everywhere, and sometimes you have to assemble notifications by hand, make some substitutions in the code and send them yourself as HTTPS requests.
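Assembling such a notification by hand might look roughly like this. It is only a sketch: the field names, the X-Signature header, the HMAC-SHA256 signing scheme and the URL are all assumptions made for illustration, since every partner has its own format and protocol.

```python
import hashlib
import hmac
import json
import urllib.request


def build_notification(payload: dict, secret: bytes, url: str) -> urllib.request.Request:
    """Build (but do not send) a signed POST request imitating a partner notification.

    The signing scheme here (HMAC-SHA256 over the JSON body, hex digest in an
    X-Signature header) is hypothetical; a real partner defines its own.
    """
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "X-Signature": signature},
        method="POST",
    )


# Usage against a test environment (hypothetical address, not executed here):
# request = build_notification(
#     {"user_id": 42, "status": "paid"},
#     b"test-secret",
#     "https://billing.example.test/notify",
# )
# urllib.request.urlopen(request)
```

Keeping the building step separate from the sending step makes it possible to inspect or tweak the notification before it goes out.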

Wizards in mobile applications are a headache of their own. Here the user communicates with billing even more indirectly. The application sends a request to the payment system (AppleStore or GoogleWallet, for example), the response is immediately forwarded to the mobile application server, which processes the information and sends a new request to the billing cluster, and the billing response travels all the way back to the payment system. The user experience can break anywhere in this chain! An error bubble may mean that the request never reached billing and the payment was not made, but it may equally mean that everything went fine and the mobile application server simply answered the payment system in a slightly different format from the one it expected. A mess!

And let's not even talk about how awkward Apple and Google’s sandboxes are, especially when trying to test subscriptions.

By the way, the very fact of working with external partners brings plenty of problems with it. Their payment windows can take a long time to open and slow down testing, and they can contain perfectly ordinary bugs (which you, as a self-respecting tester, first blame on your own developers). Anything that requires cooperation on their side (fixes for those same bugs, protocol extensions) also tends to drag on. They can make changes on their own without informing us (which we learn about only from rising error graphs), and they can provide us with incomplete or even incorrect documentation.

Admin panel

An equally important component of billing is completely hidden from our users' eyes. It is everything that lets our management control prices and service availability and launch promotional campaigns, and lets technical support staff identify the causes of user problems (and make sure users are not simply trying to get a service for free) and resolve them in the simplest and safest way. These tools also help us test: reproducing many cases purely through actions in the user interface is either quite difficult or very slow.

Besides checking that things work, you have to spend a lot of time making sure that developers and managers understand the implemented functionality in the same way (we often end up as the go-between who establishes contact and full understanding between them). All the configuration systems are quite complex because of the wide range of possibilities (we always have dozens of A/B tests of designs, payment methods and promotional campaigns running), and the smallest detail can make the system behave quite differently from what management expects. It is our responsibility to make sure that the developer understood the task correctly and that the manager was able to understand the documentation provided (if there is any at all). And of course, after every configuration change it pays to watch the results very closely and to ask several times whether this is really what everyone wanted.

And here I have to boast: one of the tools we developed (and, of course, tested!), for routing payments to the right banks and accounts, won us the prestigious Merchant Spotlight Award.

Back end

And here the fun begins. This is what is hidden from the eyes of ordinary users, what managers do not even want to know about, the place where our developers' most monstrous fantasies come true: the internal logic of payment processing and service provision.

A great many things are tested here.

Calls to partner systems: checking subscription status (sometimes we cannot manage subscriptions on our side, and all that remains is to check that they are still active), requests to renew and cancel subscriptions, and much more.
Processing notifications from partners: we must process every notification correctly (and each partner has its own format and protocol!) and identify the user, the service and all the other parameters so that nothing gets mixed up. Sometimes notifications mean nothing at all: “Look, we still could not debit money from the user!”; sometimes they contradict themselves: “The user canceled the payment :( Oh no, wait, the money has arrived!”; sometimes they are not relevant at all: “Remember that subscription from three years ago? Well, it is still expired!” And we have to come up with the right flow for every possible case.
Provision of services: so that users' orders are not lost when something goes wrong, services are delivered through queues. If something fails, the event is postponed, and every service must reach the user in any case. It is this "in any case" that we have to guarantee when testing.
Renewing subscriptions: if a user is subscribed to certain services, they must receive them on time. We must not charge them earlier (or later) than due, and the amount debited must always be exactly the one they subscribed for. On top of that, we have a lot of different logic for choosing when to renew subscriptions in different countries (sometimes these are our experiments, sometimes regulatory requirements). For example, in some places we charge users only during working hours, in others only on certain days of the week.
Payments with saved details: as in any self-respecting payment system, our users can save the details of their payment method in order to pay faster next time. We have to check that the details are stored securely (for bank cards, for example, we need to comply with PCI DSS), that payments go through, and that cases where the details are no longer valid (for example, the user's card has been blocked) are handled correctly.
And so on and so forth.
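The "in any case" guarantee from the item on provision of services can be illustrated with a small sketch. This is not our real queue implementation; the function name, the in-memory deque and the retry limit are assumptions chosen to show the idea that a failed event is postponed rather than dropped.

```python
from collections import deque
from typing import Callable, Deque, List


def drain_queue(
    events: Deque[str],
    deliver: Callable[[str], None],
    max_passes: int = 10,
) -> List[str]:
    """Try to deliver every event; postpone failures to the back of the queue.

    Returns the events that are still undelivered after max_passes passes,
    so a test can assert that this list is empty.
    """
    for _ in range(max_passes):
        if not events:
            break
        # Process exactly the events present at the start of this pass.
        for _ in range(len(events)):
            event = events.popleft()
            try:
                deliver(event)
            except Exception:
                events.append(event)  # postpone, never drop
    return list(events)
```

A test for such a queue typically injects a delivery function that fails a few times and then checks that nothing is left undelivered.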

The amount of different logic in the server code is just limitless. Each new task turns into an entertaining quest of the form "Understand how it works => Understand how it MUST work => Understand how to make the system work that way." What are the ways to achieve this?

First, you need to read the code. It is almost impossible to test billing as a black box: only with an idea of how the system works can you understand which cases are worth testing. In addition, successful testing very often requires changes to the code: removing calls to aggregators (so that we do not ask them for the status of a non-existent test subscription), replacing the signature verification for notifications (so that the signature does not have to be generated every time) or hardcoding the choice of a specific variant in an A/B test (so as not to register dozens of users until one lands in the right group). Fortunately, we are working hard on testing utilities to simplify these processes.

Secondly, do not be afraid to test things in non-obvious ways. Can't reliably reproduce a case through the interface? You can write a functional test! You can go into the test database by hand and fill in the data you need! You can assemble the partner's notification manually and send it to your own address! The main thing is not to be afraid to go deep into the jungle.

Thirdly, the developer is your friend. A joint debugging session is a fascinating (not always) and team-building (except when you want to strangle the developer) activity. Unexpected behavior is much easier to deal with together. You either understand your own mistake and accept the task, or find the real problem and let the developer go back to finishing the task with some idea of the situation already (or explain the situation to a new developer if the old one has gone on vacation).

Automatic testing

Autotests are a very cool thing. And here, in fact, over-testing is much better than under-testing. All our autotests can be divided into four groups:

unit tests: written by developers while working on a task. In our process, a task is not considered done until it is covered by tests;
integration tests: written by developers (and sometimes testers) at the testing stage to check places that are hard to reproduce. Like unit tests, they still replace parts of the code, but they work with a much wider layer of entities at once;
system Selenium and Calabash tests: test the client as the user sees it. Not perfectly stable and rather slow, but very useful, because they catch problems caused by the tasks of other departments;
system curl tests: quite a new direction. They check that the system as a whole works across thousands of different cases: we request the payment wizards for all services, with all their options, in every country of the world, for every payment method. Testing as it is.
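The case list for checks like the curl tests is essentially a Cartesian product of services, countries and payment methods. Here is a minimal sketch of generating it; the function name, the sample values and the availability rule are hypothetical, and the real tests take this data from the billing configuration.

```python
from itertools import product
from typing import Callable, Iterable, Iterator, Tuple


def wizard_cases(
    services: Iterable[str],
    countries: Iterable[str],
    methods: Iterable[str],
    is_available: Callable[[str, str], bool],
) -> Iterator[Tuple[str, str, str]]:
    """Yield every (service, country, method) combination that should have a wizard.

    is_available(country, method) filters out payment methods that do not
    exist in a given country.
    """
    for service, country, method in product(services, countries, methods):
        if is_available(country, method):
            yield service, country, method
```

Each generated tuple then becomes one request for a payment wizard, followed by checks of the response.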

When do these tests run? In various combinations this happens all the time:

developers run tests manually while working on a task;
they start automatically when a task moves to the “Ready” status;
QA engineers run them manually during testing;
they run every time a new build is assembled;
and, finally, they run regularly on preproduction.

Of course, all these autotests take considerable time, so we always try to optimize the process as much as possible. For integration and unit tests (and, more recently, curl tests) we run the tests in the cloud (more than 73 thousand tests in 4 minutes!). For Selenium tests we have a large farm, a SeleniumGrid cluster. On the whole, work on improving and optimizing the tests never stops.

Monitoring

A tester's work on a task does not end the moment the task goes to production. Only careful monitoring can confirm that it withstands the load of a live environment. Have new unexpected errors appeared in the logs (yes, there are expected errors, and that is normal)? Has the load on the billing cluster grown? Has profit in some country or for some payment method started to fall (or grow sharply, which is usually also suspicious)? Badoo has a great monitoring department that watches all metrics around the clock, both manually and automatically. Even so, it takes them some time to work out the causes of an anomaly on their own. So the QA engineer is obliged to see the task off carefully into its (final) battle.

For this we use several different systems, three of which matter most:

RRD Tool: in RRD we keep error and debug logs and graphs of a huge variety of key metrics (profit, number of payments, number of services delivered, queue sizes);
Splunk: a delightful system with which we analyze all billing events in real time and can build all kinds of charts, such as the number of particular requests to billing over time, and much more;
Anomaly Detection: our own anomaly detection system, which automatically reports unexpected behavior of a metric. Unlike the first two systems, it works fully automatically.

What counts as an anomaly on billing graphs? Let's look at a profit chart for Poland. Each point shows the total profit over the preceding day, and the chart itself also spans one day.

A nightmare, a terrible drop in profit, time to ring every alarm bell! But what is this? We open the chart for the whole month ...

What on earth? It turns out that this is simply how mobile aggregators work in Poland. They process all subscription renewals only on one specific day of the week, for example on Tuesday. If a user signs up for a weekly subscription on Monday, then ... they will still be charged on Tuesday! Such are the rules in Poland. And each peak is the “cherished” day of the week of one aggregator or another.
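The weekday rule is easy to express in code, which helps when predicting where the peaks on such a chart should be. This is a sketch under an assumption that is not confirmed by any partner documentation: the charge happens on the first occurrence of the aggregator's weekday strictly after the sign-up date. The function name is hypothetical.

```python
from datetime import date, timedelta


def next_charge_date(subscribed_on: date, charge_weekday: int) -> date:
    """Return the first date strictly after subscribed_on that falls on charge_weekday.

    charge_weekday follows date.weekday(): Monday is 0, Tuesday is 1, and so on.
    Whether a same-day sign-up is charged immediately is an assumption here.
    """
    days_ahead = (charge_weekday - subscribed_on.weekday() - 1) % 7 + 1
    return subscribed_on + timedelta(days=days_ahead)


TUESDAY = 1
# A user who signed up on Monday, 5 December 2016 is charged the very next day.
print(next_charge_date(date(2016, 12, 5), TUESDAY))  # 2016-12-06
```

With every user of an aggregator mapped onto the same weekday, a weekly sawtooth on the daily profit chart is the expected picture, not an anomaly.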

We look further. A similar chart of profits from AppleStore for the week from the 24th to the 1st of the next month:

Do we panic right away? Of course! Such a drop, with no growth for a whole day, is clearly a problem! While we rush around the office screaming, a day passes. And what do we see?

The chart has recovered by itself! Magic? Catastrophic errors? A reptilian conspiracy? No, it is Apple's policy. They always renew a subscription on the same day of the month on which it was started. But what happens in February to those who subscribed on the 30th or 31st: do they happily enjoy a free month? Of course not, they are charged on February 28. And from then on they are charged on the 28th only. That is why there are two peaks at the end of the month (the 28th because of February and the 30th because of all the other “short” months), while on the 31st no subscriptions more than a month old are renewed at all.
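The behavior described above can be modeled in a few lines. This is a sketch of the rule as we observed it on our charts, not a statement of Apple's documented algorithm; the function names are hypothetical. The key point is that each renewal date is computed from the previous renewal, so once the day has been clamped it never moves back.

```python
import calendar
from datetime import date
from typing import List


def next_renewal(current: date) -> date:
    """Same day of the next month, clamped to the length of that month."""
    year = current.year + current.month // 12
    month = current.month % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(current.day, last_day))


def renewal_schedule(start: date, months: int) -> List[date]:
    """Renewal dates for the given number of months after the start date."""
    dates = []
    current = start
    for _ in range(months):
        # Computed from the previous renewal, so a clamped day "sticks".
        current = next_renewal(current)
        dates.append(current)
    return dates


# A subscription started on 31 January 2017 moves to the 28th and stays there.
print(renewal_schedule(date(2017, 1, 31), 3))
```

Under this model, subscriptions started on the 29th to the 31st pile up on the 28th and the 30th over time, which matches the two peaks on the chart.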

As you can see, monitoring also has to be done wisely. As I have already said, over-testing is no bad thing, but you can also get a slap on the wrist from developers for raising too many false alarms.

Instead of a conclusion

Testing billing is interesting and entertaining work. It is full of unusual things and pitfalls, and there are back alleys that nobody really knows, but solving almost any task is a real quest that leaves you with an absolute sense of triumph. It is a pity that so little is said about this area of testing (and at conferences I have heard things like “Oh, we test billing in production”). I hope my article will help someone take another look at the testing processes in their company and, perhaps, decide to test their payment systems a little more closely. And any low-level things in general. Believe me, it really is not boring!

Kudinov Ilya, Sr. QA Engineer

Source: https://habr.com/ru/post/316050/
