How we check the security of mobile applications, and why it is not easy. Security in Yandex

My name is Yuri Leonychev. I work in the Yandex information security service, where I develop interesting services that combine machine learning methods with big data analysis. As you know, Yandex has a large number of mobile applications. While we have been working on the security of our web applications for a long time, mobile applications often did not receive enough attention. This was partly because mobile applications were considered an extension of their “big” brothers, add-ons over the web API.

But with the rise of the iOS and Android mobile platforms, the situation changed dramatically. The number of applications we develop has grown, their complexity has increased, and some of them have become large independent projects in their own right. In addition, we launched Yandex.Store, where we also had to ensure the security of third-party applications.
We have learned to keep both Yandex applications and third-party applications free of vulnerabilities in various ways, including with machine learning. I will describe how we work in this area. Let's start with how we test our own applications.

It is not enough just to search for errors. It is very important to make sure that they do not appear in applications in the first place. We decided to use a well-known methodology from Microsoft, the Security Development Lifecycle (SDL). Of course, we would like everything the SDL offers to be implemented and in use right away, but that is too complicated. The full set of SDL controls is more of an ideal end state.

The security of mobile applications at Yandex has several important features. We actively develop many of our mobile applications for different platforms (iOS, Android, Windows Phone). Obviously, it would be very difficult for the Yandex product security team alone to check every application on every platform. Therefore, we try to follow the principle of "divide and conquer." To this end, we constantly interact with the key developers of mobile applications and try to raise their awareness of security issues. For example, last year we held special training sessions in which mobile application security experts from IOActive demonstrated practical hacking and code analysis techniques. Our developers were able to see in practice how their applications could be attacked and how to defend against many types of attacks. As a rule, developers themselves now come to us whenever new features or changes in an application affect security.

Most Yandex mobile applications use common components and libraries. We check these parts of the code regularly, because a single fix in a shared library carries over to all new versions of the applications that use it. For example, we found a vulnerability in our applications involving a content provider that was readable by everyone and was susceptible to SQL injection. Although different researchers reported it and wrote about different applications, the error was actually in one common library, so it was easy to fix.
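
To illustrate the class of bug described above, here is a minimal sketch in Python using the standard sqlite3 module. It is not the code of the affected library: the table, the column names, and the functions make_db, query_unsafe, and query_safe are hypothetical. It only shows why a query built by string concatenation leaks data, and how a bound parameter avoids that.

```python
import sqlite3


def make_db():
    # Hypothetical in-memory table standing in for the data behind a provider.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, owner TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO notes (owner, body) VALUES (?, ?)",
        [("alice", "public note"), ("bob", "secret token")],
    )
    return conn


def query_unsafe(conn, owner):
    # Vulnerable: caller-controlled text is concatenated into the SQL statement.
    sql = "SELECT body FROM notes WHERE owner = '" + owner + "'"
    return [row[0] for row in conn.execute(sql)]


def query_safe(conn, owner):
    # The value is passed as a bound parameter and is never parsed as SQL.
    sql = "SELECT body FROM notes WHERE owner = ?"
    return [row[0] for row in conn.execute(sql, (owner,))]
```

With the input alice' OR '1'='1, the unsafe variant returns every row in the table, while the safe variant treats the whole string as an owner name and returns nothing.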

The most popular mobile applications undergo regular static code analysis, which runs as one of the application build steps. We use a static code analyzer from Coverity. Our version of the analyzer has been slightly extended with additional checks for Android-specific vulnerabilities. On each code commit, developers receive a detailed report on the errors found and their severity. Static analysis works efficiently here, because the code base of a mobile application is not too large and developers have time to fix all the errors found. Developers also see their mistakes immediately, before the application is published. In addition, static code analysis finds, in semi-automatic mode, errors that are very hard to find by hand. For manual checks we also prefer static analysis, since in our case it is easiest to review the source code of the applications.

Many of our mobile applications interact with backends on Yandex servers. Such applications are checked in two stages: we look at both the mobile side and the server side. All API calls that the application makes are checked, because the APIs most often work over HTTP/HTTPS, so the usual vulnerabilities from the OWASP Top 10 may appear.

An important difference in our approach to mobile application security is that we try to use crowdsourcing when searching for vulnerabilities. We were the first Russian company to launch a full-fledged reward program for vulnerabilities found, and we included mobile applications in it from the start. During the “Bug Hunt”, about 10% of the reports concerned errors in our mobile applications. Some of them were very interesting. For example, one of the participants in our “Hunt” spoke about vulnerabilities found in Yandex.Browser for iOS at the HITB conference in Amsterdam. We welcome such findings, since they point to potentially weak spots in the code that deserve close attention. And when reviewing old legacy code, application developers often fix other problems along the way.

All new applications are checked by our product security team before launch. We check not only compliance with the secure development recommendations for the relevant platform, but also the presence of hidden algorithmic errors. We try not only to fix the errors, but also to explain to the developers why they appeared. To search for some vulnerabilities, we wrote special programs that demonstrate the defects visually.

Security of third-party mobile applications

When we designed our mobile app store for Android, we were concerned about the fact that third-party developers would be uploading their applications to the store. We could not tighten the screws right away, because we would have scared off all the developers, and we needed to fill the store. It would not have been a good idea to ask everyone to document their identity from the start. On the other hand, if we had not controlled the situation at all, we would have ended up with a huge repository of malware in which users would be wary of looking for anything useful. Therefore, we decided to integrate antivirus checks into the store from the very beginning, and for that we did some research.

We had a large number of malware samples that were actively being distributed among Android users at the time. We selected a number of antivirus engines and tested their effectiveness on these test sets. Kaspersky Anti-Virus showed the best results at the time, but even those did not satisfy us, so we had to come up with our own solution.

Security Model for Android

The Android operating system contained many mechanisms for protecting information and restricting access to mobile device resources from the very beginning. Since Android is based on the Linux kernel, the new operating system inherited the mechanisms that restrict access to the resources of processes belonging to different users. But because Android was designed for mobile devices and applications had to run in a special virtual machine, Dalvik, additional levels of abstraction and protection appeared. The structure of Android is usually shown as the classic layered diagram.

Android is characterized by strictly defined file system permissions and by running user applications in separate processes, in a kind of sandbox.

Privileges

It is important that, in order to access most resources, an application has to request a special permission in its manifest. When installing an application, users are warned about what capabilities it will have and which user data it will be able to access. For example, an application with the READ_SMS and INTERNET permissions can easily pass one-time passwords that arrive on the user's smartphone to an outsider. Despite the redesigned, more informative screen that shows the requested permissions during installation, and other efforts by the Android developers, most users pay little attention to what they are allowing to run on their device. Early on, when the platform documentation was rather poor, many developers also did not understand which permissions they needed for particular actions. As a result, they violated the principle of least privilege and listed as many permissions as possible in the manifest. This led users to perceive huge sets of permissions as the norm.
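
As a rough illustration of how requested permissions can be read from a manifest and checked for dangerous combinations, here is a small Python sketch. It assumes the manifest has already been decoded to plain-text XML (inside an APK it is stored in a binary format). The rule table RISKY_COMBINATIONS, the sample manifest, and the function names are hypothetical and are not the checks used in Yandex.Store.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical rules: a set of permissions and why the set looks suspicious.
RISKY_COMBINATIONS = {
    frozenset({"android.permission.READ_SMS", "android.permission.INTERNET"}):
        "can read incoming SMS and send it over the network",
    frozenset({"android.permission.SEND_SMS"}):
        "can send SMS, including to paid numbers",
}

# Hypothetical decoded manifest used only for demonstration.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.flashlight">
  <uses-permission android:name="android.permission.READ_SMS" />
  <uses-permission android:name="android.permission.INTERNET" />
</manifest>
"""


def requested_permissions(manifest_xml):
    # Collect the android:name attribute of every <uses-permission> element.
    root = ET.fromstring(manifest_xml)
    name_attr = "{%s}name" % ANDROID_NS
    return {el.get(name_attr) for el in root.iter("uses-permission") if el.get(name_attr)}


def risky_findings(manifest_xml):
    # Report every rule whose permissions are all requested by the application.
    perms = requested_permissions(manifest_xml)
    return [reason for combo, reason in RISKY_COMBINATIONS.items() if combo <= perms]
```

Calling risky_findings(SAMPLE_MANIFEST) flags the READ_SMS plus INTERNET combination from the example in the text. A real check would need a much larger rule set and would still only be one signal among many.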

Common Malware Applications

Like any operating system that is rapidly capturing the consumer market, Android has attracted the attention of various attackers. An additional source of interest is that a mobile device today is not only a valuable source of personal data, but in effect also an easily accessible wallet.

As early as 2010, antivirus companies reported the first infections of mobile devices with malicious applications that send SMS messages to paid short numbers (examples: one, two). The functionality of such applications was very simple. At installation they requested the right to send SMS, and they used it on first launch. Errors were not uncommon in those applications, so that the SMS could not be sent at all or went to the wrong numbers. Over four years, malicious applications have evolved significantly, but most of them are still very simple programs aimed at making a quick profit. According to Kaspersky Lab, in 2013, 36% of Android Trojans sent paid SMS messages. New successful malicious applications are copied many times, so Trojans for Android are not very diverse.

Detection methods: static and dynamic

To detect malicious applications successfully, various techniques are combined: signatures, heuristics, and application emulation. But the growth in the number of such programs has created a need for methods that automate tasks which antivirus analysts often have to do by hand. For Android, some methods did not become effective enough right away. Application emulation was hampered by the huge fragmentation of the platform, the unstable operation of the emulator, and the simple methods available for detecting it. Many researchers have suggested using machine learning techniques to find malware. We decided to follow this path as well, and a year ago, together with Yandex.Store, we launched a classifier of uploaded programs built on our machine learning algorithms (presentation at YaC'13).

Building a classifier

Selection of classification factors

Two approaches were used to select suitable classification factors. The first was an analysis of existing malicious mobile applications. It was done by hand and revealed some characteristic features. For example, at the time of the analysis malicious applications were the leaders in the use of obfuscation. This feature is less relevant now, because developers of popular applications at least enable ProGuard during the build. Another important factor is the use of permissions. It has not lost its relevance over time, since simple malicious applications still prefer to ask for permission to send SMS.

The second approach to building the set of factors was even more trivial. We spent some time searching scientific articles for classification factors that had already been used. This did not yield many results, but it let us use our imagination and come up with a lot of the wildest factors that could be computed from the input file. Of course, it was not entirely clear what the payoff would be from a factor such as the size of the input file or the number of URLs in variables and resources, but at the first stage we wanted to use every feature we could.
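
The kind of simple, file-level factor mentioned above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the extractor used by the classifier: the function extract_factors, the factor names, and the URL pattern are hypothetical, and the sketch only relies on the fact that an APK is a ZIP archive. It counts URLs in the raw bytes of each entry, so URLs stored in other encodings would be missed.

```python
import io
import re
import zipfile

# Hypothetical pattern: ASCII http/https URLs found in raw bytes.
URL_RE = re.compile(rb"https?://[A-Za-z0-9._~:/?#@!$&'()*+,;=%-]+")


def extract_factors(apk_bytes):
    # Factors that can be computed without executing the application.
    factors = {
        "file_size": len(apk_bytes),
        "entry_count": 0,
        "url_count": 0,
        "native_lib_count": 0,
    }
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        for info in apk.infolist():
            factors["entry_count"] += 1
            # Native libraries are conventionally packaged under lib/<abi>/.
            if info.filename.startswith("lib/") and info.filename.endswith(".so"):
                factors["native_lib_count"] += 1
            factors["url_count"] += len(URL_RE.findall(apk.read(info)))
    return factors
```

The resulting dictionary could be turned into one row of a training table. Whether any individual factor helps is exactly what the training step has to show.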

Evaluation of detection efficiency and operation speed

After several rounds of training the classifier, Matrixnet let us discard many factors that were ineffective at the time. This does not mean that we forgot about them completely. Since the launch, some factors have lost relevance, and some have become effective again. For example, file size, which initially contributed little to detection, gained some weight over time, since the size of files for many games increased significantly. Some factors could be used only after the analyzer had been in operation for a while. For example, a level of trust in each developer was calculated from all the applications they had uploaded to our store. Of course, when Yandex.Store was launched it was impossible to make such calculations, since most developers did not yet have many different applications, and usually had only one version.
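
The text does not say how the developer trust level was computed. Purely as an illustration of why such a factor needs history, here is one possible formula in Python: a smoothed share of clean applications among everything the developer has uploaded. The function developer_trust and its smoothing constants are assumptions made for this sketch.

```python
def developer_trust(clean_apps, malicious_apps):
    # Hypothetical score in (0, 1): smoothed share of clean uploads.
    # The +1 / +2 smoothing keeps a developer with no history at a
    # neutral 0.5 instead of an undefined or extreme value.
    if clean_apps < 0 or malicious_apps < 0:
        raise ValueError("application counts must be non-negative")
    total = clean_apps + malicious_apps
    return (clean_apps + 1) / (total + 2)
```

With this formula a developer with a single clean upload scores only slightly above neutral, which matches the observation that the factor carried little information right after launch.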

Evolution of factors and retraining of the system

Changes in how applications are built, the emergence of new types of malware, and other factors force us to retrain the analyzer constantly. This is a normal process that is repeated whenever detection results start to degrade. In many disputed cases, when an application is detected as malicious but this is not confirmed by other methods, we add the application to the training set. Retraining of the system can now be done semi-automatically. When the first training was conducted, the samples contained about 250 files. Now they have already passed a thousand.
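
One simple way to notice that detection has started to degrade is to track the share of known malicious samples that the classifier still flags. The Python sketch below shows that idea only. The functions recall and needs_retraining, the baseline value, and the tolerance are hypothetical and are not taken from the system described here.

```python
def recall(predicted, actual):
    # Share of truly malicious samples (actual is True) that were flagged.
    flags_for_malicious = [p for p, a in zip(predicted, actual) if a]
    if not flags_for_malicious:
        return 1.0  # nothing malicious in the sample, so nothing was missed
    return sum(flags_for_malicious) / len(flags_for_malicious)


def needs_retraining(predicted, actual, baseline_recall, tolerance=0.05):
    # Signal retraining when recall drops noticeably below the baseline
    # measured right after the previous training run.
    return recall(predicted, actual) < baseline_recall - tolerance
```

Here predicted holds the classifier verdicts and actual holds labels confirmed by other methods, both as lists of booleans in the same order.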

Future of the system

Support for new executable file mechanisms

The Android platform is developing very dynamically. Over the past two years, ART has appeared, some permissions have changed, and the devices themselves have changed. Accordingly, the application analyzer has to be refined constantly. We currently have ideas for developing the project in two directions: the first is to improve the quality of static analysis by strengthening the checks of native code, and the second is to introduce dynamic analysis for testing applications.

Changing the factors

Improving static analysis with checks of native code is necessary because malware developers have begun to use JNI more actively. It is now possible to write full-fledged Android applications in C++ without any problems.

There is also a need for major refactoring and code optimization, as we want to further reduce file scanning time and increase the performance of the analyzer. However, this is more of a task for the near future.

Source: https://habr.com/ru/post/231783/
