An almost unbiased comparison of LMD, Manul, ClamAV and Virusday antiviruses.

Over the past year, several new products have joined the ranks of antiviruses that fight website infections. The choice facing a webmaster or site owner has become harder, whereas a year or two ago there was almost nothing to choose from.
Analyses of individual cases of using these antiviruses have already been published on Habr. However, no one has yet made a general, complete comparison. To help clarify how antivirus products for servers and websites differ, we compare the antivirus databases of four of them in terms of completeness and accuracy. We will try to do this correctly and impartially.
Comparison of detection accuracy
Let us look at how accurately each antivirus detects threats. Any of the tested products can produce false positives, and our task is to determine their share among the files that each antivirus flags. After all, no one needs an antivirus that points the finger at clean files instead of infected ones.
To analyze detection accuracy we use a set of sites (W). We do not distinguish between infections and suspicions, because some of the antiviruses do not make that distinction. Every file flagged by each antivirus has to be checked manually for malicious code (we are striving for objectivity). When there are many such files, this work takes a great deal of time and effort. We therefore chose a set W that is large enough for satisfactory accuracy yet small enough to keep the manual analysis reasonably short: W contains 1500 sites.
Let $D_{LMD}$, $D_{Manul}$, $D_{ClamAV}$, $D_{Virusday}$ be the sets of files flagged by each antivirus separately when scanning the same set of sites (W).

We define the detection accuracy (A) of an antivirus as the subset of distinct files, guaranteed to contain malicious code, among those it flagged in the corresponding set $D$. That is, $A_{LMD} \subseteq D_{LMD}$, $A_{Manul} \subseteq D_{Manul}$, $A_{ClamAV} \subseteq D_{ClamAV}$, $A_{Virusday} \subseteq D_{Virusday}$.
For clarity, we measure the detection accuracy A in units of the corresponding $|D|$, i.e. as the percentage $|A| / |D| \cdot 100\%$. The results of the accuracy measurements are given below.

In the tests we did not use the MD5 signatures from the ClamAV and LMD databases. When a website is infected, the malicious code is almost always either embedded in existing files or altered from one infection to the next. Such polymorphism is easy to implement when infecting websites. Detecting threats by file checksums is therefore extremely ineffective, yet it requires computing checksums for thousands of files on a site, which noticeably slows scanning while barely improving detection quality.
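To illustrate why checksum signatures are fragile against such polymorphism, here is a minimal Python sketch. The PHP payload and the one-character mutation are hypothetical examples, not samples from our test set or from any antivirus database.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a file's contents."""
    return hashlib.md5(data).hexdigest()


# Hypothetical injected file and a trivially mutated copy (one extra space).
original = b"<?php eval(base64_decode('ZWNobyAxOw==')); ?>"
mutated = original.replace(b"<?php ", b"<?php  ")

# A checksum "database" that knows only the original sample.
known_bad = {md5_hex(original)}

hit_original = md5_hex(original) in known_bad  # the known sample is matched
hit_mutated = md5_hex(mutated) in known_bad    # the mutated copy is missed
```

A single added space is enough to change the digest, so a checksum signature has to be collected again for every variant of the same infection.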
Comparison of detection completeness
The second important parameter of an antivirus is detection completeness, i.e. the share of threats it detects out of all threats present on the infected server or site.
In our analysis, completeness is closely tied to accuracy: when comparing completeness we consider only true (not false) antivirus detections.
Take the sets of true detections from the previous test: $A_{LMD}$, $A_{Manul}$, $A_{ClamAV}$, $A_{Virusday}$, the sets of confirmed infected files found by each antivirus separately when scanning the same set of sites (W). Let $A_{\Sigma}$ be the combined set of confirmed infected files detected by the antiviruses on the test set W. In other words, $A_{\Sigma} = A_{LMD} \cup A_{Manul} \cup A_{ClamAV} \cup A_{Virusday}$.

We define the detection completeness (F) of an antivirus as the subset of distinct confirmed infected files that it identified within the combined set $A_{\Sigma}$. That is, $F_{LMD} \subseteq A_{\Sigma}$, $F_{Manul} \subseteq A_{\Sigma}$, $F_{ClamAV} \subseteq A_{\Sigma}$, $F_{Virusday} \subseteq A_{\Sigma}$.

We measure the completeness F in units of $|A_{\Sigma}|$, i.e. as the percentage $|F| / |A_{\Sigma}| \cdot 100\%$. These are the data we obtained for the completeness of each antivirus.

We have now compared the accuracy and completeness of detection. The results are summarized in one table.
| Antivirus | Completeness F, % | Accuracy A, % |
| LMD | 11.86 | 60.52 |
| Virusday | 78.87 | 92.72 |
| Manul | 47.42 | 2.53 |
| ClamAV | 9.28 | 69.23 |
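To make the two measures concrete, here is a minimal Python sketch of how they could be computed from sets of file names. The scanner names, file names and the manual-review result are hypothetical toy data, not our measurements.

```python
def accuracy(detected: set, infected: set) -> float:
    """Percentage of an antivirus's detections that are truly infected."""
    if not detected:
        return 0.0
    return 100.0 * len(detected & infected) / len(detected)


def completeness(confirmed: set, all_confirmed: set) -> float:
    """Percentage of the combined confirmed set found by one antivirus."""
    if not all_confirmed:
        return 0.0
    return 100.0 * len(confirmed & all_confirmed) / len(all_confirmed)


# Hypothetical files flagged by two imaginary scanners.
detected = {
    "av_one": {"a.php", "b.php", "c.php", "d.php"},
    "av_two": {"a.php", "e.php"},
}
# Hypothetical outcome of the manual review: files that really are infected.
truly_infected = {"a.php", "b.php", "e.php"}

# True detections per scanner, and their union over all scanners.
confirmed = {name: files & truly_infected for name, files in detected.items()}
combined = set().union(*confirmed.values())

# For each scanner: (completeness F in %, accuracy A in %).
report = {
    name: (
        completeness(confirmed[name], combined),
        accuracy(detected[name], truly_infected),
    )
    for name in detected
}
```

Note that completeness is measured only against threats that at least one antivirus found, so infections missed by all four products do not enter the comparison.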
While manually checking the files flagged as infected, we came across several interesting facts worth mentioning separately. Let us start with the number of files each antivirus flagged. Manul found infected or suspicious files on almost every site checked. We were very surprised to find that its detections made up 98% of all files flagged by all the antiviruses together.
The trouble is that, as the second diagram shows, only 2.5% of these files were actually infected. The rest are false positives. Although Manul's completeness is fairly high, this many false positives greatly complicates subsequent manual analysis and disinfection. Analyzing Manul's antivirus database itself, we found that it classifies a rather large number of common, harmless code fragments as “suspicions”. For example, occurrences of the following are flagged as suspicious:
file_put_contents, @file_get_contents, move_uploaded_file, ini_set, error_reporting, phpinfo, extract, @include, mail, touch, chdir, copy, create_function, and so on. It follows that it is very hard to write any serious PHP script in which Manul finds nothing suspicious.
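The effect of such rules can be sketched in a few lines of Python. The rule list below is a hypothetical subset of the function names mentioned above, and the matching logic is our own simplification, not Manul's actual database format or engine.

```python
import re

# Hypothetical subset of "suspicious" function names.
SUSPICIOUS_CALLS = [
    "file_put_contents",
    "move_uploaded_file",
    "ini_set",
    "error_reporting",
    "mail",
    "copy",
]

# Match a call to any listed function, optionally prefixed with "@".
PATTERN = re.compile(
    r"@?\b(" + "|".join(map(re.escape, SUSPICIOUS_CALLS)) + r")\s*\("
)


def suspicious_calls(php_source: str) -> list:
    """Return the sorted, de-duplicated listed functions called in the source."""
    return sorted(set(PATTERN.findall(php_source)))


# A perfectly ordinary upload handler, with no malicious code in it.
benign = """<?php
error_reporting(E_ALL);
ini_set('display_errors', '0');
if (move_uploaded_file($_FILES['avatar']['tmp_name'], $target)) {
    mail($admin, 'New avatar', 'A user uploaded an avatar.');
}
"""

hits = suspicious_calls(benign)
```

Here a harmless script triggers four of the six rules, which is the kind of noise that drives the share of false positives up.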
In part, this also applies to ClamAV. For example, it looks for occurrences of some simple English words and flags them as infections. LMD, for some unknown reason, very often finds viruses in the zip.lib.php library present on many sites. Specifically, it flags the following piece of code as an infection:
$fr .= "\x00\x00";
The reasons for this are unknown. Quite a few more such examples can be found. Virusday is not without sin either. It sometimes finds suspicious (for example, obfuscated) code in a file that is not malicious; obfuscation can be used to protect programs, not only to hide malicious code. Despite this, its detection completeness surpasses that of the other web antiviruses.
As stated at the beginning of the article, our comparison is still only “almost impartial”, for several reasons. We ran the tests on websites, that is, we looked for viruses only in web scripts. For that reason we did not use MD5 signatures (as explained above). ClamAV is also a more general-purpose threat scanner, designed for more than websites.
Comparison of the quality of treatment
So much for detection. Now let us find out how completely the antiviruses identify a fragment of malicious code within a file. This matters for disinfection. It is bad if a file stops working once the fragment flagged by the antivirus is deleted, which often brings down the site itself. Of course, some files are malicious in their entirety, but here we consider cases where malicious code is embedded in a legitimate file whose contents cannot simply be deleted wholesale.
Let us examine this parameter using an example of malicious code that the compared antiviruses detected. In our example the code looked like this:

LMD identifies it as malicious by the occurrence of this fragment:
= ''; for($i=0; $i < strlen($
By itself, this piece of code can hardly be called malicious, and it may well occur in normal scripts.
Manul identifies it as malicious by the occurrence of this fragment:
eval(
As we have already noted, Manul considers a great many things dangerous. Many built-in PHP functions are used in almost every script, and raising an alarm over them alone is pointless; it only complicates the analysis of scan reports.
ClamAV found nothing in this code.
Virusday was able to find this code in its entirety. Accordingly, the code can be safely removed from the file, which cannot be said of the fragments matched by the other antiviruses.
Take another example, with code like this:

All four antiviruses detected it.
LMD matched this fragment:
_']=Array(base64_decode('
Questionable, but it has a right to exist.
ClamAV:
<? $GLOBALS['_433305846_']=Array(base64_decode('' .'ZG' .'Vma' .'W5l'),base64_decode('ZmlsZV9n' .'ZXRfY2'
Good, but ClamAV searches for this code by exact match. It will not find the code after the slightest change on re-infection. And this type of virus rarely appears in exactly this form; when it infects another site, the code will most likely look slightly different.
Manul found this:
$GLOBALS['_433305846_']=Array(base64_decode('' .'ZG' .'Vma' .'W5l'),base64_decode('ZmlsZV9n' .'ZXRfY2' .'9udGVudH' .'M=')
This is also good, especially since Manul will find any variant of this code: the search uses a fairly complex regular expression that covers all its varieties.
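The difference between an exact-match signature and a regular-expression signature can be sketched as follows. The variant sample and the regular expression are our own hypothetical illustrations; they are not the actual ClamAV or Manul signatures.

```python
import re

# The detected sample, as quoted above (closing characters added for the demo).
sample = (
    "$GLOBALS['_433305846_']=Array(base64_decode('' .'ZG' .'Vma' .'W5l'),"
    "base64_decode('ZmlsZV9n' .'ZXRfY2' .'9udGVudH' .'M='));"
)

# Hypothetical re-infection: different numeric key, different string splitting.
variant = sample.replace("_433305846_", "_1877020190_").replace(
    "'' .'ZG' .'Vma' .'W5l'", "'ZGV' .'maW5l'"
)

# Exact-match signature: a literal substring of the original sample.
EXACT = "$GLOBALS['_433305846_']=Array(base64_decode('' .'ZG' .'Vma' .'W5l')"

# Hypothetical flexible signature: any numeric key, any split of the base64 text.
FLEXIBLE = re.compile(
    r"\$GLOBALS\['_\d+_'\]\s*=\s*Array\(\s*"
    r"base64_decode\(\s*'[A-Za-z0-9+/=]*'"
    r"(?:\s*\.\s*'[A-Za-z0-9+/=]*')*\s*\)"
)

exact_hits = (EXACT in sample, EXACT in variant)
regex_hits = (bool(FLEXIBLE.search(sample)), bool(FLEXIBLE.search(variant)))
```

The literal signature matches only the original sample, while the pattern matches both, at the cost of a more expensive and harder-to-maintain rule.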
But under no circumstances can the matched fragment simply be cut out of the file. Virusday, by contrast, identifies this entire code from beginning to end as malicious and can cut it out during disinfection without serious consequences. In general, file disinfection (meaning not deleting the file but cutting the malicious section out of it) is not available in all of these antiviruses. Virusday cannot cure everything either, but it can cure a lot.
You might think we deliberately picked virus examples that the other antiviruses cannot disinfect. That is not so. We simply tried to choose examples that show the difference in disinfection quality most clearly. In principle, almost any example we could have taken would show the same thing, only less clearly. We analyzed the databases of all the antiviruses, and almost all of their signatures detect small sections of code without covering the entire virus body.
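Why a partial signature is unsuitable for disinfection can be shown with a small Python sketch. The infected file, the partial pattern (modeled on the short fragment quoted above) and the full-body pattern are all hypothetical; this is not how any of the compared products implements disinfection.

```python
import re

# Hypothetical infected file: one injected line in front of legitimate code.
infected = (
    "<?php\n"
    "$GLOBALS['_1_']=Array(base64_decode('ZGVmaW5l'));\n"
    "echo 'Hello';\n"
)

# Partial signature: covers only a short fragment in the middle of the injection.
PARTIAL = re.compile(r"_'\]=Array\(base64_decode\('")

# Full-body signature: covers the injected statement from start to end.
FULL = re.compile(
    r"\$GLOBALS\['_\d+_'\]=Array\(base64_decode\('[A-Za-z0-9+/=]*'\)\);\n"
)

# Cutting the partial match leaves syntactically broken PHP behind.
after_partial = PARTIAL.sub("", infected)

# Cutting the full match restores the legitimate file.
after_full = FULL.sub("", infected)
```

After the partial cut the file contains the remnant `$GLOBALS['_1ZGVmaW5l'));`, which is a syntax error, while the full-body cut leaves only the original `echo` line.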
In place of a conclusion
ClamAV cannot disinfect files. LMD, ClamAV and Manul are free server utilities, while Virusday is a paid SaaS with support and a firewall. We do not discuss the functionality and usability of each antivirus in this article, since they differ in many ways; we limit ourselves to comparing the antivirus databases. Dry statistics speak louder than an abundance of empty words. Besides, everyone is free to choose the tool that suits their taste and needs.