Search by signatures is one of the most important algorithms that make modern data recovery programs what they are: universal tools that can extract files from formatted, damaged and inaccessible disks.
Let's first understand how Windows stores and deletes files.
Files are stored in the form of blocks of information recorded in hard disk sectors. Sectors can be arranged either sequentially, one after another, or be randomly scattered across the surface of the disk. The location of the sectors depends on which particular blocks were free when the file was saved to disk. If the system does not detect an uninterrupted free block of sectors of sufficient size on a disk in order to save the file as a continuous data sequence, the system will fragment the file, writing its separate parts into free blocks.
To keep track of the recorded information, Windows creates an entry in the file system that indicates which sectors on the disk hold the contents of a particular file.
When the user deletes a file, Windows does not erase or overwrite the contents of the sectors on the disk. The file's entry in the file system is not deleted either, but it is modified: the system marks the entry as belonging to a deleted file. From that point on, all sectors on the disk belonging to this file are considered free, and Windows can save some other file into this space. Until that happens, you can try to restore the contents of the deleted file. This requires special data recovery software.
Programs for recovering deleted files scan the file system in search of records marked as deleted. Analyzing such records reveals the exact addresses of the sectors on the disk in which the contents of the original file were written. After a quick additional check that these sectors do not belong to any other file, the program reads the data from those sectors and saves it in a new file. Problem solved!
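The steps above can be sketched with a toy model. This is only an illustration, not how any real file system or recovery tool is laid out: the `FileEntry` structure, the `recover_deleted` function and the 512-byte sector size are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List

SECTOR_SIZE = 512  # assumed sector size for this toy model


@dataclass
class FileEntry:
    """Hypothetical file system record (not a real on-disk format)."""
    name: str
    sectors: List[int]     # sector numbers holding the contents, in order
    size: int              # file size in bytes
    deleted: bool = False  # set when the user deletes the file


def recover_deleted(disk: bytes, entries: List[FileEntry]) -> Dict[str, bytes]:
    """Return contents of deleted files whose sectors were not reused."""
    # Sectors that currently belong to live (non-deleted) files.
    in_use = {s for e in entries if not e.deleted for s in e.sectors}
    recovered = {}
    for entry in entries:
        if not entry.deleted:
            continue
        # The quick additional check: skip files whose sectors were reallocated.
        if any(s in in_use for s in entry.sectors):
            continue
        data = b"".join(
            disk[s * SECTOR_SIZE:(s + 1) * SECTOR_SIZE] for s in entry.sectors
        )
        # The last sector is usually only partly used, so trim to the file size.
        recovered[entry.name] = data[:entry.size]
    return recovered
```

A deleted file whose sectors are shared with a live file is skipped here, because its contents have most likely been overwritten.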
What happens if there is no record in the file system pointing to the deleted file? In this case, the simplest tools do not work. Another approach is required - “signature search for data recovery”.
Signature Search
Search by signatures allows data recovery programs to work with damaged and formatted partitions, as well as with re-partitioned disks. The technology goes by many commercial names. “Power Search”, “Content-Aware Analysis”, “Smart Scan”: all these technologies from different manufacturers work according to the same principle.
The basic principle of signature search is the same as that of the very first antivirus programs. Just as an antivirus scans a file for data sections that match known fragments of virus code, the signature search algorithms in data recovery programs read information from the disk surface in the hope of finding familiar data sections. Many file headers contain characteristic character strings. For example, JPEG files typically contain the character sequence “JFIF”, ZIP archives begin with the characters “PK”, and PDF documents begin with the characters “%PDF-”.
Some files (for example, text and HTML files) do not have characteristic signatures, but can be identified by circumstantial signs, because they contain only characters from the ASCII table.
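A minimal sketch of such a circumstantial check might look like this. The function name `looks_like_text` and the exact set of allowed bytes (printable ASCII plus tab, line feed and carriage return) are assumptions for illustration; real tools are likely to use more elaborate heuristics.

```python
def looks_like_text(sector: bytes) -> bool:
    """Heuristic: True if every byte is printable ASCII or common whitespace."""
    allowed_whitespace = {0x09, 0x0A, 0x0D}  # tab, line feed, carriage return
    if not sector:
        return False
    return all(0x20 <= b <= 0x7E or b in allowed_whitespace for b in sector)
```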
More examples:
File type | Starts with signature (hex) |
---|---|
avi | 5249 |
bmp | 424D |
tif | 4949 |
doc | D0CF |
docx | 504B |
jpeg | FFD8 |
png | 8950 |
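As a rough sketch, the table above can be turned into a lookup that is checked at the start of every sector of a raw disk image. The names `SIGNATURES` and `scan_for_signatures`, the 512-byte sector size, and the assumption that files start on sector boundaries are all illustrative. Two-byte prefixes are very short, so a scan like this will produce false positives (for instance, “504B” is the “PK” prefix shared by every ZIP-based format, not only docx); practical tools would need longer signatures and extra validation.

```python
from typing import List, Tuple

SECTOR_SIZE = 512  # assumed; this sketch only looks at sector boundaries

# Two-byte signatures taken from the table above.
SIGNATURES = {
    bytes.fromhex("5249"): "avi",
    bytes.fromhex("424D"): "bmp",
    bytes.fromhex("4949"): "tif",
    bytes.fromhex("D0CF"): "doc",
    bytes.fromhex("504B"): "docx",
    bytes.fromhex("FFD8"): "jpeg",
    bytes.fromhex("8950"): "png",
}


def scan_for_signatures(image: bytes) -> List[Tuple[int, str]]:
    """Return (offset, file type) for each sector that starts with a known signature."""
    hits = []
    for offset in range(0, len(image), SECTOR_SIZE):
        file_type = SIGNATURES.get(image[offset:offset + 2])
        if file_type is not None:
            hits.append((offset, file_type))
    return hits
```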
To restore a file, it is not enough to find its beginning; you also need to determine where it ends. The end of the file can be calculated from its size and the address of its beginning. The file size is determined either by analyzing the header (ZIP, JPEG, AVI, etc.) or by reading and analyzing the disk sectors that immediately follow the header. For example, the algorithm treats the first sector that contains non-ASCII characters as the end of a text or HTML file.
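Both ways of finding the end of a file can be sketched briefly. For the header case, the example uses BMP, whose header stores the file size as a 32-bit little-endian integer right after the “BM” signature. For the text case, it scans sector by sector and stops at the first byte that is not plain text. The function names, the 512-byte sector size and the definition of a “text byte” are assumptions for this illustration.

```python
import struct
from typing import Optional

SECTOR_SIZE = 512  # assumed sector size


def bmp_size_from_header(image: bytes, start: int) -> Optional[int]:
    """Read the file size declared in a BMP header, or None if not a BMP."""
    header = image[start:start + 6]
    if len(header) < 6 or header[:2] != b"BM":
        return None
    # Bytes 2..5 of a BMP file hold its total size, little-endian.
    (size,) = struct.unpack_from("<I", header, 2)
    return size


def _is_text_byte(b: int) -> bool:
    # Printable ASCII plus tab, line feed and carriage return (an assumption).
    return 0x20 <= b <= 0x7E or b in (0x09, 0x0A, 0x0D)


def text_end(image: bytes, start: int) -> int:
    """Return the offset just past the last text byte, scanning from start."""
    offset = start
    while offset < len(image):
        sector = image[offset:offset + SECTOR_SIZE]
        for i, b in enumerate(sector):
            if not _is_text_byte(b):
                # First sector with a non-text byte: the file ends here.
                return offset + i
        offset += SECTOR_SIZE
    return len(image)
```

With the start offset and either result in hand, the candidate file is simply the slice `image[start:end]`; this only works if the file is not fragmented.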
Signature search is not a panacea. Overwriting the contents of the disk and fragmenting files (especially large files) have a negative impact on the ability to recover information.
A good example of a software implementation of signature search is Starus Partition Recovery. The trial version of the program lets you analyze the storage medium and preview the files found for recovery.
How to delete a file so that it cannot be restored
There is a whole class of programs designed for reliable and secure destruction of information. One of the best programs for deleting files and overwriting free disk space with random data is Eraser.

Such programs use arrays of random numbers to physically overwrite the disk space occupied by the file being deleted. Some security standards (for example, the standard used in the US Army) require several overwrite passes and insist on cryptographically secure random number generators. In practice, a single overwrite pass is enough for private users and most commercial organizations.
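The idea can be sketched in a few lines. This is not how Eraser or any particular product works; the function names `overwrite_file` and `secure_delete` are hypothetical. The sketch uses `os.urandom`, which draws from the operating system's cryptographically secure random source. It assumes that writing to the file in place reaches the same physical sectors, which may not hold on SSDs with wear leveling or on journaling and copy-on-write file systems.

```python
import os


def overwrite_file(path: str, passes: int = 1, chunk_size: int = 1 << 20) -> None:
    """Overwrite the file contents in place with random bytes."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            # Push the data out of Python and OS buffers after each pass.
            f.flush()
            os.fsync(f.fileno())


def secure_delete(path: str, passes: int = 1) -> None:
    """Overwrite the file, then remove its file system entry."""
    overwrite_file(path, passes)
    os.remove(path)
```

A single pass is the default here, in line with the practical recommendation above; `passes` can be raised where a standard requires several cycles.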