Search for deleted files: FAT file system

In this article, I would like to talk about the algorithms that we used when creating the data recovery program Hetman Partition Recovery .

But for a start it is worth saying that the recovery of files is generally possible, because they are stored as blocks of information recorded on hard disk sectors. Sectors can be arranged either sequentially, one after another, or be randomly scattered across the surface of the disk. The location of the sectors depends on which particular blocks were free when the file was saved to disk. If the system does not detect an uninterrupted free block of sectors of sufficient size on a disk in order to save the file as a continuous data sequence, the system will fragment the file, writing its separate parts into free blocks.

By the way, in case files are fragmented and the file system is damaged or destroyed (for example, after formatting a disk), data recovery tools use signature search algorithms that read all data from the surface of a hard disk in order to detect known file types.

And here lies the main difficulty. The signature search algorithms in their work depend on the file header structures, having analyzed which, it is possible to determine the file size. Possessing information about the location of the header and knowing the exact file size, the programs calculate sectors on the disk, which, in their opinion, contain the file data. As you understand, these algorithms will work correctly only in conditions when the entire file is stored as a single continuous fragment. If the file was saved as a set of separate fragments, it will be extremely difficult, almost impossible, to restore it in the absence of a record in the file system.
')
But even for such cases, there are special methods used by special services and criminologists, allowing to restore the contents of some types of files literally in pieces. Their work is similar to a puzzle collection: various pieces of data are tried in different places of the file, after which the file is analyzed for integrity using heuristic algorithms. Needless to say, the work of such algorithms is extremely resource-intensive, but at the same time it is so slow and unreliable that it is simply not profitable to use them outside the walls of forensic laboratories.

So, in order to navigate the recorded information, Windows creates a record in the file system with an indication of which sectors on the disk is occupied by the contents of a specific file.

At the moment when the user deletes the file, Windows does not erase or rewrite the contents of the sectors on the disk. In the case of deleting a file from the SSD, the memory can be cleared by the TRIM function enabled. The contents of the file entry in the file system are also not deleted, but are subject to modification: the system marks the entry as belonging to the deleted file. Now Windows can save some other file to this space. But until this happens, you can try to restore the contents of the deleted file. Tools for recovering deleted files scan the file system in the search for records marked as deleted. After analyzing such records, it becomes possible to find out the exact addresses of the sectors on the disk in which the contents of the original file were written. After a quick additional check - if these sectors do not belong to any other file - the program will read the data from the necessary sectors and save them in a new file. What happens if there is no record in the file system pointing to the deleted file? In this case, the simplest tools do not work.

Next, I will describe the algorithm for finding and restoring deleted files from the FAT section that we used when developing our program, this algorithm is best described in the book “Forensic analysis of file systems” by Brian Kerri.

The Windows partition system contains one or more tables, with each table entry describing one partition. The entry data usually indicates the starting sector sector, the ending sector sector (or length) and the type of the section.

To search for the file system, our program proceeds from the assumption that there was a file system in each section. Many file systems start with a data structure with a constant signature. For example, the FAT file system contains the values 0x55 and 0xAA in bytes 510 and 511 of the first sector. The recovery program searches for signatures and determines from them the possible beginning of the partition. When a signature is detected, additional checks are often performed with ranges of values allowed for some fields of the data structure. For example, one of the fields in the FAT file system determines the number of sectors in a cluster; the field value is a power of 2 (for example, 1, 2, 4, 8, 16, 32, 64, or 128). Any other value indicates that the sector is not part of the boot sector of the FAT file system, although it ends with a 0x55AA signature.

How to find a file in the FAT table

The basic concept of the FAT file system is that each file and directory is given a data structure called a directory entry. This structure stores the file name, its size, the starting address of the file content and other metadata. The contents of files and directories are stored in blocks of data called clusters. If a file or directory is allocated more than one cluster, the remaining clusters are located using a data structure called FAT. The FAT structure is used both to identify the next clusters in the files and to determine the allocation status of the clusters.

FAT file system is divided into three physical areas. The first area is called reserved, it stores data from the file system category. In FAT12 and FAT16, the reserved area occupies only 1 sector, but formally its size is determined in the boot sector. The second area FAT - contains the main and backup structures FAT. It begins in the sector following the reserved area, and its size is determined by the number and size of FAT structures. The third area - the data area contains clusters allocated for storing files and directory contents.

One of the first tasks in analyzing the FAT file system should be the identification of three physical areas. The reserved area begins in sector 0 of the file system, and its size is specified in the boot sector. In FAT 12/16, it usually takes only 1 sector, and in FAT32 several sectors are reserved for it.

The FAT area contains one or more FAT structures and starts in the sector following the reserved area. Its size is calculated by multiplying the number of FAT structures by the size of one structure; both values are stored in the boot sector (reserved area).

File recovery

When you delete a file in Windows, its directory entry is marked as unused, and the cluster entries in FAT are reset. The beginning and size of the file are known, but there is no information about the remaining clusters of the file.

You can try to restore the contents of the file by reading the data from a known initial cluster. Regarding the choice of the remaining clusters, the recovery program has two methods: read the amount of data corresponding to the file size, not paying attention to the selection state, or read the data only from free clusters.

The second method often leads to success, because it recovers some fragmented files. Figure 5 explains what is happening:
It shows six file system clusters and three different scenarios. The file size is 7094 bytes, and the cluster size is 2048 bytes; therefore, four clusters were allocated for the file. We also know that the file begins with cluster 56. The light gray clusters represent the actual location of the file contents in each scenario.

In the scenario in fig. A 5 (A) file occupies four contiguous clusters. In this case, clusters 56-59 are correctly restored in both cases. In fig. 5 (B) the file was divided into three fragments, and the clusters between the fragments (57 and 60) were allocated to another file during recovery. In this scenario, the first method restores clusters 56-59 and erroneously includes the contents of cluster 57. The second method correctly restores sectors 56, 58.59 and 61. In fig. 5 (C) shows the scenario in which the same fragments are allocated to the file as in fig. 5 (B), but the clusters between the fragments were not allocated to other files at the time of recovery. In this scenario, both methods mistakenly restore clusters 56-59, as was done in the previous example. In the process of restoring the file, other situations are possible, but in them a part of the file is erased, and no other method can recover the data exactly. Thus, method 2 (taking into account the state of cluster allocation) is able to recover deleted files more often than method 1.

In conclusion, I will say that the NTFS file system is much more complex than FAT, the search for deleted files in it is the topic of a separate article.

Source: https://habr.com/ru/post/177549/

All Articles

Search for deleted files: FAT file system

How to find a file in the FAT table

File recovery

More articles: