
Recovering deleted data using Scalpel

Almost everyone has at some point run rm -rf on a folder where they should not have. Backups are good, but what if there are none? For Linux systems there is the Scalpel utility, which recovers deleted files by specified patterns, including patterns written as regular expressions.

Scalpel, a fork of the Foremost project (from version 0.69), began its history in 2005. It has its own GitHub repository and recovers data faster and more efficiently than Foremost. As for the difference between the two projects, versions of Foremost released after 0.69 have new semantic data recovery techniques. For example, when recovering a JPEG file, Foremost uses the file's header to calculate the corresponding image body, whereas Scalpel simply takes the data between the specified header and footer signatures. Thus Foremost can recover lost data more accurately, while Scalpel does it much faster.
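To make the "data between two signatures" idea concrete, here is a minimal Python sketch of header/footer carving. It is not Scalpel's actual code: the function name carve and the toy image are made up for illustration, and the real utility works on large images in chunks rather than on one in-memory buffer.

```python
def carve(data: bytes, header: bytes, footer: bytes, max_size: int) -> list[bytes]:
    """Return every header..footer span found in data, at most max_size bytes each."""
    carved = []
    pos = data.find(header)
    while pos != -1:
        # Only look for the footer inside the allowed size window.
        window = data[pos:pos + max_size]
        end = window.find(footer, len(header))
        if end != -1:
            carved.append(window[:end + len(footer)])
        pos = data.find(header, pos + 1)
    return carved


# Toy "disk image": two fake files surrounded by unrelated bytes.
image = b"junk<?php echo 1; ?>more junk<?php echo 2; ?>tail"
files = carve(image, b"<?php", b"?>", max_size=100)
```

Nothing here looks inside the file format, which is exactly why this approach is fast and also why it can return a damaged file when the footer happens to occur inside the body.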

Scalpel features:

  1. Recovery independent of file system
  2. Setting the minimum and maximum sizes of the file being restored
  3. Multithreading on multicore systems
  4. Asynchronous I/O operations that speed up pattern searching
  5. Using TRE regular expressions to search at the beginning and end of a file
  6. The ability to recover from nested data structures
  7. Optional use of a GPU, which is available only on Linux and requires the pre-installed NVIDIA CUDA SDK and minor modifications of the source code (regular expression searches do not work with the GPU)

Scalpel is usually available among the packages of your Linux distribution, but you can also build it from source taken from the GitHub repository.

The application is configured in the /etc/scalpel/scalpel.conf file, where the file search patterns are specified. It already contains ready-made presets for searching, for example, for images or doc files. To recover lost data, uncomment the appropriate templates and run the application.

If the configuration file does not contain a template for the file type you need, or you are looking for, say, a particular XML format, you have to create your own template. Templates are described by rules like the ones below.
Type  Case sensitive  Size range  Header                                    Footer                                    Search option
avi   y               50000000    RIFF????AVI
doc   y               10000000    \xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1\x00\x00  \xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1\x00\x00  NEXT
pdf   y               500000      %PDF                                      %EOF\x0d                                  REVERSE
pdf   y               500000      %PDF                                      %EOF\x0a                                  REVERSE
tex   y               300:50000   /%.{1,20}\.tex/                           /%.{1,20}\.tex\sEnd/
php   y               100000      <?php                                     ?>                                        REVERSE
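A rule of this kind is just a handful of whitespace-separated fields, so it is easy to read programmatically. The sketch below is a simplified, hypothetical parser (CarveRule and parse_rule are names made up here, not part of Scalpel). It assumes sizes are written as plain integers or as min:max, and that patterns contain no literal spaces.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CarveRule:
    extension: str
    case_sensitive: bool
    min_size: int
    max_size: int
    header: str
    footer: Optional[str]
    search_option: Optional[str]


def parse_rule(line: str) -> CarveRule:
    """Split one rule line into its fields; footer and search option are optional."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("a rule needs at least type, case flag, size and header")
    extension, case_flag, size, header = fields[:4]
    footer = fields[4] if len(fields) > 4 else None
    option = fields[5] if len(fields) > 5 else None
    if ":" in size:
        # "300:50000" means minimum 300 bytes, maximum 50000 bytes.
        low, high = size.split(":", 1)
        min_size, max_size = int(low), int(high)
    else:
        min_size, max_size = 0, int(size)
    return CarveRule(extension, case_flag.lower() == "y",
                     min_size, max_size, header, footer, option)


tex = parse_rule(r"tex y 300:50000 /%.{1,20}\.tex/ /%.{1,20}\.tex\sEnd/")
pdf = parse_rule(r"pdf y 500000 %PDF %EOF\x0d REVERSE")
```

The escapes in the header and footer are left as text here; decoding them is a separate step.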
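Briefly about the columns

  1. Type: the extension given to the recovered files
  2. Case sensitive: whether the header and footer are matched case-sensitively (y or n)
  3. Size range: the maximum size of the recovered file in bytes, or minimum:maximum
  4. Header: the signature that marks the beginning of the file
  5. Footer: the signature that marks the end of the file (optional)
  6. Search option: how the footer is searched for (optional)

It is important to note that the values for the last field, Search option, are:

REVERSE search backwards from the maximum size and stop at the footer found this way, which is useful for formats such as PDF where the footer can occur several times in one file
NEXT recover the data from the header up to, but not including, the first footer after it, which is useful when the format has no definite footer but you know what must not end up in the file, for example the beginning of the next file
no value: recover the data from the header through the first footer found within the maximum size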
Note
If you need the question mark "?" as a literal character in the header or footer, you have to redefine the wildcard character, which is the question mark by default. To do this, write at the beginning of the configuration file
wildcard s
where s is the new wildcard character for search expressions, or use the hexadecimal representation of the question mark, which is equivalent to \x3f or \063
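The interplay between the wildcard and escaped bytes is easier to see in code. The following Python sketch translates a plain (non-regex) header or footer pattern into a bytes regular expression. The function pattern_to_regex is hypothetical, it handles only \xNN escapes (not octal escapes or \s), and it is only an illustration of the idea, not how Scalpel itself matches patterns.

```python
import re


def pattern_to_regex(pattern: str, wildcard: str = "?") -> re.Pattern:
    """Translate a header/footer pattern into a compiled bytes regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("\\x", i) and i + 4 <= len(pattern):
            # \xNN: exactly one literal byte, given in hexadecimal.
            parts.append(re.escape(bytes([int(pattern[i + 2:i + 4], 16)])))
            i += 4
        elif ch == wildcard:
            # The wildcard stands for any single byte.
            parts.append(b".")
            i += 1
        else:
            parts.append(re.escape(ch.encode("latin-1")))
            i += 1
    # DOTALL so that the wildcard also matches the newline byte.
    return re.compile(b"".join(parts), re.DOTALL)


avi = pattern_to_regex("RIFF????AVI")          # four arbitrary bytes in the middle
literal = pattern_to_regex("<\\x3fphp")        # \x3f is a literal question mark
redefined = pattern_to_regex("<?php", wildcard="*")
```

With the default wildcard, "<?php" would match "<" followed by any byte and then "php". Writing the question mark as \x3f, or redefining the wildcard, restricts the match to a real "?".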

And now for some practice

Suppose we have deleted files that match the templates in the table above. Let's write these templates into the configuration file /etc/scalpel/scalpel.conf (columns are separated by the tab character) and start the recovery (I prepared an image with the data to recover in advance).

root# scalpel MyDrive.img -o recover
Written by Golden G. Richard III, based on Foremost 0.69.
Opening target "/home/username/Documents/repair_files/test/MyDrive.img"
Image file pass 1/2.
MyDrive.img: 100.0% |*****************************************************| 500.0 MB 00:00 ETA
Allocating work queues...
Work queues allocation complete. Building carve lists...
Carve lists built. Workload:
avi with header "\x52\x49\x46\x46\x3f\x3f\x3f\x3f\x41\x56\x49" and footer "" --> 1 files
doc with header "\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1\x00\x00" and footer "\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1\x00\x00" --> 2 files
pdf with header "\x25\x50\x44\x46" and footer "\x25\x45\x4f\x46\x0d" --> 33 files
pdf with header "\x25\x50\x44\x46" and footer "\x25\x45\x4f\x46\x0a" --> 19 files
php with header "\x3c\x3f\x70\x68\x70" and footer "\x3f\x3e" --> 8 files
Carving files from image.
Image file pass 2/2.
MyDrive.img: 100.0% *****************************************************| 500.0 MB 00:00 ETA
Processing of image file complete. Cleaning up...
Done.
Scalpel is done, files carved = 63, elapsed = 6 seconds.

When the run completes, the output folder contains the recovered files and audit.txt, which holds brief information about the files found, like the one shown below.

Scalpel version 1.60 audit file
Started at Wed Jan 7 12:50:52 2015
Command line: scalpel MyDrive.img -o recover
Output directory: /home/username/Documents/repair_files/test/recover
Configuration file: /etc/scalpel/scalpel.conf
Opening target "/home/username/Documents/repair_files/test/MyDrive.img"
The following files were carved:
File          Start    Chop  Length    Extracted From
00000003.pdf  549888   NO    4162      MyDrive.img
00000055.php  1227776  NO    99954     MyDrive.img
00000001.doc  8916992  YES   10000000  MyDrive.img
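If there are many recovered files, the audit file is convenient to process with a script, for example to list the files that were chopped at the maximum size. Below is a small Python sketch. The function parse_audit is hypothetical, and it assumes audit.txt has one carved file per line with five whitespace-separated columns, as in the listing above.

```python
def parse_audit(text: str) -> list[dict]:
    """Extract the rows of the 'files were carved' table from audit.txt text."""
    rows = []
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if fields[:3] == ["File", "Start", "Chop"]:
            in_table = True  # the column header line precedes the rows
            continue
        if (in_table and len(fields) == 5
                and fields[1].isdigit() and fields[3].isdigit()):
            name, start, chop, length, source = fields
            rows.append({
                "file": name,
                "start": int(start),
                "chopped": chop == "YES",
                "length": int(length),
                "source": source,
            })
    return rows


sample = """The following files were carved:
File Start Chop Length Extracted From
00000003.pdf 549888 NO 4162 MyDrive.img
00000055.php 1227776 NO 99954 MyDrive.img
00000001.doc 8916992 YES 10000000 MyDrive.img
"""
rows = parse_audit(sample)
chopped = [row["file"] for row in rows if row["chopped"]]
```

A file marked as chopped reached the maximum size from its rule, so it is a good candidate for a second run with a larger size limit.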

Also note some of the available options:
-p preview mode: no files are recovered, but an audit file is created that shows which files would be recovered
-q with this option, scalpel scans only the beginning of each cluster of the given size when searching for the beginning of the desired file
-v verbose mode
-o the directory where the recovered data will be put
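The idea behind -q, as described above, can be sketched in a few lines of Python: instead of looking for a header at every byte offset, only the offsets that are multiples of the cluster size are checked. The function header_offsets is a made-up illustration of that idea and does not reproduce Scalpel's exact behavior.

```python
def header_offsets(data: bytes, header: bytes, cluster_size: int = 0) -> list[int]:
    """Offsets where header occurs; with cluster_size > 0 only cluster starts are checked."""
    if cluster_size > 0:
        return [offset for offset in range(0, len(data), cluster_size)
                if data.startswith(header, offset)]
    offsets = []
    pos = data.find(header)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(header, pos + 1)
    return offsets


# Three 512-byte "clusters"; the middle one contains the header at offset 2.
image = (b"%PDF".ljust(512, b"a")
         + b"xx%PDF".ljust(512, b"b")
         + b"%PDF".ljust(512, b"c"))
everywhere = header_offsets(image, b"%PDF")
aligned = header_offsets(image, b"%PDF", cluster_size=512)
```

Files on most file systems start on a cluster boundary, so the aligned search skips matches inside other files, at the price of missing files embedded in containers.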

Successful data recovery to all!



Useful links
  1. GitHub Scalpel repository
  2. Scalpel: A Frugal, High Performance File Carver. Golden G. Richard III, Vassil Roussev
  3. SANS Institute InfoSec reading room: data carving concepts
  4. TRE Regex Syntax

Source: https://habr.com/ru/post/247421/

