
Why write
- because it is convenient to have your own customizable tool that lets you intervene in the archiving process at any stage
- because it's interesting
- because many archivers charge for their API, and for the rest, see the first reason.
Technologies and Libraries
You will need:
- the zlib.net.dll library (official site)
- the Visual Studio 2010 development environment
- the C# language
- .NET Framework 3.5
Technical task
The archiver should be able to:
- compress files and directories
- build an archive without compression
- encrypt data (with and without compression)
- exclude specified paths
- delete files after they are compressed
- unpack compressed archive
Design
Archive format
After some optimization I arrived at the following layout:
Purpose | Size |
Archive type | 1 byte |
Header length (after compression and encryption) | 4 bytes |
Header (described in more detail below) | N bytes |
Content block of the first file | N bytes |
Content block of the second file | N bytes |
...... | ...... |
Content block of the K-th file | N bytes |
Archive Header Format
Purpose | Size |
Unprocessed (raw) header size | 4 bytes |
Block 1 | N bytes |
Block 2 | N bytes |
...... | ...... |
Block K | N bytes |
Archive header block format
Purpose | Size |
Block size | 4 bytes |
Absolute path length | 4 bytes |
Absolute path | N bytes |
Relative path length | 4 bytes |
Relative path | N bytes |
Object size after processing | 8 bytes |
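To make the block layout more concrete, here is a minimal sketch of how one header block could be written and read back. The class name HeaderBlock, its member names, the UTF-8 path encoding, the little-endian integers (the BinaryWriter default) and the reading of "block size" as the number of bytes after the size field are all my assumptions, not taken from the project sources.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical in-memory form of one header block; names are illustrative.
public sealed class HeaderBlock
{
    public string AbsolutePath = "";
    public string RelativePath = "";
    public long ProcessedSize;   // object size after compression/encryption

    // Layout: block size (4) | abs. path length (4) | abs. path (N) |
    //         rel. path length (4) | rel. path (N) | processed size (8).
    // Assumption: "block size" counts the bytes that follow the size field.
    public byte[] ToBytes()
    {
        byte[] abs = Encoding.UTF8.GetBytes(AbsolutePath);
        byte[] rel = Encoding.UTF8.GetBytes(RelativePath);
        using (MemoryStream ms = new MemoryStream())
        using (BinaryWriter w = new BinaryWriter(ms))
        {
            w.Write(4 + abs.Length + 4 + rel.Length + 8); // block size
            w.Write(abs.Length);
            w.Write(abs);
            w.Write(rel.Length);
            w.Write(rel);
            w.Write(ProcessedSize);
            w.Flush();
            return ms.ToArray();
        }
    }

    // Reads one block from the current position of the reader.
    public static HeaderBlock Read(BinaryReader r)
    {
        r.ReadInt32(); // block size; not needed when reading sequentially
        HeaderBlock b = new HeaderBlock();
        b.AbsolutePath = Encoding.UTF8.GetString(r.ReadBytes(r.ReadInt32()));
        b.RelativePath = Encoding.UTF8.GetString(r.ReadBytes(r.ReadInt32()));
        b.ProcessedSize = r.ReadInt64();
        return b;
    }
}
```

Storing the block size up front would let a reader skip a block without decoding its paths, which is probably why the format has that field.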
A brief explanation. The archive file begins with the header, which collects all the metadata about the archived objects. The header itself goes through the same compression and encryption stages as the archived files. After the header come the blocks holding the processed file contents, stored back to back. Block boundaries are determined from the header, which stores the size of each block.
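Since the blocks are stored back to back, the position of every content block follows from the header alone. A small sketch of that arithmetic (the class and method names are hypothetical, and I assume the blocks appear in the same order as the header entries):

```csharp
using System;
using System.Collections.Generic;

public static class ArchiveLayout
{
    // 1 byte archive type + 4 bytes header length precede the header itself.
    public const int PrefixSize = 1 + 4;

    // Returns the absolute offset of every content block in the archive file.
    // processedHeaderLength: header size after compression and encryption.
    // processedSizes: "object size after processing" of each header block, in order.
    public static long[] BlockOffsets(int processedHeaderLength, IList<long> processedSizes)
    {
        long[] offsets = new long[processedSizes.Count];
        long position = PrefixSize + (long)processedHeaderLength;
        for (int i = 0; i < processedSizes.Count; i++)
        {
            offsets[i] = position;          // block i starts here
            position += processedSizes[i];  // the next block follows immediately
        }
        return offsets;
    }
}
```

For example, with a 100-byte processed header and blocks of 10, 20 and 30 bytes, the blocks start at offsets 105, 115 and 135.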
General principles of work
The user sets the compression options, and based on them the necessary file handlers are attached (compressor, encryptor). Each handler has two methods, Execute and BackExecute. When archiving, the Execute method is called; when unpacking, the BackExecute method is called and the handlers are applied in reverse order. This structure makes it very easy to extend the program with any number of new handlers (for example, ones implementing other encryption or compression methods).
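The handler chain described above can be sketched roughly as follows. Only the method names Execute and BackExecute come from the project; the IHandler interface, the byte-array signatures, the Pipeline class and the two toy handlers are assumptions made purely for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical handler contract; the signatures are assumed.
public interface IHandler
{
    byte[] Execute(byte[] data);      // forward step, used when archiving
    byte[] BackExecute(byte[] data);  // inverse step, used when unpacking
}

// Toy handler: adds 1 to every byte (wraps around at 255).
public sealed class ShiftHandler : IHandler
{
    public byte[] Execute(byte[] data) { return Map(data, 1); }
    public byte[] BackExecute(byte[] data) { return Map(data, -1); }
    private static byte[] Map(byte[] data, int delta)
    {
        byte[] result = new byte[data.Length];
        for (int i = 0; i < data.Length; i++)
            result[i] = unchecked((byte)(data[i] + delta));
        return result;
    }
}

// Toy handler: XORs every byte with a key (a stand-in for an encryptor).
public sealed class XorHandler : IHandler
{
    private readonly byte key;
    public XorHandler(byte key) { this.key = key; }
    public byte[] Execute(byte[] data) { return Map(data); }
    public byte[] BackExecute(byte[] data) { return Map(data); }
    private byte[] Map(byte[] data)
    {
        byte[] result = new byte[data.Length];
        for (int i = 0; i < data.Length; i++)
            result[i] = (byte)(data[i] ^ key);
        return result;
    }
}

public static class Pipeline
{
    // Archiving: apply Execute of every handler in the configured order.
    public static byte[] Pack(IList<IHandler> handlers, byte[] data)
    {
        for (int i = 0; i < handlers.Count; i++)
            data = handlers[i].Execute(data);
        return data;
    }

    // Unpacking: apply BackExecute of every handler in reverse order.
    public static byte[] Unpack(IList<IHandler> handlers, byte[] data)
    {
        for (int i = handlers.Count - 1; i >= 0; i--)
            data = handlers[i].BackExecute(data);
        return data;
    }
}
```

The two toy handlers do not commute, so unpacking only restores the original bytes when the order is reversed, which is the point of the design.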
Work algorithm
- Determine the archive type (compressed, encrypted)
- Read the list of objects to archive
- Build the complete list of objects to archive from the list that was read and the exclusion list
- Create the archive header (as an object)
- Iterate over the full list of objects in the header
- Process each object, update its post-processing size in the header, and write the processed content to a temporary file
- Save the header to a file
- Process the header (compression, encryption)
- Build the final archive file
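The third step of the list above, building the complete list of objects while honoring exclusions, might look something like this. It is a sketch under my own assumptions: the class and method names are hypothetical, paths are compared case-insensitively, and an excluded directory excludes everything inside it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ObjectList
{
    // True when path equals an excluded path or lies inside an excluded directory.
    public static bool IsExcluded(string path, IEnumerable<string> exclusions)
    {
        foreach (string exclusion in exclusions)
        {
            string root = exclusion.TrimEnd(Path.DirectorySeparatorChar);
            if (root.Length == 0) continue; // ignore empty entries
            if (path.Equals(root, StringComparison.OrdinalIgnoreCase)) return true;
            if (path.StartsWith(root + Path.DirectorySeparatorChar,
                                StringComparison.OrdinalIgnoreCase)) return true;
        }
        return false;
    }

    // Expands files and directories into a flat list of files, skipping exclusions.
    public static List<string> Build(IEnumerable<string> inputs, IEnumerable<string> exclusions)
    {
        List<string> result = new List<string>();
        foreach (string input in inputs)
        {
            if (Directory.Exists(input))
            {
                foreach (string file in Directory.GetFiles(input, "*", SearchOption.AllDirectories))
                {
                    if (!IsExcluded(file, exclusions)) result.Add(file);
                }
            }
            else if (File.Exists(input) && !IsExcluded(input, exclusions))
            {
                result.Add(input);
            }
        }
        return result;
    }
}
```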
Implementation
ZLib can compress and decompress data passed to it as a byte array. This is all we need from it and all that will be used. It cannot encrypt data; for that we use the standard .NET Framework library, System.Security.Cryptography.
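To illustrate the compress-then-encrypt idea on byte arrays, here is a self-contained sketch. Note the substitution: so that it runs without extra libraries, it uses DeflateStream from System.IO.Compression as a stand-in for zlib.net, so the output is not in the same format as the project's. The class name, the AES-CBC choice, the password-based key derivation and the salt/IV layout are my assumptions; only the use of System.Security.Cryptography comes from the article.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;

public static class Processing
{
    // Stand-in for the zlib.net call: DeflateStream from the base class library.
    public static byte[] Compress(byte[] data)
    {
        using (MemoryStream output = new MemoryStream())
        {
            using (DeflateStream deflate = new DeflateStream(output, CompressionMode.Compress))
            {
                deflate.Write(data, 0, data.Length);
            } // disposing the DeflateStream flushes the remaining compressed data
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] data)
    {
        using (MemoryStream input = new MemoryStream(data))
        using (DeflateStream inflate = new DeflateStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = inflate.Read(buffer, 0, buffer.Length)) > 0)
                output.Write(buffer, 0, read);
            return output.ToArray();
        }
    }

    // Assumed layout of the result: salt (16 bytes) | IV (16 bytes) | ciphertext.
    public static byte[] Encrypt(byte[] data, string password)
    {
        byte[] salt = new byte[16];
        RandomNumberGenerator.Create().GetBytes(salt);
        Rfc2898DeriveBytes kdf = new Rfc2898DeriveBytes(password, salt, 10000);
        using (Aes aes = Aes.Create())
        {
            aes.Key = kdf.GetBytes(32);
            aes.GenerateIV();
            byte[] iv = aes.IV;
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
            {
                byte[] cipher = encryptor.TransformFinalBlock(data, 0, data.Length);
                byte[] result = new byte[salt.Length + iv.Length + cipher.Length];
                Buffer.BlockCopy(salt, 0, result, 0, salt.Length);
                Buffer.BlockCopy(iv, 0, result, salt.Length, iv.Length);
                Buffer.BlockCopy(cipher, 0, result, salt.Length + iv.Length, cipher.Length);
                return result;
            }
        }
    }

    public static byte[] Decrypt(byte[] data, string password)
    {
        byte[] salt = new byte[16];
        byte[] iv = new byte[16];
        Buffer.BlockCopy(data, 0, salt, 0, 16);
        Buffer.BlockCopy(data, 16, iv, 0, 16);
        Rfc2898DeriveBytes kdf = new Rfc2898DeriveBytes(password, salt, 10000);
        using (Aes aes = Aes.Create())
        {
            aes.Key = kdf.GetBytes(32);
            aes.IV = iv;
            using (ICryptoTransform decryptor = aes.CreateDecryptor())
                return decryptor.TransformFinalBlock(data, 32, data.Length - 32);
        }
    }
}
```

Compression goes first because encrypted data looks random and no longer compresses; unpacking then runs Decrypt followed by Decompress.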
During archiving and unpacking, you can obtain information about the object currently being processed, as well as about any errors that have occurred.
If an error occurs while processing a file, the user is offered a choice of four actions:
- abort execution
- ignore the error
- ignore all errors
- retry
The prompt can be disabled simply by commenting out the ErrorProcessing event handler, in which case program execution is aborted on error.
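The four choices above map naturally onto an event whose arguments carry the chosen action. A rough sketch: only the event name ErrorProcessing is taken from the project, while the enum, the event-args class, the Worker class and its Process method are hypothetical.

```csharp
using System;

public enum ErrorAction { Abort, Ignore, IgnoreAll, Retry }

public sealed class ErrorProcessingEventArgs : EventArgs
{
    public readonly string Path;
    public readonly Exception Error;
    public ErrorAction Action = ErrorAction.Abort; // used when nobody answers

    public ErrorProcessingEventArgs(string path, Exception error)
    {
        Path = path;
        Error = error;
    }
}

public sealed class Worker
{
    public event EventHandler<ErrorProcessingEventArgs> ErrorProcessing;
    private bool ignoreAll;

    // Runs step for one object; returns false if the object was skipped.
    public bool Process(string path, Action<string> step)
    {
        while (true)
        {
            try
            {
                step(path);
                return true;
            }
            catch (Exception ex)
            {
                if (ignoreAll) return false;
                ErrorProcessingEventArgs args = new ErrorProcessingEventArgs(path, ex);
                EventHandler<ErrorProcessingEventArgs> handler = ErrorProcessing;
                if (handler != null) handler(this, args); // ask the subscriber what to do
                switch (args.Action)
                {
                    case ErrorAction.Retry: continue;      // run the step again
                    case ErrorAction.Ignore: return false; // skip this object only
                    case ErrorAction.IgnoreAll: ignoreAll = true; return false;
                    default: throw;                        // Abort: propagate the error
                }
            }
        }
    }
}
```

With no subscriber the default action stays Abort, which matches the behavior described above: without the handler, execution stops on the first error.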
I will not reproduce the program code here; instead, here are links to the sources.
Direct downloads: the project, and as a DLL
SVN: svn://svn.code.sf.net/p/yark/code-0/trunk
Project: sourceforge.net/projects/yark
And an example of use:
Compression
    ArchiveProvider compressor = new ArchiveProvider();
    using (SaveFileDialog sfd = new SaveFileDialog())
    {
        if (sfd.ShowDialog() == System.Windows.Forms.DialogResult.OK)
        {
            CompressorOption option = new CompressorOption()
            {
                Password = __,
                WithoutCompress = true___,
                RemoveSource = true____,
                Output = sfd.FileName
            };
            // ...
        }
    }
Unarchiving
    ArchiveProvider decompressor = new ArchiveProvider();
    using (FolderBrowserDialog fbd = new FolderBrowserDialog())
    {
        if (fbd.ShowDialog() == System.Windows.Forms.DialogResult.OK)
        {
            decompressor.Decompress(__, fbd.SelectedPath, __);
        }
    }
Comparison of work results
I did not measure the running time precisely; it was roughly the same for all three.
Initial data:
- a directory with text files (1,430 KB)
- a directory with mixed data (18,893 KB)
Archiver | Text, KB | Mixed data, KB |
WinRAR | 613 | 8,045 |
Zip | 638 | 8,709 |
This archiver | 588 | 8,655 |
For the RAR and ZIP formats, the normal compression level was selected, the same one that the program uses.
The current archive format stores absolute file and directory paths; excluding them would slightly improve compression.
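For an easier comparison, the sizes above can be turned into a percentage of the original size. This is just arithmetic on the numbers from the table; the helper name is hypothetical.

```csharp
using System;

public static class Ratios
{
    // Archive size as a percentage of the original size (smaller is better).
    public static double Percent(long archivedKb, long originalKb)
    {
        return 100.0 * archivedKb / originalKb;
    }
}
```

By this measure the text directory comes out at about 42.9% for WinRAR, 44.6% for Zip and 41.1% for this archiver, and the mixed data at about 42.6%, 46.1% and 45.8% respectively.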
Possible improvements
- store file metadata (creation/modification dates, access rights)
- add multithreading (only the creation of temporary files needs to be parallelized)
- add comments to the archive
- associate files with the program