Writing a virtual file system in C++

Another of my posts from the sandbox. If I have time, I will translate the remaining parts.

This is a translation of the first part of an article about writing a VFS (virtual file system) in C++, which I found a long time ago. I hope you like it. :)

Introduction

When I started developing my 3D engine, I realized that I needed something like a file system. Not a simple archive, but my own virtual file system that supports compression and encryption, has fast access times, and so on.

I decided to publish my work so that you do not have to reinvent the wheel. This article is divided into two parts. The first, which you are reading now, describes the structure of the VFS. In the second part we will actually write the VFS itself (it will be quite large).

So what is VFS?

A VFS is a file system similar to those used by Windows (FAT32, NTFS, etc.). The main difference between a VFS and a real file system is that a VFS uses a real file system underneath.

Functionality

Some of the VFS features are:
Fast access time
Several archives instead of a huge number of small files
Debugging option
Pluggable Encryption and Compression (PEC)
Security (an MD5 hash is stored inside the VFS file, so any change to the archive is noticed immediately)
Several root paths

Now that we have listed the features, we can proceed to the design phase.

Basic design

Let's get started. We will have a main interface with 16 functions. I will first show you these functions and we will discuss what they are for; we will write them in the next part.

#define VFS_VERSION 0x0100
#define VFS_PATH_SEPARATOR '\\'

void VFS_Init();
void VFS_Shutdown();

These functions do nothing except set up and tear down some structures that we will need later.

Filters

typedef BOOL (* VFS_FilterProc )( LPCBYTE pIn, DWORD dwInCount, LPBYTE* ppOut, DWORD* pOutCount );

struct VFS_Filter
{
    string strName;
    string strDescription;
    VFS_FilterProc pfnEncodeProc;
    VFS_FilterProc pfnDecodeProc;
};

void VFS_RegisterFilter( VFS_Filter* pFilter );
void VFS_UnregisterFilter( VFS_Filter* pFilter );
void VFS_UnregisterFilter( DWORD dwIndex );
DWORD VFS_GetNumFilters();
const VFS_Filter* VFS_GetFilter( DWORD dwIndex );

Well, this is a little more difficult. A VFS_Filter is a kind of routine that processes data. You can, for example, write a VFS_CryptFilter that encrypts and decrypts data. Got it? A filter consists of something like a pre-processor (the pfnEncodeProc procedure) for data written to the archive and a post-processor (the pfnDecodeProc procedure) for data read from the archive. These filters implement the Pluggable Encryption and Compression mentioned above, so you can assign one or more filters to each VFS file you use. If you are a little confused, look at Figure 1, which shows the filter scheme.

You see, encoding / decoding procedures manipulate data flow in both directions: from archive to memory and from memory to archive.
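To make the two directions concrete, here is a minimal sketch of a filter chain. It does not use the VFS_Filter struct from the article: SketchFilter, XorWithIndex, ReverseBytes, EncodeChain, DecodeChain and MakeDemoChain are hypothetical names, and the filters work on std::vector instead of raw buffers. The sketch also assumes that decoders run in the reverse order of the encoders, which the article does not state explicitly but which is needed to undo a chain.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical simplified filter: a pair of functions over byte vectors.
struct SketchFilter
{
    std::function<Bytes( const Bytes& )> encode; // memory -> archive
    std::function<Bytes( const Bytes& )> decode; // archive -> memory
};

// XOR every byte with the low 8 bits of its position (its own inverse).
inline Bytes XorWithIndex( const Bytes& in )
{
    Bytes out( in );
    for( std::size_t i = 0; i < out.size(); ++i )
        out[ i ] = static_cast<std::uint8_t>( out[ i ] ^ ( i & 0xFF ) );
    return out;
}

// Reverse the byte order (also its own inverse).
inline Bytes ReverseBytes( const Bytes& in )
{
    return Bytes( in.rbegin(), in.rend() );
}

// Writing to the archive: apply the encoders first to last.
inline Bytes EncodeChain( const std::vector<SketchFilter>& filters, Bytes data )
{
    for( const SketchFilter& f : filters )
    {
        assert( f.encode && "every filter needs an encode procedure" );
        data = f.encode( data );
    }
    return data;
}

// Reading from the archive: apply the decoders last to first (assumption).
inline Bytes DecodeChain( const std::vector<SketchFilter>& filters, Bytes data )
{
    for( auto it = filters.rbegin(); it != filters.rend(); ++it )
    {
        assert( it->decode && "every filter needs a decode procedure" );
        data = it->decode( data );
    }
    return data;
}

// A two-filter demo chain: XOR first, then reverse.
inline std::vector<SketchFilter> MakeDemoChain()
{
    return { { XorWithIndex, XorWithIndex }, { ReverseBytes, ReverseBytes } };
}
```

Passing a buffer through EncodeChain() and then through DecodeChain() with the same chain gives back the original bytes. Running the decoders in the same order as the encoders would not, because these two filters do not commute.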

For a better understanding of filters, let's write a simple filter (in any case, keep in mind that we will not be able to test the filter, because we will implement VFS later). Our filter will add 1 to each byte.

#include <cassert>

// Forward declarations, so that the filter description below can refer to them.
BOOL ONEADD_EncodeProc( LPCBYTE pIn, DWORD dwInCount, LPBYTE* ppOut, DWORD* pOutCount );
BOOL ONEADD_DecodeProc( LPCBYTE pIn, DWORD dwInCount, LPBYTE* ppOut, DWORD* pOutCount );

VFS_Filter ONEADD_Filter =
{
    "ONEADD",
    "This Filter adds 1 to each Byte of the Data. It doesn't really make sense, "
    " but anyway, this is just a test, you know",
    ONEADD_EncodeProc,
    ONEADD_DecodeProc
};

BOOL ONEADD_EncodeProc( LPCBYTE pIn, DWORD dwInCount, LPBYTE* ppOut, DWORD* pOutCount )
{
    assert( ppOut );
    assert( pOutCount );

    // Allocate the Memory.
    *ppOut = new BYTE[ dwInCount ];

    // Perform a For-Loop through each Byte.
    for( DWORD dwIndex = 0; dwIndex < dwInCount; dwIndex++ )
    {
        ( *ppOut )[ dwIndex ] = ( BYTE )( pIn[ dwIndex ] + 1 );
    }

    // Set the Output Count.
    *pOutCount = dwInCount;

    // Report Success.
    return TRUE;
}

BOOL ONEADD_DecodeProc( LPCBYTE pIn, DWORD dwInCount, LPBYTE* ppOut, DWORD* pOutCount )
{
    assert( ppOut );
    assert( pOutCount );

    // Allocate the Memory.
    *ppOut = new BYTE[ dwInCount ];

    // Perform a For-Loop through each Byte.
    for( DWORD dwIndex = 0; dwIndex < dwInCount; dwIndex++ )
    {
        ( *ppOut )[ dwIndex ] = ( BYTE )( pIn[ dwIndex ] - 1 );
    }

    // Set the Output Count.
    *pOutCount = dwInCount;

    // Report Success.
    return TRUE;
}

The only point of this filter is that it makes the VFS file a little harder to open.
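Although the VFS itself does not exist yet, the two procedures can be exercised on their own. The sketch below is one way to do that outside Windows. The typedefs are portable stand-ins for the Windows types, and ShiftBytes and RunFilterProc are hypothetical helpers that are not part of the article's interface. It also assumes that the caller owns the buffer returned through ppOut and releases it with delete[].

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Portable stand-ins for the Windows types (assumption: the article's
// header takes them from <windows.h>).
typedef int           BOOL;
typedef std::uint8_t  BYTE;
typedef std::uint32_t DWORD;
typedef BYTE*         LPBYTE;
typedef const BYTE*   LPCBYTE;

typedef BOOL ( *VFS_FilterProc )( LPCBYTE pIn, DWORD dwInCount, LPBYTE* ppOut, DWORD* pOutCount );

// Hypothetical shared worker: writes pIn[i] + delta (modulo 256) into a
// freshly allocated buffer.
static BOOL ShiftBytes( LPCBYTE pIn, DWORD dwInCount, LPBYTE* ppOut, DWORD* pOutCount, int delta )
{
    assert( ppOut );
    assert( pOutCount );
    *ppOut = new BYTE[ dwInCount ];
    for( DWORD i = 0; i < dwInCount; ++i )
        ( *ppOut )[ i ] = static_cast<BYTE>( pIn[ i ] + delta );
    *pOutCount = dwInCount;
    return 1; // TRUE
}

BOOL ONEADD_EncodeProc( LPCBYTE pIn, DWORD dwInCount, LPBYTE* ppOut, DWORD* pOutCount )
{
    return ShiftBytes( pIn, dwInCount, ppOut, pOutCount, +1 );
}

BOOL ONEADD_DecodeProc( LPCBYTE pIn, DWORD dwInCount, LPBYTE* ppOut, DWORD* pOutCount )
{
    return ShiftBytes( pIn, dwInCount, ppOut, pOutCount, -1 );
}

// Hypothetical helper: runs one filter procedure over a std::vector and
// frees the buffer that the procedure allocated with new[].
// Returns an empty vector if the procedure reports failure.
std::vector<BYTE> RunFilterProc( VFS_FilterProc pfnProc, const std::vector<BYTE>& in )
{
    LPBYTE pOut = nullptr;
    DWORD dwOutCount = 0;
    if( !pfnProc( in.data(), static_cast<DWORD>( in.size() ), &pOut, &dwOutCount ) )
        return std::vector<BYTE>();
    std::vector<BYTE> out( pOut, pOut + dwOutCount );
    delete[] pOut; // assumption: the caller owns the output buffer
    return out;
}
```

Encoding the bytes 0, 1, 254, 255 yields 1, 2, 255, 0 (the last byte wraps around), and decoding that result restores the input.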

Root Path Functions

We mentioned root paths not so long ago, remember? If not, no problem. I said that we want to be able to use several root paths, that is, several search paths, such as the program's installation directory, an optical disc, and a network drive. The following functions will be used to accomplish this:

void VFS_AddRootPath( LPCTSTR pszRootPath );
void VFS_RemoveRootPath( LPCTSTR pszRootPath );
void VFS_RemoveRootPath( DWORD dwIndex );
DWORD VFS_GetNumRootPaths();
LPCTSTR VFS_GetRootPath( DWORD dwIndex );

Everything is easy enough, right?
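As a rough illustration of how several root paths might act as search paths, here is a small sketch. RootPathList, Resolve and kRootSeparator are hypothetical names, not the article's implementation. It assumes that roots are tried in the order they were added and that the first match wins. The existence check is passed in as a callback so the sketch does not touch a real disk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Mirrors VFS_PATH_SEPARATOR from the article.
const char kRootSeparator = '\\';

// Hypothetical in-memory stand-in for the root path functions.
class RootPathList
{
public:
    // Like VFS_AddRootPath(); duplicates are ignored (assumption).
    void Add( const std::string& root )
    {
        if( std::find( m_roots.begin(), m_roots.end(), root ) == m_roots.end() )
            m_roots.push_back( root );
    }

    // Like VFS_RemoveRootPath( LPCTSTR ).
    void Remove( const std::string& root )
    {
        m_roots.erase( std::remove( m_roots.begin(), m_roots.end(), root ), m_roots.end() );
    }

    // Like VFS_GetNumRootPaths() and VFS_GetRootPath().
    std::size_t Count() const { return m_roots.size(); }
    const std::string& Get( std::size_t index ) const
    {
        assert( index < m_roots.size() );
        return m_roots[ index ];
    }

    // Tries each root in insertion order and returns the first full path
    // for which exists() is true, or an empty string if none matches.
    std::string Resolve( const std::string& relative,
                         const std::function<bool( const std::string& )>& exists ) const
    {
        for( const std::string& root : m_roots )
        {
            const std::string candidate = root + kRootSeparator + relative;
            if( exists( candidate ) )
                return candidate;
        }
        return std::string();
    }

private:
    std::vector<std::string> m_roots;
};
```

With the roots C:\Game and D:\Disc registered, a file that exists only on the disc resolves to its D:\Disc path. Once that root is removed, the lookup returns an empty string.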

Some simple but necessary things

The following 4 functions are quite simple:

void VFS_Flush();

This function closes all open archive (VFS) files that are no longer referenced. You may wonder why an archive that is no longer referenced is not closed automatically. If we did that, we would have to re-parse the archive every time a file inside it is opened and closed. Look at this code for further explanation:

// Reference Count is 1. -> Load + Parse!!!
DWORD dwHandle = VFS_File_Open( "Bla\\Bla.Txt", VFS_READ );

// Reference Count is 0. -> Close!!!
VFS_File_Close( dwHandle );

// Reference Count is 1. -> Load + Parse!!!
dwHandle = VFS_File_Open( "Bla\\Bla.Txt", VFS_READ );

// Reference Count is 0. -> Close!!!
VFS_File_Close( dwHandle );

You see, we would have to load and parse the archive twice. In a game, a good place to call VFS_Flush() is right after all the data for a level has been loaded. Now, here are the last 3 main functions:
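The idea of keeping an unreferenced archive loaded until a flush can be sketched with a small reference-counting cache. ArchiveCache and its members are hypothetical names, and a counter stands in for the expensive load and parse step. This is only one possible way to organize the bookkeeping, not the article's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical bookkeeping for loaded archives.
class ArchiveCache
{
public:
    // Called when a file inside `archive` is opened.
    void Acquire( const std::string& archive )
    {
        auto it = m_entries.find( archive );
        if( it == m_entries.end() )
        {
            ++m_parseCount; // stands in for the expensive load + parse
            it = m_entries.emplace( archive, 0 ).first;
        }
        ++it->second;
    }

    // Called when a file is closed; the archive stays loaded at count 0.
    void Release( const std::string& archive )
    {
        auto it = m_entries.find( archive );
        assert( it != m_entries.end() && it->second > 0 );
        --it->second;
    }

    // Plays the role of VFS_Flush(): unload every archive nobody references.
    void Flush()
    {
        for( auto it = m_entries.begin(); it != m_entries.end(); )
        {
            if( it->second == 0 )
                it = m_entries.erase( it );
            else
                ++it;
        }
    }

    int ParseCount() const { return m_parseCount; }
    std::size_t LoadedCount() const { return m_entries.size(); }

private:
    std::map<std::string, int> m_entries; // archive name -> reference count
    int m_parseCount = 0;
};
```

Opening and closing a file in the same archive twice parses the archive only once with this scheme. It is parsed again only if Flush() was called in between.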

struct VFS_EntityInfo
{
    BOOL bIsDir;     // Is the Entity a Directory.
    BOOL bArchived;  // True if the Entity is located in an Archive.
    string strDir;   // like Models/Sarge/Textures
    string strPath;  // like Models/Sarge/Textures/Texture1.Jpg
    string strName;  // like Texture1.Jpg
    DWORD dwSize;    // The Number of Files and Subdirectories for a Directory.
};

BOOL VFS_Exists( LPCTSTR pszPath );
void VFS_GetEntityInfo( LPCTSTR pszPath, VFS_EntityInfo* pInfo );
DWORD VFS_GetVersion();

The first function checks whether an entity exists at the path pszPath. This is a fairly simple thing, but the C/C++ standard library does not contain such a function (I know there are functions like stat(), but I just want something like exists()). The second function is something like stat(): it returns information about the entity.

The last function has nothing to do with entity information; it simply returns the current version of the VFS. Nothing special (seriously, it just returns the VFS_VERSION constant ;-)
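The three string fields of VFS_EntityInfo are related in a simple way, which the following sketch shows. SketchEntityName and SplitEntityPath are hypothetical names. The sketch uses '/' as the default separator because the struct comments do, even though VFS_PATH_SEPARATOR is defined as a backslash, so the separator is a parameter.

```cpp
#include <cassert>
#include <string>

// Hypothetical subset of VFS_EntityInfo: only the three name fields.
struct SketchEntityName
{
    std::string strDir;  // like Models/Sarge/Textures
    std::string strPath; // like Models/Sarge/Textures/Texture1.Jpg
    std::string strName; // like Texture1.Jpg
};

// Splits a full path at its last separator into directory and name.
SketchEntityName SplitEntityPath( const std::string& path, char separator = '/' )
{
    assert( !path.empty() );

    SketchEntityName info;
    info.strPath = path;

    const std::string::size_type pos = path.rfind( separator );
    if( pos == std::string::npos )
    {
        // No separator: the entity sits directly in the root.
        info.strName = path;
    }
    else
    {
        info.strDir = path.substr( 0, pos );
        info.strName = path.substr( pos + 1 );
    }
    return info;
}
```

For the path Models/Sarge/Textures/Texture1.Jpg this yields the directory Models/Sarge/Textures and the name Texture1.Jpg, matching the examples in the struct comments.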

File Interface

We looked at simple things. But do not worry, there are still a couple of easy things ahead. In fact, everything described in this part of the article is easy. Unfortunately, if you want something more complicated, you will have to wait for the next part of this article ... ;-)

Well, here they are, the file interface functions:

#define VFS_INVALID_HANDLE ( ( DWORD ) -1 )

// The VFS_File_Open/Create() Flags.
#define VFS_READ 0x01
#define VFS_WRITE 0x02

// The VFS_File_Seek() Flags.
#define VFS_SET 0x00
#define VFS_CURRENT 0x01
#define VFS_END 0x02

// Create / Open / Close a File.
DWORD VFS_File_Create( LPCTSTR pszFile, DWORD dwFlags );
DWORD VFS_File_Open( LPCTSTR pszFile, DWORD dwFlags );
void VFS_File_Close( DWORD dwHandle );

// Read / Write from / to the File.
void VFS_File_Read( DWORD dwHandle, LPBYTE pBuffer, DWORD dwToRead, DWORD* pRead = NULL );
void VFS_File_Write( DWORD dwHandle, LPCBYTE pBuffer, DWORD dwToWrite, DWORD* pWritten = NULL );

// Direct Data Access.
LPCBYTE VFS_File_GetData( DWORD dwHandle );

// Positioning.
void VFS_File_Seek( DWORD dwHandle, LONG dwPos, DWORD dwOrigin = VFS_SET );
LONG VFS_File_Tell( DWORD dwHandle );
DWORD VFS_File_GetSize( DWORD dwHandle );

// Information.
BOOL VFS_File_Exists( LPCTSTR pszFile );
void VFS_File_GetInfo( LPCTSTR pszFile, VFS_EntityInfo* pInfo );
void VFS_File_GetInfo( DWORD dwHandle, VFS_EntityInfo* pInfo );

There are only a few things worth noting. First, the dwFlags parameter of VFS_File_Create() and VFS_File_Open() can be VFS_READ, VFS_WRITE, or both, meaning read, write, or read/write access. Second, these two functions return a handle, which almost all the other functions use as a kind of pointer. We use handles rather than pointers because they provide another level of abstraction. I would also like to mention that our functions load the entire file into memory. This is necessary because of the filters (they need the data in memory to process it). You can access this memory directly with VFS_File_GetData(). The rest should be familiar from the standard I/O library.
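To illustrate handles and the "whole file in memory" approach, here is a minimal sketch of a handle table over in-memory buffers. MemoryFileTable, Handle and SeekOrigin are hypothetical names. The real functions take a path and flags, while this sketch takes the file contents directly. Clamping out-of-range seek positions to the file bounds is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <vector>

using Handle = std::uint32_t;

// Mirrors VFS_SET / VFS_CURRENT / VFS_END.
enum SeekOrigin { kSet = 0, kCurrent = 1, kEnd = 2 };

// Hypothetical handle table: each handle maps to a fully loaded file.
class MemoryFileTable
{
public:
    // Stores the (already decoded) contents and hands out a new handle.
    Handle Open( const std::vector<std::uint8_t>& contents )
    {
        const Handle h = m_next++;
        m_files[ h ] = OpenFile{ contents, 0 };
        return h;
    }

    void Close( Handle h ) { m_files.erase( h ); }
    bool IsOpen( Handle h ) const { return m_files.count( h ) != 0; }

    // Copies up to toRead bytes and returns how many were actually read.
    std::size_t Read( Handle h, std::uint8_t* buffer, std::size_t toRead )
    {
        OpenFile& f = Get( h );
        const std::size_t left = f.data.size() - f.pos;
        const std::size_t n = toRead < left ? toRead : left;
        if( n > 0 )
            std::memcpy( buffer, f.data.data() + f.pos, n );
        f.pos += n;
        return n;
    }

    // Moves the position; results outside the file are clamped (assumption).
    void Seek( Handle h, long offset, SeekOrigin origin = kSet )
    {
        OpenFile& f = Get( h );
        const long size = static_cast<long>( f.data.size() );
        long base = 0;
        if( origin == kCurrent ) base = static_cast<long>( f.pos );
        if( origin == kEnd )     base = size;
        long target = base + offset;
        if( target < 0 )    target = 0;
        if( target > size ) target = size;
        f.pos = static_cast<std::size_t>( target );
    }

    long Tell( Handle h ) { return static_cast<long>( Get( h ).pos ); }

private:
    struct OpenFile
    {
        std::vector<std::uint8_t> data; // the whole file
        std::size_t pos;                // current read position
    };

    OpenFile& Get( Handle h )
    {
        auto it = m_files.find( h );
        assert( it != m_files.end() );
        return it->second;
    }

    std::map<Handle, OpenFile> m_files;
    Handle m_next = 0;
};
```

Because callers only ever see the integer handle, the table is free to change how it stores files without breaking any code that uses it. That is the extra level of abstraction mentioned above.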

Archive Interface

This is probably the part you have been waiting for, for quite a few lines, or rather pages (and speaking of waiting: what I did not expect is that this is already page 7 or so. WOW!).

In any case, let's continue:

// Create / Open / Close an Archive.
DWORD VFS_Archive_Create( LPCTSTR pszArchive, const VFS_FilterNameList& Filters, DWORD dwFlags );
DWORD VFS_Archive_CreateFromDirectory( LPCTSTR pszArchive, LPCTSTR pszSrcDir,
                                       const VFS_FilterNameList& Filters, DWORD dwFlags );
DWORD VFS_Archive_Open( LPCTSTR pszArchive, DWORD dwFlags );
void VFS_Archive_Close( DWORD dwHandle );

// Set the Filters used by this Archive.
void VFS_Archive_SetUsedFilters( DWORD dwHandle, const VFS_FilterNameList& Filters );
void VFS_Archive_GetUsedFilters( DWORD dwHandle, VFS_FilterNameList& Filters );

// Add / Remove Files to / from the Archive.
void VFS_Archive_AddFile( DWORD dwHandle, LPCTSTR pszFile );
void VFS_Archive_RemoveFile( DWORD dwHandle, LPCTSTR pszFile );

// Extract the Archive.
void VFS_Archive_Extract( DWORD dwHandle, LPCTSTR pszTarget );

// Information.
void VFS_Archive_GetInfo( DWORD dwHandle, VFS_EntityInfo* pInfo );
void VFS_Archive_GetInfo( LPCTSTR pszArchive, VFS_EntityInfo* pInfo );

A very simple interface, right? Just the usual stuff for an archive file. And now you finally see where the filters we discussed earlier are applied: you set and query them with VFS_Archive_SetUsedFilters() and VFS_Archive_GetUsedFilters().

Folder Interface

This is the last VFS interface. It contains 3 functions that, I think, need no explanation.

// Information.
BOOL VFS_Dir_Exists( LPCTSTR pszDir );
BOOL VFS_Dir_GetInfo( LPCTSTR pszDir, VFS_EntityInfo* pInfo );

// Get the Contents of a Directory.
vector< VFS_EntityInfo > VFS_Dir_GetContents( LPCTSTR pszDir, BOOL bRecursive = FALSE );

Functions 1 and 2 are fairly simple (they work like their file counterparts). Function 3 works like the DOS dir command. ;-)
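The effect of the bRecursive flag can be sketched over a flat list of file paths. ListDirectory is a hypothetical helper that is deliberately simpler than VFS_Dir_GetContents(): it returns plain path strings instead of VFS_EntityInfo records, lists only files (not subdirectory entries), and assumes '/' as the separator, as in the struct comments.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the files below `dir`. Without `recursive`, only files located
// directly in `dir` are returned; with it, files in subdirectories too.
std::vector<std::string> ListDirectory( const std::vector<std::string>& allPaths,
                                        const std::string& dir,
                                        bool recursive = false )
{
    assert( !dir.empty() );

    const std::string prefix = dir + "/";
    std::vector<std::string> result;

    for( const std::string& path : allPaths )
    {
        // Skip everything that does not live below dir.
        if( path.compare( 0, prefix.size(), prefix ) != 0 )
            continue;

        const std::string rest = path.substr( prefix.size() );
        if( rest.empty() )
            continue;

        // A remaining separator means the file sits in a subdirectory.
        const bool direct = rest.find( '/' ) == std::string::npos;
        if( direct || recursive )
            result.push_back( path );
    }
    return result;
}
```

Given Models/Sarge/Sarge.Mdl and Models/Sarge/Textures/Texture1.Jpg, listing Models/Sarge returns one file without the flag and two with it.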

A little chat

That's all. We have completed the first part of the article. I can hardly believe it (me neither :) translator's note). But the hardest part is still to come:

We need to WRITE the VFS!!!

Download article_vfs_header.h

Source: https://habr.com/ru/post/64538/
