Steam Files. Part 1 - GCF / NCF

As promised in the previous article, I am starting a series about the parts of the Steam infrastructure that the Anti-Steam community managed to open up through reverse engineering and lengthy brainstorming.

Until recently, GCF files were the standard format for all games produced by VALVE, and NCF files for all others. These files are file system images with several levels of protection. The difference between NCF and GCF is that an NCF contains only the headers, while the files that belong to it are stored in a separate directory (<Steam>/SteamApps/common/<game name>). I will therefore describe GCF and point out the NCF specifics later.

In this article I examine the structure of these files in detail and show how to work with them, using my library as an example (a link to it is at the end of the article). The beginning is rather dry: a description of the structures and the purpose of their fields. The tastiest part comes after that...
All the code shown here is the result of reverse engineering the Steam libraries. Most of the information about the file format comes from open sources, but I added some details of my own and significantly optimized the handling of cache files (even compared with HLLIB, the most popular library at that time).

General file structure

The file is logically divided into two parts: the headers and the content itself. The content is divided into blocks, which in turn are divided into 8 KB sectors. Each sector belongs to a particular file, and the order of the sectors is described in the headers. The header fields are four-byte integers (the exception is the part that holds the list of file and directory names).

Headers consist of the following structures:

FileHeader
BlockAllocationTableHeader
BlockAllocationTable[]
FileAllocationTableHeader
FileAllocationTable[]
ManifestHeader
Manifest[]
FileNames
HashTableKeys[]
HashTableIndices[]
MinimumFootprints[]
UserConfig[]
ManifestMapHeader
ManifestMap[]
ChecksumDataContainer
FileIdChecksumTableHeader
FileIdChecksums[]
Checksums[]
ChecksumSignature
LatestApplicationVersion
DataHeader

The first thing that catches the eye is ChecksumSignature, which is the encrypted hash of the part of the headers responsible for file checksums.
All these headers and the purpose of their fields are discussed below.
As a reminder, the fields of almost all headers are four-byte integers (uint32_t in C++) unless otherwise specified.

FileHeader

As the name suggests, this is the header of the entire file. It contains the following fields:

HeaderVersion
CacheType
FormatVersion
ApplicationID
ApplicationVersion
IsMounted
Dummy0
FileSize
ClusterSize
ClusterCount
Checksum

HeaderVersion - the version of this header. Always 0x00000001.
CacheType - 0x00000001 for GCF and 0x00000002 for NCF.
FormatVersion - the version of the structure of the remaining headers. The latest version is 6; it is the one described below.
ApplicationID - file identifier (AppID).
ApplicationVersion - the version of the file contents. Used to decide whether an update is needed.
IsMounted - contains 0x00000001 if the file is currently mounted by another application. Currently not used, so it is always 0x00000000.
Dummy0 - alignment field containing 0x00000000.
FileSize - the total file size. If the file exceeds 4 GB, this field contains the difference <file size> - 0xFFFFFFFF, and the actual file size is calculated from the block size and the number of blocks.
ClusterSize - the size of a data block in the content. For GCF it contains 0x00002000, for NCF 0x00000000.
ClusterCount - the number of data blocks in the content.
Checksum - header checksum. Calculated by the following function:

UINT32 HeaderChecksum(UINT8 *lpData, int Size)
{
    UINT32 Checksum = 0;
    for (int i = 0; i < Size; i++)
        Checksum += *(lpData++);
    return Checksum;
}

The first parameter is a pointer to the structure, and the second is its size excluding the Checksum field (that is, 4 bytes less).
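As an illustration, here is a minimal sketch of how the header could be mapped to a structure and validated with the byte sum described above. The field order follows the list in this section; the names byte_sum and file_header_is_valid are my own and purely illustrative, and the sketch assumes the compiler does not pad a structure made only of 32-bit fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Field order follows the list above; every field is a 32-bit integer.
struct FileHeader {
    uint32_t HeaderVersion;
    uint32_t CacheType;
    uint32_t FormatVersion;
    uint32_t ApplicationID;
    uint32_t ApplicationVersion;
    uint32_t IsMounted;
    uint32_t Dummy0;
    uint32_t FileSize;
    uint32_t ClusterSize;
    uint32_t ClusterCount;
    uint32_t Checksum;
};
static_assert(sizeof(FileHeader) == 44, "FileHeader must be 11 packed 32-bit fields");

// Byte-wise sum, equivalent to HeaderChecksum above.
inline uint32_t byte_sum(const void *data, std::size_t size) {
    const uint8_t *p = static_cast<const uint8_t *>(data);
    uint32_t sum = 0;
    for (std::size_t i = 0; i < size; ++i)
        sum += p[i];
    return sum;
}

// The stored Checksum covers everything except the Checksum field itself.
inline bool file_header_is_valid(const FileHeader &h) {
    return byte_sum(&h, sizeof(h) - sizeof(h.Checksum)) == h.Checksum;
}
```

Because the sum is taken over individual bytes, the result does not depend on the byte order of the host.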

BlockAllocationTableHeader

Contains the description of the block table (not sectors!):

BlockCount
BlocksUsed
LastUsedBlock
Dummy0
Dummy1
Dummy2
Dummy3
Checksum

BlockCount - the total number of blocks in the file.
BlocksUsed - the number of blocks in use. Always less than the total number of blocks. When it approaches the total, the total is increased, which causes all subsequent headers to be rebuilt and the first data sector to be moved to the end of the file to free up space for the headers.
LastUsedBlock - the index of the last block used.
Dummy0, Dummy1, Dummy2, Dummy3 - alignment fields, contain 0x00000000.
Checksum - header checksum. Contains the sum of all previous fields.
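Unlike the FileHeader checksum, this one is described as a sum of fields. The sketch below assumes that this means the 32-bit field values added together (wrapping modulo 2^32); that reading, and the helper names, are my assumptions rather than something taken from the library.

```cpp
#include <cassert>
#include <cstdint>

struct BlockAllocationTableHeader {
    uint32_t BlockCount;
    uint32_t BlocksUsed;
    uint32_t LastUsedBlock;
    uint32_t Dummy0;
    uint32_t Dummy1;
    uint32_t Dummy2;
    uint32_t Dummy3;
    uint32_t Checksum;
};

// Sum of every field that precedes Checksum (unsigned overflow wraps).
inline uint32_t bat_header_checksum(const BlockAllocationTableHeader &h) {
    return h.BlockCount + h.BlocksUsed + h.LastUsedBlock +
           h.Dummy0 + h.Dummy1 + h.Dummy2 + h.Dummy3;
}

// Checks the two invariants stated in the text: BlocksUsed is less than
// BlockCount, and the stored checksum matches the field sum.
inline bool bat_header_is_consistent(const BlockAllocationTableHeader &h) {
    return h.BlocksUsed < h.BlockCount && bat_header_checksum(h) == h.Checksum;
}
```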

BlockAllocationTable

An array of BlockAllocationTableEntry structures, one per block (BlockAllocationTableHeader.BlockCount entries in total):

uint16_t Flags
uint16_t Dummy0
FileDataOffset
FileDataSize
FirstClusterIndex
NextBlockIndex
PreviousBlockIndex
ManifestIndex

Flags - bit flags of the block. Possible masks:

0x8000 - the block is used;
0x4000 - the local copy of the file takes precedence;
0x0004 - the block is encrypted;
0x0002 - the block is encrypted and compressed;
0x0001 - the block contains raw data (RAW).

Dummy0 - alignment field, contains 0x0000.
FileDataOffset - the offset of this block's data within the file it belongs to.
FileDataSize - the size of the file fragment stored in this block.
FirstClusterIndex - the index of the first cluster in the cluster table.
NextBlockIndex - the index of the next block. Contains BlockAllocationTableHeader.BlockCount if this is the last block in the chain for this file.
PreviousBlockIndex - the index of the previous block in the chain. For the first block it contains BlockAllocationTableHeader.BlockCount.
ManifestIndex - the manifest index for this block.
The index into this table is the block number taken from the ManifestMap list.
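The chain of blocks that belongs to one file can be followed roughly as in the sketch below. It assumes the table has already been loaded into a std::vector; collect_blocks and kBlockUsed are illustrative names, and the loop guard against cycles is my own precaution, not something the format prescribes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct BlockAllocationTableEntry {
    uint16_t Flags;
    uint16_t Dummy0;
    uint32_t FileDataOffset;
    uint32_t FileDataSize;
    uint32_t FirstClusterIndex;
    uint32_t NextBlockIndex;
    uint32_t PreviousBlockIndex;
    uint32_t ManifestIndex;
};

constexpr uint16_t kBlockUsed = 0x8000;  // "block is used" flag

// Collects the block indices of one file, starting from its first block
// (the value that ManifestMap holds for the file). block_count is
// BlockAllocationTableHeader.BlockCount, which doubles as the end marker.
inline std::vector<uint32_t> collect_blocks(
        const std::vector<BlockAllocationTableEntry> &table,
        uint32_t first_block, uint32_t block_count) {
    std::vector<uint32_t> chain;
    uint32_t idx = first_block;
    // chain.size() < table.size() protects against a corrupted, cyclic chain
    while (idx < block_count && idx < table.size() && chain.size() < table.size()) {
        if ((table[idx].Flags & kBlockUsed) == 0)
            break;  // an unused block cannot be part of a file
        chain.push_back(idx);
        idx = table[idx].NextBlockIndex;
    }
    return chain;
}
```

A file without stored data is simply the case where first_block already equals block_count, so the result is empty.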

FileAllocationTableHeader

Header of the sector table:

ClusterCount
FirstUnusedEntry
IsLongTerminator
Checksum

ClusterCount - the number of sectors. Equal to FileHeader.ClusterCount.
FirstUnusedEntry - the index of the first unused sector.
IsLongTerminator - selects the value that marks the end of a sector chain. If it contains 0x00000000, the terminator is 0x0000FFFF; otherwise it is 0xFFFFFFFF.
Checksum - header checksum. As with BlockAllocationTableHeader, it is the sum of the previous header fields.

FileAllocationTable

A sector table containing FileAllocationTableHeader.ClusterCount entries of type uint32_t. Each cell contains the index of the next cluster in the chain, or the terminator value (see the description of FileAllocationTableHeader) if it is the last one in the chain.
The index into this list is the sector number.
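To make the chain logic concrete, here is a small sketch that follows a sector chain and converts a sector number into an absolute file offset. The function names are illustrative; the offset formula assumes that sectors are laid out contiguously starting at DataHeader.FirstClusterOffset (a field described further below), which is my reading of the format rather than a quoted rule.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Terminator value selected by FileAllocationTableHeader.IsLongTerminator.
inline uint32_t fat_terminator(uint32_t is_long_terminator) {
    return is_long_terminator == 0 ? 0x0000FFFFu : 0xFFFFFFFFu;
}

// Follows the sector chain that starts at first_cluster
// (BlockAllocationTableEntry.FirstClusterIndex) until the terminator.
inline std::vector<uint32_t> collect_clusters(const std::vector<uint32_t> &fat,
                                              uint32_t first_cluster,
                                              uint32_t terminator) {
    std::vector<uint32_t> chain;
    uint32_t idx = first_cluster;
    // The size check guards against a corrupted table that contains a cycle.
    while (idx != terminator && idx < fat.size() && chain.size() < fat.size()) {
        chain.push_back(idx);
        idx = fat[idx];
    }
    return chain;
}

// Absolute offset of a sector, assuming contiguous sectors of equal size.
inline uint64_t cluster_offset(uint32_t first_cluster_offset,
                               uint32_t cluster_size, uint32_t index) {
    return first_cluster_offset + static_cast<uint64_t>(index) * cluster_size;
}
```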

ManifestHeader

Describes the manifest table:

HeaderVersion
ApplicationID
ApplicationVersion
NodeCount
FileCount
CompressionBlockSize
BinarySize
NameSize
HashTableKeyCount
NumOfMinimumFootprintFiles
NumOfUserConfigFiles
Bitmask
Fingerprint
Checksum

HeaderVersion - version of the header. Contains 0x00000004.
ApplicationID - file identifier. Equal to FileHeader.ApplicationID.
ApplicationVersion - the version of the file contents. Equal to FileHeader.ApplicationVersion.
NodeCount - the number of elements in the manifest.
FileCount - the number of files declared in the manifest (and contained in the cache).
CompressionBlockSize - the maximum size of a compressed block (measured as the size of its uncompressed data).
BinarySize - the size of the manifest (including this structure).
NameSize - the size of the data block containing the element names (in bytes).
HashTableKeyCount - the number of values in the hash table.
NumOfMinimumFootprintFiles - the number of files required to run the application (they must be unpacked to disk).
NumOfUserConfigFiles - the number of user configuration files. If such a file already exists on disk, it is not overwritten when the game starts and takes priority over the cached copy.
Bitmask - contains bit flags. In publicly released files it is always 0x00000000.
Fingerprint - a unique number that is randomly generated each time the manifest is updated.
Checksum - checksum, calculated using the Adler32 algorithm. The calculation is shown after the description of the headers.

Manifest

A tree describing all the files in the cache. The size of the table equals ManifestHeader.NodeCount. Every element of the table has the following structure:

NameOffset
CountOrSize
FileId
Attributes
ParentIndex
NextIndex
ChildIndex

NameOffset - the offset of the element's name in the corresponding data block.
CountOrSize - the size of the element. For directories it is the number of child elements, and for files it is the size of the file (or of the part of the file described by this manifest element).
FileId - file identifier. Used to link several manifest elements of a large file and to look up the list of checksums.
Attributes - bit field of file attributes. Confirmed values:

0x00004000 - the node is a file;
0x00000100 - encrypted file;
0x00000001 - configuration file. The local copy is not overwritten.

ParentIndex - the index of the parent element. For the root element it is 0xFFFFFFFF.
NextIndex - the index of the next element at the current tree level.
ChildIndex - the index of the first child.
If there is no next element or no child, NextIndex and ChildIndex respectively contain 0x00000000.
The tree must contain at least one element, the root.
The index into the list of tree elements is the element number (it is used later).

FileNames

A block of char data, ManifestHeader.NameSize bytes in size. It contains null-terminated strings, which are the names of the elements described in the manifest tree. The first string must be the name of the root element, an empty string. The offset of an element's name is given by Manifest[].NameOffset.
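Together, the manifest tree and the name block are enough to list a directory. The sketch below assumes both tables are already loaded into vectors and that the name block is properly null-terminated; node_name and list_children are illustrative names of my own.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct ManifestNode {
    uint32_t NameOffset;
    uint32_t CountOrSize;
    uint32_t FileId;
    uint32_t Attributes;
    uint32_t ParentIndex;
    uint32_t NextIndex;
    uint32_t ChildIndex;
};

// Name of a node: a null-terminated string inside the FileNames block.
inline std::string node_name(const std::vector<char> &names, const ManifestNode &n) {
    assert(n.NameOffset < names.size());
    return std::string(&names[n.NameOffset]);
}

// Names of the direct children of dir: start at ChildIndex and follow
// NextIndex until it is 0 (the "no element" value described above).
inline std::vector<std::string> list_children(const std::vector<ManifestNode> &nodes,
                                              const std::vector<char> &names,
                                              uint32_t dir) {
    std::vector<std::string> result;
    uint32_t idx = nodes[dir].ChildIndex;
    // The size check stops the walk if a damaged tree contains a cycle.
    while (idx != 0 && idx < nodes.size() && result.size() < nodes.size()) {
        result.push_back(node_name(names, nodes[idx]));
        idx = nodes[idx].NextIndex;
    }
    return result;
}
```

Index 0 can serve as the "none" marker because element 0 is the root, which is never anybody's child or sibling.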

HashTableKeys

A hash table of element names. It contains indices into HashTableIndices, placed in slots derived from the Jenkins lookup2 hash of the lowercase name. The details are covered in the description of element search.

HashTableIndices

An index table of the elements referenced by the values of the previous table. Number of entries: ManifestHeader.NodeCount.

MinimumFootprints

A list of Manifest element numbers that must be unpacked when the application starts.

UserConfigs

A list of Manifest element numbers that are user configuration files.

ManifestMapHeader

Header of the manifest map:

HeaderVersion
Dummy0

HeaderVersion - version of the header. Equal to 0x00000001.
Dummy0 - alignment field. Contains 0x00000000.

ManifestMap

A table of references to the first block (a BlockAllocationTable entry) of each element. The index is the element number in the manifest tree. For directories and for files that are not stored in the cache (files of zero size, or any file in an NCF), it contains a value equal to BlockAllocationTableHeader.BlockCount.

ChecksumDataContainer

Header of the container that stores the checksums:

HeaderVersion
ChecksumSize

HeaderVersion - version of the header. Equal to 0x00000001.
ChecksumSize - container size. Counted from the start of the next structure up to and including LatestApplicationVersion.

FileIdChecksumTableHeader

Header of the table of checksum indices:

FormatCode
Dummy0
FileIdCount
ChecksumCount

FormatCode - a constant. Equal to 0x14893721.
Dummy0 - alignment field. Contains the value 0x00000001.
FileIdCount - the number of entries in the "element - first checksum" table.
ChecksumCount - the number of items in the list of checksums.

FileIdChecksums

A table that associates files with their lists of checksums:

ChecksumCount
FirstChecksumIndex

ChecksumCount - the number of checksums in the list for this element.
FirstChecksumIndex - the index of the first checksum in the list.
The index into this table is the value of Manifest[].FileId.

Checksums

The list of checksums. It consists of consecutive sublists; the first element of each sublist is referenced by FileIdChecksums[].FirstChecksumIndex.
The values are calculated by the following algorithm:

UINT32 Checksum(UINT8 *lpData, UINT32 uiSize)
{
    return (adler32(0, lpData, uiSize) ^ crc32(0, lpData, uiSize));
}
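For readers without zlib at hand, the same calculation can be sketched in a self-contained way. The two helpers below are plain reimplementations that mirror the behaviour of zlib's adler32(adler, buf, len) and crc32(crc, buf, len); the names adler32_update, crc32_update and block_checksum are mine. Note that the function above passes 0 as the starting Adler value, whereas zlib's conventional starting value is 1, so the Adler part differs from a textbook Adler-32 of the same data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Adler-32 continued from a given state, like zlib's adler32(adler, buf, len).
inline uint32_t adler32_update(uint32_t adler, const uint8_t *data, std::size_t size) {
    uint32_t a = adler & 0xFFFFu;
    uint32_t b = (adler >> 16) & 0xFFFFu;
    for (std::size_t i = 0; i < size; ++i) {
        a = (a + data[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

// Bitwise CRC-32 (reflected polynomial 0xEDB88320), like zlib's crc32(crc, buf, len).
inline uint32_t crc32_update(uint32_t crc, const uint8_t *data, std::size_t size) {
    crc = ~crc;
    for (std::size_t i = 0; i < size; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

// Same combination as Checksum() above: both hashes start from 0 and are XORed.
inline uint32_t block_checksum(const uint8_t *data, std::size_t size) {
    return adler32_update(0, data, size) ^ crc32_update(0, data, size);
}
```

The bitwise CRC loop is slow compared with a table-driven version; it is used here only to keep the sketch short.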

ChecksumSignature

The signature of the checksum block. Contains the hash of the checksum block, calculated using the SHA-1 algorithm and signed using the RSASSA-PKCS1-v1_5 scheme.

LatestApplicationVersion

This field contains the version of the checksum block. It is updated to the latest version after each content update.

DataHeader

A header describing the physical placement of data in the cache:

ClusterCount
ClusterSize
FirstClusterOffset
ClustersUsed
Checksum

ClusterCount - the number of sectors. Equal to the FileHeader.ClusterCount field.
ClusterSize - sector size. Equal to the FileHeader.ClusterSize field.
FirstClusterOffset - the offset of the first sector relative to the beginning of the file.
ClustersUsed - the number of used sectors.
Checksum - header checksum. Equal to the sum of the previous header fields.
After a content update the number of used sectors may decrease. In that case the freed sectors are moved to the end of the file to reserve space for future updates.

Algorithms

Finally, we come to the most interesting part: examples of code that works with these structures, with detailed explanations. The full source code can be found in my repository.

Calculating file size

In most cases the file size equals the value of the Manifest[].CountOrSize field. A 32-bit field, however, cannot hold the size of a file larger than 4 GB. VALVE's programmers worked around this as follows: for files larger than 2 GB, the high-order bit of this field is set to 1 and one (or several) more elements with the same values in the other fields are added to the list, forming a kind of chain. Summing the Manifest[].CountOrSize values along this chain gives the final file size.

File Size Calculation Code

UINT64 CGCFFile::GetFileSize(UINT32 Item)
{
    // The low 31 bits hold the size of the first part of the file
    UINT64 res = lpManifest[Item].CountOrSize & 0x7FFFFFFF;
    // A set high bit means the size continues in another manifest node
    if ((lpManifest[Item].CountOrSize & 0x80000000) != 0)
    {
        for (UINT32 i = 0; i < pManifestHeader->NodeCount; i++)
        {
            ManifestNode *MN = &lpManifest[i];
            // The continuation node is a detached file node with the same FileId
            if (((MN->Attributes & 0x00004000) != 0) && (MN->ParentIndex == 0xFFFFFFFF) && (MN->NextIndex == 0xFFFFFFFF) && (MN->ChildIndex == 0xFFFFFFFF) && (MN->FileId == lpManifest[Item].FileId))
            {
                // Widen before shifting so the upper bits are not lost
                res += (UINT64)MN->CountOrSize << 31;
                break;
            }
        }
    }
    return res;
}

Here I took a small shortcut, assuming that the cache will not contain files larger than 4 GB...

Search item by name

For example, suppose we need to find the file "hl2/maps/background_01.bsp". All names are stored in the tree, so the path has to be split into elements at the separator (in this case "/"). We then look for an element named "hl2" among the children of the root element, then an element named "maps" among its children, and only then an element named "background_01.bsp". This approach is the most obvious but very slow: it involves byte-by-byte string comparisons and a walk through the tree. That is a lot of overhead.
To speed this up, the headers contain hash tables.

Search item by name using hash

C++

UINT32 CGCFFile::GetItem(char *Item)
{
    // Only the file name (the part after the last '\') is hashed
    int DelimiterPos = -1;
    for (UINT32 i = 0; i < strlen(Item); i++)
        if (Item[i] == '\\')
            DelimiterPos = i;
    char *FileName = &Item[++DelimiterPos];
    UINT32 Hash = jenkinsLookupHash2((UINT8*)FileName, strlen(FileName), 1),
           HashIdx = Hash % pManifestHeader->HashTableKeyCount,
           HashFileIdx = lpHashTableKeys[HashIdx];
    // Nothing found: retry with the lowercase file name
    if (HashFileIdx == CACHE_INVALID_ITEM)
        if (strcmp(LowerCase(Item), Item) != 0)
        {
            Hash = jenkinsLookupHash2((UINT8*)LowerCase(FileName), strlen(FileName), 1);
            HashIdx = Hash % pManifestHeader->HashTableKeyCount;
            HashFileIdx = lpHashTableKeys[HashIdx];
        }
    if (HashFileIdx == CACHE_INVALID_ITEM)
        return CACHE_INVALID_ITEM;
    // The stored value is X + HashTableKeyCount; X indexes HashTableIndices
    HashFileIdx -= pManifestHeader->HashTableKeyCount;
    while (true)
    {
        UINT32 Value = this->lpHashTableIndices[HashFileIdx];
        UINT32 FileID = Value & 0x7FFFFFFF;
        if (strcmp(GetItemPath(FileID), Item) == 0)
            return FileID;
        // A set high bit marks the last entry of the chain
        if ((Value & 0x80000000) == 0x80000000)
            break;
        HashFileIdx++;
    }
    return CACHE_INVALID_ITEM;
}

Delphi

function TGCFFile.GetItemByPath(Path: string): integer;
var
  end_block: boolean;
  Hash, HashIdx, HashValue: ulong;
  FileID, HashFileIdx: integer;
  PathEx: AnsiString;
begin
  result:=-1;
{$IFDEF UNICODE}
  PathEx:=Wide2Ansi(ExtractFileName(Path));
{$ELSE}
  PathEx:=ExtractFileName(Path);
{$ENDIF}
  Hash:=jenkinsLookupHash2(@PathEx[1], Length(PathEx), 1);
  HashIdx:=Hash mod fManifestHeader.HashTableKeyCount;
  HashFileIdx:=lpHashTableKeys[HashIdx];
  if HashFileIdx=-1 then
  begin
    if (LowerCase(Path)<>Path) then
    begin
{$IFDEF UNICODE}
      Hash:=jenkinsLookupHash2(@LowerCaseAnsi(PathEx)[1], Length(PathEx), 1);
{$ELSE}
      Hash:=jenkinsLookupHash2(@LowerCase(PathEx)[1], Length(PathEx), 1);
{$ENDIF}
      HashIdx:=Hash mod fManifestHeader.HashTableKeyCount;
      HashFileIdx:=lpHashTableKeys[HashIdx];
      if HashFileIdx=-1 then
        Exit;
    end;
  end;
  dec(HashFileIdx, fManifestHeader.HashTableKeyCount);
  repeat
    HashValue:=lpHashTableIndices[HashFileIdx];
    FileID:=HashValue and $7FFFFFFF;
    end_block:= (HashValue and $80000000 = $80000000);
    if CompareStr(ItemPath[FileID], Path)=0 then
    begin
      result:=FileID;
      Exit;
    end;
    inc(HashFileIdx);
  until end_block;
  if (result=-1) and (LowerCase(Path)<>Path) then
    result:=GetItemByPath(LowerCase(Path));
end;

As the code shows, we take only the file name from the full path and calculate its hash. The remainder of dividing the hash by ManifestHeader.HashTableKeyCount is the number of the entry in the HashTableKeys list, which contains either 0xFFFFFFFF (if there is no such element) or the value X + ManifestHeader.HashTableKeyCount. From this we obtain X, the number of the entry in the HashTableIndices list where the search starts. The low 31 bits of each entry are the number of a manifest element, whose full path is compared with the requested one. If it does not match, we move on to the next entry, and so on until we have checked the entry whose most significant bit is set: it marks the end of the chain.
I realize this sounds confusing, but that is exactly how it works... Blame the VALVE programmers for the confusion.
This method is much faster than a direct search in the tree: I compared the performance when starting a game with my own emulator of the Steam.dll library, which will be discussed later.
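Stripped of string handling, the bucket walk can be sketched as follows. The hash value is taken as a parameter here, so no particular implementation of lookup2 is assumed; find_item, kInvalidItem and the matches callback are illustrative names, and the two tables are assumed to be loaded into vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

constexpr uint32_t kInvalidItem = 0xFFFFFFFFu;

// keys    - HashTableKeys (size is ManifestHeader.HashTableKeyCount)
// indices - HashTableIndices
// matches - returns true if the manifest element with this number has the
//           requested path (the string comparison is left to the caller)
inline uint32_t find_item(const std::vector<uint32_t> &keys,
                          const std::vector<uint32_t> &indices,
                          uint32_t name_hash,
                          const std::function<bool(uint32_t)> &matches) {
    if (keys.empty())
        return kInvalidItem;
    uint32_t slot = keys[name_hash % keys.size()];
    if (slot == kInvalidItem)
        return kInvalidItem;  // empty bucket
    // The stored value is X + HashTableKeyCount; X indexes HashTableIndices.
    std::size_t pos = slot - keys.size();
    while (pos < indices.size()) {
        uint32_t value = indices[pos];
        uint32_t node = value & 0x7FFFFFFFu;
        if (matches(node))
            return node;
        if (value & 0x80000000u)
            break;  // the high bit marks the last entry of the bucket
        ++pos;
    }
    return kInvalidItem;
}
```

Each lookup touches one bucket and compares only the few paths that share a hash slot, which is where the speed-up over the tree walk comes from.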

Getting the full path to the item

This operation is roughly the reverse of the previous one: given the element number, walk up the tree to the root element and build the path to the file.

Getting the file path

C++

char *CGCFFile::GetItemPath(UINT32 Item)
{
    // First pass: total length of all names plus one separator per ancestor
    size_t len = strlen(&lpNames[lpManifest[Item].NameOffset]);
    UINT32 Idx = lpManifest[Item].ParentIndex;
    while (Idx != CACHE_INVALID_ITEM)
    {
        len += strlen(&lpNames[lpManifest[Idx].NameOffset]) + 1;
        Idx = lpManifest[Idx].ParentIndex;
    }
    // The root has an empty name, so no separator is needed before the first name
    if (len > 0)
        len--;
    // The caller owns the returned buffer and must delete[] it
    char *res = new char[len+1];
    memset(res, 0, len+1);
    // Second pass: fill the buffer from the end towards the beginning
    while ((Item != CACHE_INVALID_ITEM) && (Item != 0))
    {
        size_t l = strlen(&lpNames[lpManifest[Item].NameOffset]);
        len -= l;
        memcpy(&res[len], &lpNames[lpManifest[Item].NameOffset], l);
        // No separator in front of the first path element
        if (len > 0)
            res[--len] = '\\';
        Item = lpManifest[Item].ParentIndex;
    }
    return res;
}

Delphi

function TGCFFile.GetItemPath(Item: integer): string;
var
  res: AnsiString;
begin
  res:=pAnsiChar(@fNameTable[lpManifestNodes[Item].NameOffset+1]);
  Item:=lpManifestNodes[Item].ParentIndex;
  while (Item>-1) do
  begin
    res:=pAnsiChar(@fNameTable[lpManifestNodes[Item].NameOffset+1])+'\'+res;
    Item:=lpManifestNodes[Item].ParentIndex;
  end;
  Delete(res, 1, 1);
{$IFDEF UNICODE}
  result:=Ansi2Wide(res);
{$ELSE}
  result:=res;
{$ENDIF}
end;

The Delphi code is considerably shorter because in the C++ version I did not use the std::string class; I did not know about it at the time. With it, the code would be much shorter...
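For comparison, this is roughly what a std::string version could look like. It is a sketch rather than the library's code: PathNode is a hypothetical structure holding only the two manifest fields needed here, item_path is an illustrative name, and the root is assumed to be element 0 with an empty name, as in the C++ listing above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical subset of a manifest element: just the fields used below.
struct PathNode {
    uint32_t NameOffset;
    uint32_t ParentIndex;
};

// Builds "dir\subdir\file" by walking from the element up to the root.
inline std::string item_path(const std::vector<PathNode> &nodes,
                             const std::vector<char> &names, uint32_t item) {
    std::string path;
    std::size_t steps = 0;
    // Stop at the root (element 0) or at an out-of-range index such as
    // 0xFFFFFFFF; the step counter guards against cycles in damaged data.
    while (item != 0 && item < nodes.size() && steps++ < nodes.size()) {
        std::string name(&names[nodes[item].NameOffset]);
        path = path.empty() ? name : name + '\\' + path;
        item = nodes[item].ParentIndex;
    }
    return path;
}
```

Returning std::string by value also removes the question of who frees the buffer, which the char* version leaves to the caller.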

Streams

When writing libraries for archive-like file formats (those that contain other files), I use the "stream-to-stream" principle, which lets you open files inside an archive without unpacking it. For example, older versions of the half-life.gcf cache contained the file pak0.pak, which is itself an archive. So I opened half-life.gcf, opened pak0.pak inside it, and read the required files from that. All of this happens without unpacking anything, even into memory: the functionality is implemented through wrappers that I wrote over file streams (low-level ones, at the Windows API level).

Opening a file in the cache

C++

CStream *CGCFFile::OpenFile(char* FileName, UINT8 Mode)
{
    UINT32 Item = GetItem(FileName);
    if (Item == CACHE_INVALID_ITEM)
        return NULL;
    // Only files can be opened, not directories
    if ((lpManifest[Item].Attributes & CACHE_FLAG_FILE) != CACHE_FLAG_FILE)
        return NULL;
    return OpenFile(Item, Mode);
}

CStream *CGCFFile::OpenFile(UINT32 Item, UINT8 Mode)
{
    StreamData *Data = new StreamData();
    memset(Data, 0, sizeof(StreamData));
    Data->Handle = (handle_t)Item;
    Data->Package = this;
    Data->Size = this->GetItemSize(Item).Size;
    if (IsNCF)
        // NCF: the content is a regular file in the common directory
        Data->FileStream = (CStream*)new CStream(MakeStr(CommonPath, GetItemPath(Item)), Mode==CACHE_OPEN_WRITE);
    else
        // GCF: collect the list of sectors that hold the file data
        BuildClustersTable(Item, &Data->Sectors);
    return new CStream(pStreamMethods, Data);
}

Delphi

function TGCFFile.OpenFile(FileName: string; Access: byte): TStream;
var
  Item: integer;
begin
  result:=nil;
  Item:=ItemByPath[FileName];
  if (Item=-1) then
    Exit;
  if ((lpManifestNodes[Item].Attributes and HL_GCF_FLAG_FILE<>HL_GCF_FLAG_FILE) or (ItemSize[Item].Size=0)) then
    Exit;
  result:=OpenFile(Item, Access);
end;

function TGCFFile.OpenFile(Item: integer; Access: byte): TStream;
var
  res: TStream;
begin
  res:=TStream.CreateStreamOnStream(@StreamMethods);
  res.Data.fHandle:=ulong(Item);
  res.Data.Package:=self;
  res.Data.fSize:=(res.Data.Package as TGCFFile).ItemSize[Item].Size;
  res.Data.fPosition:=0;
  if (IsNCF) then
  begin
    CommonPath:=IncludeTrailingPathDelimiter(CommonPath);
    case Access of
      ACCES_READ:
        begin
          res.Data.FileStream:=TStream.CreateReadFileStream(CommonPath+ItemPath[Item]);
          res.Methods.fSetSiz:=StreamOnStream_SetSizeNULL;
          res.Methods.fWrite:=StreamOnStream_WriteNULL;
        end;
      ACCES_WRITE:
        begin
          ForceDirectories(ExtractFilePath(CommonPath+ItemPath[Item]));
          res.Data.FileStream:=TStream.CreateWriteFileStream(CommonPath+ItemPath[Item]);
        end;
      ACCES_READWRITE:
        res.Data.FileStream:=TStream.CreateReadWriteFileStream(CommonPath+ItemPath[Item]);
    end;
    res.Data.FileStream.Seek(0, spBegin);
  end
  else
    GCF_BuildClustersTable(Item, @res.Data.SectorsTable);
  result:=res;
end;

This makes working with the content much simpler: you can open files and read data from them without any extra steps.

Retrieve file with checksum verification

This procedure makes heavy use of the streams described above: I simply read the file in fragments of a fixed size (the maximum fragment size for checksums is 32 KB), calculate a checksum for each fragment, and compare it with the value from the table in the headers.

Extracting a file with checksum verification

C++

UINT64 CGCFFile::ExtractFile(UINT32 Item, char *Dest, bool IsValidation)
{
    CStream *fileIn = this->OpenFile(Item, CACHE_OPEN_READ),
            *fileOut = NULL;
    if (fileIn == NULL)
        return 0;
    if (!IsValidation)
    {
        // If Dest is a directory, append the item name to it
        if (DirectoryExists(Dest))
            Dest = MakeStr(IncludeTrailingPathDelimiter(Dest), GetItemName(Item));
        fileOut = new CStream(Dest, true);
        if (fileOut->GetHandle() == INVALID_HANDLE_VALUE)
        {
            delete fileOut;
            delete fileIn;
            return 0;
        }
        fileOut->SetSize(GetItemSize(Item).Size);
    }
    UINT8 buf[CACHE_CHECKSUM_LENGTH];
    UINT32 CheckSize = CACHE_CHECKSUM_LENGTH;
    UINT64 res = 0;
    while ((fileIn->Position()<fileIn->GetSize()) && (CheckSize == CACHE_CHECKSUM_LENGTH))
    {
        if (Stop)
            break;
        // One checksum per 32 KB fragment: position / 0x8000 selects the entry
        UINT32 CheckIdx = lpFileIDChecksum[lpManifest[Item].FileId].FirstChecksumIndex + ((fileIn->Position() & 0xffffffffffff8000) >> 15);
        CheckSize = (UINT32)fileIn->Read(buf, CheckSize);
        UINT32 CheckFile = Checksum(buf, CheckSize),
               CheckFS = lpChecksum[CheckIdx];
        if (CheckFile != CheckFS)
            break;
        else if (!IsValidation)
            fileOut->Write(buf, CheckSize);
        res += CheckSize;
    }
    delete fileIn;
    if (!IsValidation)
        delete fileOut;
    return res;
}

Delphi

function TGCFFile.ExtractFile(Item: integer; Dest: string; IsValidation: boolean = false): int64;
var
  StreamF, StreamP: TStream;
  CheckSize, CheckFile, CheckFS, CheckIdx: uint32_t;
  buf: array of byte;
  Size: int64;
begin
  result:=0;
  StreamP:=OpenFile(Item, ACCES_READ);
  if (StreamP=nil) then
    Exit;
  Size:=ItemSize[Item].Size;
  if Assigned(OnProgress) then
    OnProgress(ItemPath[Item], 0, Size, Data);
  if Assigned(OnProgressObj) then
    OnProgressObj(ItemPath[Item], 0, Size, Data);
  StreamF:=nil;
  if (not IsValidation) then
  begin
    if DirectoryExists(Dest) then
      Dest:=IncludeTrailingPathDelimiter(Dest)+ExtractFileName(ItemName[Item]);
    StreamF:=TStream.CreateWriteFileStream(Dest);
    StreamF.Size:=ItemSize[Item].Size;
    if StreamF.Handle=INVALID_HANDLE_VALUE then
    begin
      StreamF.Free;
      Exit;
    end;
  end;
  SetLength(buf, HL_GCF_CHECKSUM_LENGTH);
  CheckSize:=HL_GCF_CHECKSUM_LENGTH;
  while ((StreamP.Position<StreamP.Size) and (CheckSize=HL_GCF_CHECKSUM_LENGTH)) do
  begin
    CheckIdx:=lpFileIdChecksumTableEntries[lpManifestNodes[Item].FileId].FirstChecksumIndex+
      ((StreamP.Position and $ffffffffffff8000) shr 15);
    CheckSize:=StreamP.Read(buf[0], HL_GCF_CHECKSUM_LENGTH);
    CheckFile:=Checksum(@buf[0], CheckSize);
    CheckFS:=lpChecksumEntries[CheckIdx];
    if (CheckFile<>CheckFS) and (not IgnoreCheckError) then
    begin
      if Assigned(OnError) then
        OnError(GetItemPath(Item), ERROR_CHECKSUM, Data);
      if Assigned(OnErrorObj) then
        OnErrorObj(GetItemPath(Item), ERROR_CHECKSUM, Data);
      break;
    end
    else if (not IsValidation) then
      StreamF.Write(buf[0], CheckSize);
    inc(result, CheckSize);
    if Assigned(OnProgress) then
      OnProgress('', result, Size, Data);
    if Assigned(OnProgressObj) then
      OnProgressObj('', result, Size, Data);
    if Stop then
      break;
  end;
  SetLength(buf, 0);
  StreamP.Free;
  if (not IsValidation) then
    StreamF.Free;
end;

The Delphi code additionally reports progress by calling the OnProgress and OnProgressObj callback functions.
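The index arithmetic used in both listings (FirstChecksumIndex plus the position divided by 32 KB) can be isolated in a small sketch that verifies an in-memory buffer. The checksum routine is passed in as a function pointer, so nothing about its implementation is assumed here; verify_chunks, kChecksumChunk and ChecksumFn are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kChecksumChunk = 0x8000;  // 32 KB per checksum entry

using ChecksumFn = uint32_t (*)(const uint8_t *, std::size_t);

// Verifies file contents fragment by fragment and returns the number of
// leading bytes that passed. checksums is the Checksums[] list and
// first_checksum_index is FileIdChecksums[FileId].FirstChecksumIndex.
inline std::size_t verify_chunks(const std::vector<uint8_t> &file,
                                 const std::vector<uint32_t> &checksums,
                                 uint32_t first_checksum_index,
                                 ChecksumFn checksum) {
    std::size_t verified = 0;
    while (verified < file.size()) {
        std::size_t idx = first_checksum_index + verified / kChecksumChunk;
        std::size_t len = std::min(kChecksumChunk, file.size() - verified);
        if (idx >= checksums.size() ||
            checksum(&file[verified], len) != checksums[idx])
            break;  // missing entry or mismatch: stop at the damaged fragment
        verified += len;
    }
    return verified;
}
```

A return value equal to the file size means that every fragment matched its entry.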

Decrypt file content

Since many games can be preloaded shortly before release, their content is in such cases fully or partially encrypted. When the game is released, the key for decrypting this content becomes available. Decryption is implemented by the following code:

File decryption

C++

UCHAR IV[16] = {0};

void DecryptFileChunk(char *buf, UINT32 size, char *key)
{
    AES_KEY aes_key;
    // AES_cbc_encrypt updates the IV it is given, so work on a fresh copy per chunk
    UCHAR iv[16];
    memcpy(iv, IV, sizeof(iv));
    AES_set_decrypt_key((UCHAR*)key, 128, &aes_key);
    AES_cbc_encrypt((UCHAR*)buf, (UCHAR*)buf, size, &aes_key, iv, AES_DECRYPT);
}

UINT64 CGCFFile::DecryptFile(UINT32 Item, char *key)
{
    UINT64 res = 0;
    CStream *str = OpenFile(Item, CACHE_OPEN_READWRITE);
    if (str == NULL)
        return 0;
    char buf[CACHE_CHECKSUM_LENGTH], dec[CACHE_CHECKSUM_LENGTH];
    UINT32 CheckSize = CACHE_CHECKSUM_LENGTH;
    INT32 CompSize, UncompSize, sz;
    while ((str->Position() < str->GetSize()) && (CheckSize == CACHE_CHECKSUM_LENGTH))
    {
        UINT32 CheckIdx = lpFileIDChecksum[lpManifest[Item].FileId].FirstChecksumIndex + ((str->Position() & 0xffffffffffff8000) >> 15);
        // Every chunk starts with two 32-bit sizes: compressed and uncompressed
        INT32 ChunkSize = (INT32)str->Read(buf, 8);
        memcpy(&CompSize, &buf[0], 4);
        memcpy(&UncompSize, &buf[4], 4);
        if (((UINT32)UncompSize > pManifestHeader->CompressionBlockSize) || (CompSize > UncompSize) || (UncompSize < -1) || (CompSize < -1))
        {
            // Chunk is not compressed
            ChunkSize = (INT32)str->Read(&buf[8], CACHE_CHECKSUM_LENGTH-8);
            DecryptFileChunk(&buf[0], ChunkSize, key);
        }
        else if (((UINT32)UncompSize <= pManifestHeader->CompressionBlockSize) && (CompSize <= UncompSize) && (UncompSize > -1) && (CompSize > -1))
        {
            // Chunk is compressed
            ChunkSize = (INT32)str->Read(&buf[8], UncompSize-8);
            INT32 CheckFile = UncompSize;
            // AES works on 16-byte blocks, so round the size up
            if (CompSize%16 == 0)
                sz = CompSize;
            else
                sz = CompSize + 16 - (CompSize%16);
            // The encrypted data follows the 8-byte size header
            memcpy(dec, &buf[8], sz);
            DecryptFileChunk(&dec[0], sz, key);
            uncompress((Bytef*)&buf[0], (uLongf*)&CheckFile, (Bytef*)&dec[0], sz);
        }
        str->Seek(-ChunkSize, USE_SEEK_CURRENT);
        str->Write(&buf[0], ChunkSize);
        UINT32 Check1 = Checksum((UINT8*)&buf[0], ChunkSize),
               Check2 = lpChecksum[CheckIdx];
        if (Check1 != Check2)
            break;
        res += ChunkSize;
    }
    // Clear only the "encrypted" bit and keep the other attributes
    lpManifest[Item].Attributes = lpManifest[Item].Attributes & (~CACHE_FLAG_ENCRYPTED);
    delete str;
    return res;
}

Delphi

const
  IV: array[0..15] of byte = (0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0);

procedure DecryptFileChunk(buf: pByte; ChunkSize: integer; Key: Pointer);
var
  AES: TCipher_Rijndael;
  src: array[0..HL_GCF_CHECKSUM_LENGTH-1] of byte;
begin
  Move(buf^, src[0], HL_GCF_CHECKSUM_LENGTH);
  AES:=TCipher_Rijndael.Create();
  AES.Init(Key^, 16, IV[0], 16);
  AES.Mode:=cmCFBx;
  AES.Decode(src[0], buf^, ChunkSize);
  AES.Free;
end;

function TGCFFile.DecryptFile(Item: integer; Key: Pointer): int64;
var
  StreamP: TStream;
  CheckSize, CheckFile, CheckFS, CheckIdx, sz: uint32_t;
  buf: array of byte;
  dec: array[0..HL_GCF_CHECKSUM_LENGTH] of byte;
  CompSize, UncompSize: integer;
  Size: int64;
begin
  result:=0;
  StreamP:=OpenFile(Item, ACCES_READWRITE);
  if (StreamP=nil) then
    Exit;
  Size:=ItemSize[Item].Size;
  if Assigned(OnProgress) then
    OnProgress(ItemName[Item], 0, Size, Data);
  if Assigned(OnProgressObj) then
    OnProgressObj(ItemName[Item], 0, Size, Data);
  SetLength(buf, HL_GCF_CHECKSUM_LENGTH);
  CheckSize:=HL_GCF_CHECKSUM_LENGTH;
  while ((StreamP.Position<StreamP.Size) and (CheckSize=HL_GCF_CHECKSUM_LENGTH)) do
  begin
    CheckIdx:=lpFileIdChecksumTableEntries[lpManifestNodes[Item].FileId].FirstChecksumIndex+
      ((StreamP.Position and $ffffffffffff8000) shr 15);
    CheckSize:=StreamP.Read(buf[0], 8);
    Move(buf[0], CompSize, 4);
    Move(buf[4], UncompSize, 4);
    if (ulong(UncompSize)>fManifestHeader.CompressionBlockSize) or (CompSize>UncompSize) or (UncompSize<-1) or (CompSize<-1) then
    begin
      //Chunk is not compressed!
      CheckSize:=StreamP.Read(buf[8], HL_GCF_CHECKSUM_LENGTH-8);
      DecryptFileChunk(@buf[0], CheckSize, Key);
    end
    else if ((ulong(UncompSize)<=fManifestHeader.CompressionBlockSize) and (CompSize<=UncompSize)) and ((UncompSize>-1) and (CompSize>-1)) then
    begin
      //Chunk is compressed!
      CheckSize:=StreamP.Read(buf[8], UncompSize-8);
      CheckFile:=UncompSize;
      if (CompSize mod 16=0) then
        sz:=CompSize
      else
        sz:=CompSize+16-(CompSize mod 16);
      Move(buf[8], dec[0], sz);
      DecryptFileChunk(@dec[0], sz, Key);
      uncompress(@buf[0], CheckFile, @dec[0], sz);
    end;
    StreamP.Seek(-CheckSize, spCurrent);
    StreamP.Write(buf[0], CheckSize);
    CheckFile:=Checksum(@buf[0], CheckSize);
    CheckFS:=lpChecksumEntries[CheckIdx];
    if (CheckFile<>CheckFS) and (not IgnoreCheckError) then
    begin
      if Assigned(OnError) then
        OnError(GetItemPath(Item), ERROR_CHECKSUM, Data);
      if Assigned(OnErrorObj) then
        OnErrorObj(GetItemPath(Item), ERROR_CHECKSUM, Data);
      break;
    end;
    inc(result, CheckSize);
    //StreamP.Position:=StreamP.Position+CheckSize;
    if Assigned(OnProgress) then
      OnProgress('', result, Size, Data);
    if Assigned(OnProgressObj) then
      OnProgressObj('', result, Size, Data);
    if Stop then
      break;
  end;
  lpManifestNodes[Item].Attributes:=lpManifestNodes[Item].Attributes and (not HL_GCF_FLAG_ENCRYPTED);
  fIsChangeHeader[HEADER_MANIFEST_NODES]:=true;
  SaveChanges();
  SetLength(buf, 0);
end;
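The decision between the two branches, and the rounding of the encrypted size to the AES block size, can be looked at in isolation. The sketch below follows the Delphi conditions; it assumes the two sizes are stored in little-endian byte order, and ChunkHeader, read_chunk_header, looks_compressed and aes_padded_size are illustrative names that do not come from the library.

```cpp
#include <cassert>
#include <cstdint>

struct ChunkHeader {
    int32_t CompSize;    // size of the compressed data
    int32_t UncompSize;  // size after decompression
};

// Decodes one 32-bit value; little-endian storage is an assumption.
inline int32_t read_le32(const uint8_t *p) {
    uint32_t v = static_cast<uint32_t>(p[0]) |
                 (static_cast<uint32_t>(p[1]) << 8) |
                 (static_cast<uint32_t>(p[2]) << 16) |
                 (static_cast<uint32_t>(p[3]) << 24);
    return static_cast<int32_t>(v);
}

// The first 8 bytes of a chunk are interpreted as the two sizes.
inline ChunkHeader read_chunk_header(const uint8_t *buf) {
    return ChunkHeader{read_le32(buf), read_le32(buf + 4)};
}

// A chunk is treated as compressed only if both sizes are plausible.
inline bool looks_compressed(const ChunkHeader &h, uint32_t compression_block_size) {
    return h.CompSize > -1 && h.UncompSize > -1 &&
           static_cast<uint32_t>(h.UncompSize) <= compression_block_size &&
           h.CompSize <= h.UncompSize;
}

// AES processes 16-byte blocks, so the size is rounded up to a multiple of 16.
inline uint32_t aes_padded_size(uint32_t size) {
    return (size + 15u) & ~15u;
}
```

If the first 8 bytes do not form a plausible pair of sizes, they are ordinary encrypted data and the chunk is decrypted as a whole.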

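ManifestHeader checksum

The checksum of the ManifestHeader covers the following structures, in this order:

ManifestHeader
Manifest[]
FileNames
HashTableKeys[]
HashTableIndices[]
MinimumFootprints[]
UserConfig[]

Before the calculation, the following fields are set to zero:

ManifestHeader.Fingerprint
ManifestHeader.Checksum

The checksum is calculated with the Adler32 algorithm: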

Delphi

function ManifestChecksum(Header: pCache_ManifestHeader; entries, names, hashs, table, MFP, UCF: pByte): uint32_t;
var
  tmp1, tmp2: uint32;
begin
  tmp1:=Header.Fingerprint;
  tmp2:=Header.Checksum;
  Header.Fingerprint:=0;
  Header.Checksum:=0;
  result:=adler32(0, pAnsiChar(Header), sizeof(TCache_ManifestHeader));
  result:=adler32(result, pAnsiChar(entries), sizeof(TCache_ManifestNode)*Header^.NodeCount);
  result:=adler32(result, pAnsiChar(names), Header^.NameSize);
  result:=adler32(result, pAnsiChar(hashs), sizeof(uint32)*Header^.HashTableKeyCount);
  result:=adler32(result, pAnsiChar(table), sizeof(uint32)*Header^.NodeCount);
  if Header^.NumOfMinimumFootprintFiles>0 then
    result:=adler32(result, pAnsiChar(MFP), sizeof(uint32)*Header^.NumOfMinimumFootprintFiles);
  if Header^.NumOfUserConfigFiles>0 then
    result:=adler32(result, pAnsiChar(UCF), sizeof(uint32)*Header^.NumOfUserConfigFiles);
  Header.Fingerprint:=tmp1;
  Header.Checksum:=tmp2;
end;
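A C++ counterpart of this function might look like the sketch below. It is not taken from the library: Span, adler32_update and manifest_checksum are illustrative names, the Adler-32 helper is a plain reimplementation that continues from a given state the way zlib's adler32 does, and the header is taken by value so that zeroing the two fields does not touch the caller's copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>

struct ManifestHeader {
    uint32_t HeaderVersion, ApplicationID, ApplicationVersion, NodeCount,
             FileCount, CompressionBlockSize, BinarySize, NameSize,
             HashTableKeyCount, NumOfMinimumFootprintFiles,
             NumOfUserConfigFiles, Bitmask, Fingerprint, Checksum;
};

// One of the tables that follow the header: pointer plus size in bytes.
struct Span {
    const void *data;
    std::size_t size;
};

// Adler-32 continued from a given state (0 is used as the start, as above).
inline uint32_t adler32_update(uint32_t adler, const void *data, std::size_t size) {
    const uint8_t *p = static_cast<const uint8_t *>(data);
    uint32_t a = adler & 0xFFFFu;
    uint32_t b = (adler >> 16) & 0xFFFFu;
    for (std::size_t i = 0; i < size; ++i) {
        a = (a + p[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

// tables: Manifest[], FileNames, HashTableKeys[], HashTableIndices[],
// MinimumFootprints[] and UserConfig[], in that order; empty ones are skipped.
inline uint32_t manifest_checksum(ManifestHeader header, std::initializer_list<Span> tables) {
    header.Fingerprint = 0;  // both fields are excluded from the checksum
    header.Checksum = 0;
    uint32_t sum = adler32_update(0, &header, sizeof(header));
    for (const Span &s : tables)
        if (s.size > 0)
            sum = adler32_update(sum, s.data, s.size);
    return sum;
}
```

Because the running value is simply passed from one call to the next, hashing the tables one after another gives the same result as hashing them as one contiguous buffer.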

Conclusion


Source: https://habr.com/ru/post/224027/
