In this post, the SQLite database will be examined in section, you can find information about the structure of the database file, the presentation of data in memory, as well as information about the structure and file representation of the B-tree.
Database file format
The entire database is stored in a single file on a disk called the “main database file”. During transactions, SQLite stores additional information in the second file: a rollback journal, or, if the database is in WAL mode, a log file with information about the records. If the application or computer is disconnected before the end of the transaction, then these files are called “hot journal” or “hot WAL file” and contain the necessary information to restore the database to a consistent state.
Pages
The main database file consists of one or several pages. All pages in the same database have the same size, which can be from 512 to 65536 bytes. The page size for the database file is determined by a whole 2 byte number with an offset of 16 bytes from the beginning of the database file.
All pages are numbered from 1 to 2147483646 (2 ^ 31 - 2). The minimum size of the database: one page is
512 bytes in size , the maximum size of the database is 2147483646 pages of 65536 bytes (
~ 140 TB ).
')
Headline
The first 100 bytes of the database file contain the header of the database, table 1 shows the header scheme.
Lock-byte page
The blocking page is one page of the base file and is located between offsets 0x1073741824 and 0x1073742335; if the size of the base is smaller, then it does not have a blocking page. This page is needed to implement the OS Interface blocking primitives.
Freelist
The list of blank pages is organized as a linked list. Each element of the list consists of two numbers of 4 bytes. The first number determines the number of the next element of the freelist (trunk pointer), or is zero if the list is over. The second number is a pointer to a data page (Leaf page numbers). The figure below shows the scheme of this structure.

B - tree
SQLite uses two types of trees: “table B - tree” (data is stored on the leaves) and “index B - tree” (keys are stored on the leaves).
Each entry in the “table B - tree” consists of a 64-bit integer key and up to 2147483647 bytes of arbitrary data. The key “table B - tree” corresponds to the SQL table ROWID.
Each entry in the “index B - tree” consists of an arbitrary key up to 2147483647 bytes in length.
Page B - tree
The B-tree page has the structure:
- Database file header (100 bytes)
- B-tree page header (8 or 12 bytes)
- Array of cell pointers
- Unallocated space
- Cell content
- Reserved place
The header of the database file is found only on the first page, which is always the old “table B - tree”. All other pages of the B-tree in the database do not have this title.
The B-tree page header is 8 bytes in size for leaf pages and 12 bytes for inner pages. Table 2 shows the page header structure.
table 2
Freeblock is a structure used to define unallocated space within a B-tree page. Freeblock organized in a chain. The first 2 bytes in the freeblock (from high to low) is the offset to the next freeblock, or zero, if the freeblock is the last in the chain. The third and fourth bytes are an integer, freeblock size in bytes, including a header of 4 bytes. Freeblocks are always linked in ascending order of offset.
The number of fragmented bytes is the total number of unused bytes in the cell's content area.
The array of cell pointers consists of K 2-byte integer offsets of the cell contents (with K cells in the B-tree). The array is sorted in ascending order (from the smallest keys to the largest).
Unallocated space is the area between the last cell of the array of pointers and the beginning of the first cell.
The reserved space at the end of each page is used by extensions to store information about the page. The size of the reserved area is defined in the database header (default is zero).
Representation
This section describes the structure of data storage in the database. The data on the leaf pages of the table b - tree and the keys index b - tree are stored as an arbitrary sequence of bytes called
record format (read more
habrahabr.ru/post/223451 ).
TABLE
Each table (with ROWID) is represented in the database as
table b - tree . Each entry in the tree corresponds to a SQL table row. A single SQL row of a table is represented as a sequence (the same as specified when creating it) of table columns in record format. If the table has an INTEGER PRIMARY KEY that is a ROWID alias, then NULL will be written instead of its value. SQLite will always use the key
table b - tree instead of NULL when accessing INTEGER PRIMARY KEY. If the Affinity column (type conversion recommendation, read more
habrahabr.ru/post/149635 in the section “Data Types and Comparison of Values”) is REAL and the value can be converted to INTEGER without data loss, then the value will be stored as an integer. When extracting data from a database, SQLite converts an integer to REAL.
TABLEWITHOUT ROWID
Each table (without ROWID) is represented in the database as an
index b - tree . The difference from tables with rowid is that the key of each SQL record of the table is stored in the record format, the key columns being stored as specified in the PRIMARY KEY, and the rest in the order specified in the table declaration.
Thus, the entries in
index b - tree are represented just like in
table b - tree , except for the order of the columns and the fact that the content of the row is stored in the key of the tree, and not as data on the leaves as in
table b - tree .
INDEX
Each index (declared CREATE INDEX, PRIMARY KEY or UNIQUE) is represented in the database by the mean
index b - tree . Each entry in this tree corresponds to a row in the SQL table. The index tree key is a sequence of column values ​​specified in the index and is completed with the row key value (rowid or primary key) in record format.
UPD 13:44 :
Representation section reworked, thanks for the criticism of
mayorovp (you could of course move, but oh well).