
Recovering data from CockroachDB is easy: just restore everything from a backup. What, you didn't make backups? For a database whose version 1.0 came out only six months ago? Well, don't despair, the data can most likely still be recovered. I will tell you how I restored the database of my joke social network project and streamed the whole process on YouTube.
How we are going to recover the data
First you need to figure out what actually happened and why CockroachDB went down. The reasons vary, but in any case the server either no longer starts or does not respond to requests. In my case, after a bit of googling, it turned out that the underlying RocksDB store was corrupted:
E171219 15:50:36.541517 25 util/log/crash_reporting.go:82  a panic has occurred!
E171219 15:50:36.734485 74 util/log/crash_reporting.go:82  a panic has occurred!
E171219 15:50:37.241298 25 util/log/crash_reporting.go:174  Reported as error 20a3dd770da3404fa573411e2b2ffe09
panic: Corruption: block checksum mismatch [recovered]
        panic: Corruption: block checksum mismatch

goroutine 25 [running]:
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).Recover(0xc4206c8500, 0x7fb299f4b180, 0xc4209de120)
        /go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:200 +0xb1
panic(0x1957a00, 0xc4240398a0)
        /usr/local/go/src/runtime/panic.go:489 +0x2cf
github.com/cockroachdb/cockroach/pkg/storage.(*Store).processReady(0xc420223000, 0x103)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/store.go:3411 +0x427
Repairing the RocksDB store
If the RocksDB store got corrupted, CockroachDB 1.1 already ships with a command to repair it:
$ cockroach debug rocksdb repair
You may also need to pass the path to the data directory if it is not the default one.
After the repair you can try to start CockroachDB again. In my case it did not help, but the error changed:
E171219 13:12:47.618517 1 cli/error.go:68  cockroach server exited with error: cannot verify empty engine for bootstrap: unable to read store ident: store has not been bootstrapped
Error: cockroach server exited with error: cannot verify empty engine for bootstrap: unable to read store ident: store has not been bootstrapped
Apparently something got corrupted in the metadata, and I do not understand the CockroachDB storage format well enough to figure out exactly what is still missing. So let's go another way: we know that underneath it is a key-value store, and we even roughly know what to look for, since the developers have written about the format ( here and here ) in their blog.
Ripping the data straight out of RocksDB
Since the format of the keys and records is roughly known, we can take the data directory from the broken instance, walk over all the keys and pull the data out directly. I will describe the single-node case; if there are many nodes and the whole cluster has died, you will also need to figure out how to remove duplicates and pick the latest version of each key.
We will write everything in Go, of course. At first I tried the github.com/tecbot/gorocksdb library. It even started up, but complained that the comparator cockroach_comparator was unknown to it. I took the corresponding comparator from the CockroachDB sources, but nothing changed. Since I was too lazy to figure out what was going on, I went another way and simply used the package from the CockroachDB source tree itself: github.com/cockroachdb/cockroach/pkg/storage/engine has everything needed to work with the KV store properly.
So let's open the database, iterate over all keys and look for values that contain strings we know for sure are present in the database:
package main

import (
	"bytes"
	"log"

	"github.com/cockroachdb/cockroach/pkg/storage/engine"
	"github.com/cockroachdb/cockroach/pkg/util/hlc"
)

func main() {
	db, err := engine.NewRocksDB(engine.RocksDBConfig{
		Dir:       "/Users/yuriy/tmp/vbambuke",
		MustExist: true,
	}, engine.NewRocksDBCache(1000000))
	if err != nil {
		log.Fatalf("Could not open cockroach rocksdb: %v", err.Error())
	}

	db.Iterate(
		engine.MVCCKey{Timestamp: hlc.MinTimestamp},
		engine.MVCCKeyMax,
		func(kv engine.MVCCKeyValue) (bool, error) {
			if bytes.Contains([]byte(kv.Value), []byte("safari@apple.com")) {
				log.Printf("Email key: %s", kv.Key)
			}
			return false, nil
		},
	)
}
I got something like this:
Email key: /Table/54/1/158473728194052097/0/1503250869.243064075,0
This key has quite a few components; here is what I managed to figure out:
0. Table literally means "table" :)
1. The table number (tables are numbered in the order they were created)
2. The key type: 1 means a regular row, 2 means an index entry
3. The value of the primary key (1, 2, 3, ...)
4. I don't know; apparently a version?
5. The timestamp
That is, you can dump all keys and their values into a file and split them by table number. After that it should become more or less clear which table is which (and what its structure is, because you do still have the schema somewhere, right? :)). A minimal sketch of such a dump follows below.
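For illustration, here is roughly what such a dump could look like. This is a minimal sketch under my assumptions, not the exact utility I used: it relies on the same engine and hlc packages as in the snippet above (plus fmt, os and strings), parses only the pretty-printed form of the key (the /Table/<n>/... string), and the function name and file naming are my own.

// dumpByTable writes every key/value pair into a per-table file named
// table_<N>.dump. db is the engine.RocksDB handle opened as in the
// snippet above.
func dumpByTable(db *engine.RocksDB) error {
	files := map[string]*os.File{}
	defer func() {
		for _, f := range files {
			f.Close()
		}
	}()

	return db.Iterate(
		engine.MVCCKey{Timestamp: hlc.MinTimestamp},
		engine.MVCCKeyMax,
		func(kv engine.MVCCKeyValue) (bool, error) {
			// The pretty-printed key looks like /Table/54/1/158473728194052097/0/...
			pretty := fmt.Sprintf("%s", kv.Key)
			parts := strings.Split(pretty, "/")
			if len(parts) < 3 || parts[1] != "Table" {
				return false, nil // skip system and other non-table keys
			}
			table := parts[2]
			f, ok := files[table]
			if !ok {
				var err error
				if f, err = os.Create("table_" + table + ".dump"); err != nil {
					return false, err
				}
				files[table] = f
			}
			// One record per line: the key, then the raw value bytes (quoted).
			_, err := fmt.Fprintf(f, "%s\t%q\n", pretty, kv.Value)
			return false, err
		},
	)
}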
Parsing the record format
I was recovering data from CockroachDB version 1.0.4, so the details may differ in later versions. Here is what I managed to figure out:
1. The first 6 bytes of the value can be ignored. Apparently this is a checksum of the data plus some other meta information, for example bits for nullable columns.
2. Then comes the data itself, and every column except the first is preceded by a separate byte with its type.
Here is an example from the messages table (I used od to get a readable view of the binary data):
The structure of the messages table was as follows:
CREATE TABLE messages (
    id SERIAL PRIMARY KEY,
    user_id BIGINT,
    user_id_to BIGINT,
    is_out BOOL,
    message TEXT,
    ts BIGINT
);
$ head -n 2 messages | od -c
0000000    1   /   1   /   0   /   1   5   0   3   2   5   0   8   6   8
0000020    .   7   2   7   5   5   4   8   2   8   ,   0
0000040    = 241   E 270 276  \n   # 202 200 230 316
0000060  316   ˁ  ** 263 004 023 202 200 204 231 374 235 222 264 004 032
0000100  026 031   N   o   w       I       u   s   e       r   e   a   l
0000120        P   o   s   t   g   r   e   S   Q   L 023 230 277 256 217
0000140  240 320 375 342   (  \n
0000146
Let's go through this data in order:
1. First, I wrote the key name into the dump file; that is this fragment:

0000000    1   /   1   /   0   /   1   5   0   3   2   5   0   8   6   8
0000020    .   7   2   7   5   5   4   8   2   8   ,   0
0000040    =

This is all a piece of the key, from which we need to take the value of the primary key (the key format is described above).
2. The header. On line 0000040, right after the key, there is a 6-byte header:

241   E 270 276  \n   #

It is different every time, but for all my tables these first 6 bytes simply had to be skipped.
3. The first field, user_id. All the numbers I encountered in CockroachDB were encoded as varints from the standard library, so the first column can be read with binary.Varint. We need to read the following piece:

0000040    = 241   E 270 276  \n   #   ---> 202 200 230 316
0000060  316   ˁ  ** 263 004 <----   023 202 200 204 231 374 235 222 264 004 032
4. The second field, user_id_to. It turns out that a field starts with its type byte; 023 means a number, which is again read as a varint. We can write the corresponding helper functions for reading such columns from a byte slice:
func readVarIntFirst(v []byte) ([]byte, int64) {
	res, ln := binary.Varint(v)
	if ln <= 0 {
		panic("could not read varint")
	}
	return v[ln:], res
}

func readVarInt(v []byte) ([]byte, int64) {
	if v[0] != '\023' {
		panic("invalid varint prefix")
	}
	return readVarIntFirst(v[1:])
}
5. Next comes the boolean field. I had to tinker a bit, but it turned out there is a ready-made function for this: encoding.DecodeBoolValue from the github.com/cockroachdb/cockroach/pkg/util/encoding package. It works just like the functions above, except that it returns an error instead of panicking. We use panic for convenience: in a one-off utility there is no need to handle errors in a clever way.
6. Next comes the message text. Text fields are preceded by the byte 026, then the length, then the contents. It looks like this:
0000100  026 031   N   o   w       I       u   s   e       r   e   a   l
0000120        P   o   s   t   g   r   e   S   Q   L 023 230 277 256 217
0000140  240 320 375 342   (  \n
One might think the first byte is the length and the text follows right after it. If the values are small (roughly up to 100 bytes), that even works. But in fact the length is encoded differently, and it can also be read with functions from the encoding package:
func readStringFirst(v []byte) ([]byte, string) {
	v, _, ln, err := encoding.DecodeNonsortingUvarint(v)
	if err != nil {
		panic("could not decode string length")
	}
	return v[ln:], string(v[0:ln])
}

func readString(v []byte) ([]byte, string) {
	if v[0] != '\026' {
		panic("invalid string prefix")
	}
	return readStringFirst(v[1:])
}
7. And finally the last ordinary number, read with our readVarInt function. A sketch that puts all of these pieces together follows below.
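To tie the walkthrough together, here is a rough sketch of decoding one value of the messages table with the helpers above. It is only an illustration under my assumptions, not the exact utility I used: readBool is a hypothetical wrapper around encoding.DecodeBoolValue written in the same style as readVarInt and readString (assuming the remaining-bytes / value / error return order described above), and the field order simply mirrors the CREATE TABLE statement.

// readBool reads a boolean column in the same style as the other helpers;
// a hypothetical wrapper around encoding.DecodeBoolValue, which returns an
// error instead of panicking.
func readBool(v []byte) ([]byte, bool) {
	rest, b, err := encoding.DecodeBoolValue(v)
	if err != nil {
		panic("could not decode bool")
	}
	return rest, b
}

// decodeMessage decodes one value of the messages table, following the
// column order from the CREATE TABLE above. pk is the primary key value
// already extracted from the key; v is the raw value with the 6-byte
// header still in place. A sketch, not the exact utility I used.
func decodeMessage(pk int64, v []byte) {
	v = v[6:] // 1. skip the 6-byte checksum/meta header

	v, userID := readVarIntFirst(v) // 2. the first column has no type byte
	v, userIDTo := readVarInt(v)    // 3. 023 prefix + varint
	v, isOut := readBool(v)         // 4. boolean column
	v, message := readString(v)     // 5. 026 prefix + length + text
	_, ts := readVarInt(v)          // 6. trailing BIGINT

	fmt.Printf("id=%d user_id=%d user_id_to=%d is_out=%v message=%q ts=%d\n",
		pk, userID, userIDTo, isOut, message, ts)
}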
Reading a DATE column
With a DATE column I struggled a bit, because at first glance the encoding package did not have a ready-made function for it :). I had to improvise. I won't drag it out: DATE is stored as an ordinary number (the column type 023 hints at that), and that number is... the UNIX timestamp in seconds divided by 86400 (the number of seconds in a day). So to read a date, multiply the decoded number by 86400 and treat the result as UNIX time:
v, birthdate := readVarInt(v)
ts := time.Unix(birthdate*86400, 0)
formatted := fmt.Sprintf("%04d-%02d-%02d", ts.Year(), ts.Month(), ts.Day())
Inserting the data back into the database
To insert the data back into the database, I just wrote a simple function for escaping strings:
func escape(q string) string {
	var b bytes.Buffer
	for _, c := range q {
		b.WriteRune(c)
		if c == '\'' {
			b.WriteRune(c)
		}
	}
	return b.String()
}
And used it to build the SQL queries by hand:
fmt.Printf(
	"INSERT INTO messages2(id, user_id, user_id_to, is_out, message, ts) VALUES(%s, %d, %d, %v, '%s', %d);\n",
	pk, userID, userIDTo, isOut, escape(message), ts,
)
But you could also generate a CSV, go through your application's models, use prepared statements and so on, whatever you like. That is no longer a problem once you have parsed CockroachDB's binary storage format :). A sketch of the prepared-statement variant is shown below.
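For example, since CockroachDB speaks the PostgreSQL wire protocol, the same insert could be done through database/sql with a prepared statement instead of manual escaping. This is a minimal sketch under my assumptions: the github.com/lib/pq driver, an insecure local node on the default port 26257, a hypothetical database name and Message struct; adjust the connection string and types to your setup.

// Message mirrors the columns of the messages table (hypothetical struct).
type Message struct {
	ID, UserID, UserIDTo, TS int64
	IsOut                    bool
	Message                  string
}

// insertMessages writes the recovered rows back using a prepared statement,
// so no manual escaping is needed. db is opened with the lib/pq driver, e.g.
//   sql.Open("postgres", "postgresql://root@localhost:26257/vbambuke?sslmode=disable")
func insertMessages(db *sql.DB, rows []Message) error {
	stmt, err := db.Prepare(
		"INSERT INTO messages2(id, user_id, user_id_to, is_out, message, ts) VALUES($1, $2, $3, $4, $5, $6)")
	if err != nil {
		return err
	}
	defer stmt.Close()

	for _, m := range rows {
		if _, err := stmt.Exec(m.ID, m.UserID, m.UserIDTo, m.IsOut, m.Message, m.TS); err != nil {
			return err
		}
	}
	return nil
}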
Links, conclusions
Thank you for scrolling all the way to the end :). Better make backups and don't end up like me. But if you really do need to pull data out of CockroachDB, I hope this article helps a bit. Don't lose your data!
Cockroachdb
My funny social network
Sources of my data recovery utility
Process on youtube (2 of 3 videos)