
Recover Apache Derby without backup

For my own pleasure, I run a Wikipedia bot on my home computer ( account1 , account2 , source code ). The bot keeps a local cache of Wikipedia page revisions so that it does not have to query the remote server every time, plus a set of specific data collected over the last couple of years that the bot cannot work without. All of this is stored in an Apache Derby database which, together with the cache, takes up about 50 GB.



And so, one fine day, while the bot was processing data in 8 threads on 4 CPUs, ABBYY FineReader was recognizing the 14th volume of the Russian biographical dictionary edited by A. A. Polovtsev, and the opponents were making their move in Civilization Age of Kings ... it appeared: the blue screen of death. Long time no see, I thought, rebooting the computer. Fair enough, there was a reason: most likely a hardware problem with the video adapter. But once the computer had booted and I tried to start the bot again, this appeared:

ERROR XSDG2: Invalid checksum on Page


And the last backup, as usual, is dated March ...



After spending half an hour to an hour investigating the problem (read: Googling), I found out that:



How do you do the last of these? Just delete the logs! However, as IBM (the developer of Cloudscape, which later became Derby) warns us:



Never delete or manipulate * ANY * files in a database directory structure. Doing so will corrupt the database.



So, what did I do on the night from Saturday to Sunday?

  1. I made sure that the free space on the hard disk was more than 3 times the current size of the database.
  2. I made a backup of the broken database. Better late than never.
  3. After offering a prayer to the Flying Spaghetti Monster, I deleted all the files from the transaction log folder (the folder itself must stay: the database will not load without it).
  4. I connected to the database with SQuirreL SQL Client and made sure that it worked.
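
Steps 1 to 3 can also be scripted. The following is only a minimal Java sketch of those steps, not the procedure as it was actually carried out by hand. The class and method names are made up for illustration, and it assumes the default Derby layout, in which the transaction logs live in a subfolder named log inside the database directory. It should only be run while the database is shut down.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: the names below are illustrative, not part of Derby.
class DerbyLogReset {

    /** Total size in bytes of all regular files under dir. */
    static long directorySize(Path dir) throws IOException {
        List<Path> all;
        try (Stream<Path> paths = Files.walk(dir)) {
            all = paths.collect(Collectors.toList());
        }
        long total = 0;
        for (Path p : all) {
            if (Files.isRegularFile(p)) {
                total += Files.size(p);
            }
        }
        return total;
    }

    /** Step 1: is the free space on the database's disk more than 3x the database size? */
    static boolean hasEnoughSpace(Path dbDir) throws IOException {
        long needed = 3 * directorySize(dbDir);
        return Files.getFileStore(dbDir).getUsableSpace() > needed;
    }

    /** Step 2: copy the whole database directory tree to dst. */
    static void copyTree(Path src, Path dst) throws IOException {
        List<Path> all;
        try (Stream<Path> paths = Files.walk(src)) {
            all = paths.collect(Collectors.toList()); // parents come before children
        }
        for (Path p : all) {
            Path target = dst.resolve(src.relativize(p).toString());
            if (Files.isDirectory(p)) {
                Files.createDirectories(target);
            } else {
                Files.copy(p, target, StandardCopyOption.COPY_ATTRIBUTES);
            }
        }
    }

    /** Step 3: delete the files inside <db>/log but keep the folder itself. */
    static int clearLogFolder(Path dbDir) throws IOException {
        Path logDir = dbDir.resolve("log"); // assumption: default Derby log location
        List<Path> entries;
        try (Stream<Path> paths = Files.list(logDir)) {
            entries = paths.collect(Collectors.toList());
        }
        int deleted = 0;
        for (Path p : entries) {
            if (Files.isRegularFile(p)) {
                Files.delete(p);
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path db = Paths.get(args[0]);     // database directory
        Path backup = Paths.get(args[1]); // where the copy of the broken database goes
        if (!hasEnoughSpace(db)) {
            throw new IllegalStateException("not enough free disk space");
        }
        copyTree(db, backup);
        System.out.println("deleted " + clearLogFolder(db) + " log files");
    }
}
```

The backup is taken before anything is deleted, so the original (broken) state can always be put back if the trick does not work.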


But that, of course, is not all. Deleting transaction logs, especially when the database is reporting recovery errors, cannot lead to anything good. Such a database will work, but not for long: only until it hits the first defect in its structure. The database has to be forced somehow to dig through all the data in all the tables, preferably rewriting it row by row into new files (new conglomerates, in Derby terminology).



Derby's built-in backup and restore mechanism does not help here: it simply copies the data files without checking their structure in any way. However, the distribution includes an interesting utility called ij . Besides giving console access to the database, it also makes it easy to call Derby's built-in system procedures. We need two of them from the export and import set: SYSCS_UTIL.SYSCS_EXPORT_TABLE_LOBS_TO_EXTFILE and SYSCS_UTIL.SYSCS_IMPORT_TABLE_LOBS_FROM_EXTFILE.



These procedures export and import data in a standard delimited text format. The additional suffix LOBS_TO_EXTFILE (or LOBS_FROM_EXTFILE on import) means that binary and large text fields are stored in a separate file. This is convenient if you want to check, at least at a glance, that the data is correct.



So the next step was to launch ij ...

java -cp "derby.jar;derbytools.jar" -Dij.maximumDisplayWidth=120 org.apache.derby.tools.ij

connect to the database (don't forget the semicolon at the end of each ij command)

ij> connect 'jdbc:derby:secretarydb';

export all the application data (a separate command for each table)

ij> CALL SYSCS_UTIL.SYSCS_EXPORT_TABLE_LOBS_TO_EXTFILE(null,'TABLENAME','c:\data\TABLENAME.del',null,null,'UTF-8','c:\data\TABLENAME.dat');

By the way, to list the tables in ij, use the command show tables;
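
With more than a handful of tables, typing one CALL per table gets tedious. As an illustration only, here is a small Java sketch that generates the ij script text for a list of tables. The class name, the method names and the table names in main are hypothetical; the connection URL and the c:\data\ directory are simply the ones used above. Note that Derby stores unquoted identifiers in upper case, so table names passed as string arguments normally have to be upper case too.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical generator of an ij export script; not part of Derby.
class ExportScript {

    /** Wraps a value in single quotes, doubling any embedded quote (SQL string literal). */
    static String sqlString(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** One export CALL for one table; dir must end with a path separator. */
    static String exportCall(String table, String dir) {
        return "CALL SYSCS_UTIL.SYSCS_EXPORT_TABLE_LOBS_TO_EXTFILE(null, "
                + sqlString(table) + ", "
                + sqlString(dir + table + ".del") + ", null, null, 'UTF-8', "
                + sqlString(dir + table + ".dat") + ");";
    }

    /** Full script: connect, one CALL per table, exit. */
    static String script(String jdbcUrl, List<String> tables, String dir) {
        StringBuilder sb = new StringBuilder();
        sb.append("connect ").append(sqlString(jdbcUrl)).append(";\n");
        for (String table : tables) {
            sb.append(exportCall(table, dir)).append("\n");
        }
        sb.append("exit;\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // PAGES and REVISIONS are made-up table names for the example.
        List<String> tables = Arrays.asList("PAGES", "REVISIONS");
        System.out.print(script("jdbc:derby:secretarydb", tables, "c:\\data\\"));
    }
}
```

The output can be saved to a file and fed to ij, which can read its commands from an input file instead of the console.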



Once the export is complete, you can take your favorite text editor, provided it can handle files of this size, and check that at least the beginning and the end of each file are readable. If something looks wrong, the offending lines can be removed from the .del file ... and, well, don't deny yourself anything: this is the moment when you can edit the account balance of a couple of subscribers.
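
If no editor at hand copes with multi-gigabyte files, the beginning and the end of an export can be inspected without loading the whole file. This is a minimal sketch with hypothetical class and method names; it assumes the file was exported as UTF-8, as in the commands above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for eyeballing large export files.
class ExportPeek {

    /**
     * First n lines of the file. The reader decodes strictly, so bytes that
     * are not valid UTF-8 cause an exception instead of being silently replaced.
     */
    static List<String> head(Path file, int n) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while (lines.size() < n && (line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    /**
     * Last maxBytes bytes of the file as text. The cut may fall in the middle
     * of a multi-byte character, so the first character can come out garbled.
     */
    static String tail(Path file, int maxBytes) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long start = Math.max(0, raf.length() - maxBytes);
            byte[] buffer = new byte[(int) (raf.length() - start)];
            raf.seek(start);
            raf.readFully(buffer);
            return new String(buffer, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]); // e.g. one of the exported .del files
        for (String line : head(file, 5)) {
            System.out.println(line);
        }
        System.out.println("...");
        System.out.print(tail(file, 2048));
    }
}
```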



After that, delete the database. If anything goes wrong, we have the backup. Then it was enough for me to start the bot again: since it is built on Hibernate, it recreated the table structure at startup. Right after that, stop the application and launch ij to import the data:

java -cp "derby.jar;derbytools.jar" -Dij.maximumDisplayWidth=120 org.apache.derby.tools.ij

ij> connect 'jdbc:derby:secretarydb';

ij> CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE_LOBS_FROM_EXTFILE(null,'TABLENAME','c:\data\TABLENAME.del',null,null,'UTF-8',1);

The 1 at the end means that any data already in the table should be replaced. You do not need to specify the name of the file with the LOB data: the main file contains references to it.



That's all. The data is restored. But, of course, the next feature the bot gets will be automatic archiving of the most important data at every launch.

Source: https://habr.com/ru/post/188238/


