This story began many, many revisions ago. Back then the SVN repository was pristine, and not a single bug had yet defiled it with its presence. The first commits, the first rollbacks, the first log views: it was all so exciting, so new. How could the Repository have guessed that these first, pleasant steps would one day lead him to the operating table?
The Repository grew, got stronger and matured. Over time he got used to commits, the first tags appeared, and even dreams of branches no longer seemed unattainable. The Repository met other SVN repositories and even began to exchange files with some of them. At times he spent hours pulling changes from his new friends, enjoying the diffs as he went along.
The first clouds appeared on the horizon when the Repository began to hear the unfamiliar words “Git” and “DVCS” more and more often in the developers' conversations. He tried to ask his repository friends about it, but they just averted their eyes...
Over time these worries subsided, but life was no longer the same. The Repository grew withdrawn and began to think more about the meaning of life. The old connections no longer brought pleasure; they had turned into burdensome dependencies. More and more often he had nightmares about unresolvable merge conflicts.
Life was flowing along at its measured pace when the commits suddenly stopped. The anxious forebodings returned at once. The Repository, of course, reassured himself with the thought that the developers must be on vacation and that everything would soon return to normal, but the feeling that something bad was coming did not leave him.
One night there was a knock at the door. “Finally, an svn commit!” the Repository thought happily, and sprang to the door with an agility surprising for his years. “Who is there?” he asked, but in response he heard only heavy breathing outside the door. He got a sinking feeling in the pit of his stomach. A moment later a voice barked from behind the door: “Open up, svnadmin dump”. With trembling hands, the Repository opened the door... and fell unconscious.
One by one, commits raced before his eyes; they appeared and disappeared, again and again. “svnadmin load, svnadmin dump, svnadmin load, svnadmin dump” - the Repository could not understand what was happening; he regained consciousness, then fell into oblivion again. Only a small, flickering light in the distance gave him hope...
Let's leave our hero for a while and try to figure out what happened.
Through the developers' eyes
An astute reader may have guessed that the Repository's predicament is somehow related to distributed version control systems. That is exactly what happened: the developers were won over by the advantages of Mercurial and decided to migrate all their code there. The transition from Subversion to Mercurial, however, can be done fairly painlessly, so why was it necessary to torment the Repository?
The fact is that the developers had earlier made two mistakes, which now showed themselves in all their glory:
- Initially, contrary to all best practices, they began committing to the root of the repository, without bothering to create the traditional trunk, branches and tags folders. Much later these folders were created and the files were moved, but, as you know, commits cannot be thrown out of a repository. If everything is left as it is, commits made outside the trunk will be lost when migrating to Mercurial.
- As we remember, our hero is mired in “burdensome dependencies”, and this, of course, is about svn:externals. I have written before about the shortcomings of this technology. At the time of writing there is no easy way to carry svn:externals over to Mercurial.
To transfer the entire history from Subversion to Mercurial correctly, the developers decided to modify the SVN repository so as to get rid of the problems above.
Repository under the scalpel
To modify the repository we first need the svnadmin utility, which can make a full dump of an SVN repository, dump a specific revision or a range of revisions, and load dumps into an existing repository. A small digression for Windows users: this utility is not part of TortoiseSVN, but you can install an additional Subversion client such as Slik SVN, which includes svnadmin.
The general idea of modifying the repository is this: we recreate the repository from scratch, copying the “correct” commits unchanged and redoing the “wrong” ones by hand. In our case a commit is correct when it contains no references to svn:externals and all its files are located in the trunk. After the re-creation, the date and time of the redone commits must be restored to those of the original commits.
To understand how the method works, let's modify a real repository. I have prepared a small test repository in which all of the above problems are present. You can download its dump (demo_repo.dmp), along with the other dumps mentioned in the article, here.
The repository contains 8 revisions; here is a description of each of them:
- A text file has been added to the repository root.
- A small C# HelloWorld project has been added to the repository root.
- Added folders trunk, tags, branches. All files from the root are transferred to trunk.
- The folder '3rd party' has been added, and the svn:externals property has been set on it to download files from an external repository into a subfolder.
- The svn:externals property has been removed from the '3rd party' folder and set on the trunk folder.
- Removed text file from trunk.
- The svn:ignore property is set on the trunk folder. Some files in HelloWorld are modified.
- Added a text file to the trunk. Removed subfolder in HelloWorld.
Obviously, only revisions 6 and 8 are “correct” here. All the rest will have to be modified in one way or another.
The problem with revisions 1, 2 and 3 is that the files are outside the trunk. To fix this, we will insert one commit before them that creates the branches, tags and trunk folders; commits 1 and 2 will be modified so that the files are added to the trunk; commit 3 will be removed completely.
Revisions 4 and 5 deal with svn:externals: we must replace the external dependencies with the actual files downloaded from the external repository. Revision 7, seemingly harmless, is also indirectly related to svn:externals; more on that below.
So, let's begin. I will modify the repository under Windows, but nothing prevents you from doing exactly the same steps under Linux, for example.
Getting rid of svn:externals
Do not try to move all commits to the trunk and get rid of svn:externals at the same time. It is better to perform these tasks one after the other. We will start with the second one.
First, we must identify all the problem revisions that need to be redone manually. Open the full dump of our repository in a text editor (I use Notepad++) and search for each occurrence of the string “svn:externals” in turn. There may be many more of them than we need; we are only interested in occurrences of this form:
K 13
svn:externals
For each such place, find the nearest “Revision-number:” line above it. It contains the number of the problem revision.

When we reach the end of the dump for our test repository, we should get the following problem revision numbers: 4, 5, and 7.
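For a dump too large to scan by eye, the same search can be sketched in a few lines of Python. This is only an illustration of the manual procedure above, not part of the original toolset: the function name revisions_with_externals is made up, and the scan is as naive as the text-editor search (it does not distinguish headers from file contents).

```python
import re
from typing import List

# Matches either a revision header line or the "K 13 / svn:externals" key pair.
_MARKER = re.compile(
    rb"^(?:Revision-number: (\d+)|K 13\nsvn:externals)$", re.MULTILINE
)


def revisions_with_externals(dump: bytes) -> List[int]:
    """Return the revisions in which an svn:externals property is set."""
    current = None
    found = []
    for match in _MARKER.finditer(dump):
        if match.group(1) is not None:
            # A new "Revision-number:" line: remember where we are.
            current = int(match.group(1))
        elif current is not None and current not in found:
            # An svn:externals key below the current revision header.
            found.append(current)
    return found
```

To use it, read the dump in binary mode, for example revisions_with_externals(open("demo_repo.dmp", "rb").read()).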
Now we need to prepare partial dumps for revisions that do not require changes. For this:
- Load the whole repository into a local folder; let's call it full_repo. This is done with the following commands:
C:\Subversion> svnadmin create full_repo
C:\Subversion> svnadmin load full_repo < demo_repo.dmp
- Next, prepare dumps for the revisions that do not need to be changed (1-3, 6 and 8):
C:\Subversion> svnadmin dump full_repo -r 0:3 --incremental > demo0_3.dmp
C:\Subversion> svnadmin dump full_repo -r 6 --incremental > demo6.dmp
C:\Subversion> svnadmin dump full_repo -r 8 --incremental > demo8.dmp
- Check that the resulting dumps contain no occurrences of the string “svn:externals”. It is important to check this carefully now; an accidental mistake at this stage can make life difficult later.
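With many problem revisions, working out the clean ranges by hand becomes tedious. Below is a minimal sketch that derives them and prints the matching svnadmin dump commands. The helper names (clean_ranges, dump_commands) and the demoN_M.dmp naming are assumptions that simply mirror the commands above.

```python
from typing import Iterable, List, Tuple


def clean_ranges(last_revision: int, problem: Iterable[int]) -> List[Tuple[int, int]]:
    """Split 0..last_revision into inclusive ranges that skip problem revisions."""
    ranges = []
    start = 0
    for rev in sorted(set(problem)):
        if start <= rev - 1:
            ranges.append((start, rev - 1))
        start = rev + 1
    if start <= last_revision:
        ranges.append((start, last_revision))
    return ranges


def dump_commands(repo: str, last_revision: int, problem: Iterable[int]) -> List[str]:
    """Build one 'svnadmin dump --incremental' command line per clean range."""
    commands = []
    for first, last in clean_ranges(last_revision, problem):
        if first == last:
            revs, name = str(first), "demo%d.dmp" % first
        else:
            revs, name = "%d:%d" % (first, last), "demo%d_%d.dmp" % (first, last)
        commands.append("svnadmin dump %s -r %s --incremental > %s" % (repo, revs, name))
    return commands


if __name__ == "__main__":
    # The test repository from the article: 8 revisions, problems in 4, 5 and 7.
    for line in dump_commands("full_repo", 8, [4, 5, 7]):
        print(line)
```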
Next, we begin to create the modified repository. First, we load the first 3 revisions into it (let's call it result_repo):
C:\Subversion> svnadmin create result_repo
C:\Subversion> svnadmin load result_repo < demo0_3.dmp
Now the 4th commit needs to be redone. For this we need a working copy of our new repository. You can check it out with TortoiseSVN or continue working from the command line, for example:
C:\Subversion> svn co file:///C:/Subversion/result_repo result_checkout
Let's verify that everything is going according to plan: go to the result_checkout folder and check that 3 revisions are loaded:
C:\Subversion> cd result_checkout
C:\Subversion\result_checkout> svn log -l 1
------------------------------------------------------------------------
r3 | shibaev | 2010-12-09 23:53:09 +0600 (Thu, 09 Dec 2010) | 1 line
Moved to trunk.
------------------------------------------------------------------------
Now we have to make the “correct” 4th commit. For this we need the state of the original repository at revision 4, so we check it out as well, directly at revision 4:
C:\Subversion\result_checkout> cd ..
C:\Subversion> svn co file:///C:/Subversion/full_repo full_checkout -r 4
The last command may take a little longer, since files from the external repository will be downloaded.
Let's take stock. Our working folder looks like this:

And the log for full_checkout in TortoiseSVN is like this:

In the log we look at what changed in revision 4. What changed is this: the folder “3rd party” was added to the trunk, and not a plain one but with the svn:externals property set:
iTextSharp https://itextsharp.svn.sourceforge.net/svnroot/itextsharp/tags/iTextSharp_5_0_5/iTextSharp/text/xml/simpleparser/
Therefore, to redo the 4th commit, we need to add the folder /trunk/3rd party/iTextSharp to the new repository and commit:
- Go to result_checkout/trunk
- Create the folder 3rd party
- Copy the iTextSharp folder from full_checkout into it
- Remove the .svn subfolder from the copy
- Add (svn add) the 3rd party folder with all its contents to the result_checkout working copy
- Commit, keeping the original commit message, for example:
C:\Subversion\result_checkout> svn commit -m "Added external reference"
Great, we have our first redone commit! Before going further, however, let's think about what a mistake during this process could cost us (if, for example, we commit something wrong or load the wrong dump). Imagine that the repository has not 8 commits but 2000, and svn:externals first appears in revision 1200. In that case merely loading the very first dump (revisions 0 to 1199) and checking it out takes an indecent amount of time.
The consequences of a mistake can be very unpleasant: we can easily spoil the recreated repository and have to start all over again. So we should take care of backing up intermediate results right away. At a minimum the repository (result_repo) must be saved, and also the working copy (result_checkout) if recreating it from scratch takes significant time.
Let's organize the backup like this: we will compress the necessary folders into a zip archive and copy it to a separate folder. For this we use the 7-zip archiver. Install it and add the path to 7z.exe to the PATH environment variable. Now the intermediate states of the repository can be backed up as follows:
C:\Subversion> 7z a backups\result_repoX.zip result_repo
where X is the revision number (in our case, X is currently 4).
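If 7-zip is not at hand, the same backup can be approximated with the Python standard library. This is a sketch under assumptions, not the setup used in the article: the function name backup_repo is hypothetical, and the archive naming merely imitates result_repoX.zip.

```python
import shutil
from pathlib import Path


def backup_repo(repo_dir: str, backup_dir: str, revision: int) -> Path:
    """Zip repo_dir into backup_dir/<repo name><revision>.zip and return the path."""
    repo = Path(repo_dir).resolve()
    target_dir = Path(backup_dir).resolve()
    target_dir.mkdir(parents=True, exist_ok=True)
    # make_archive appends ".zip" to the base name itself.
    base_name = target_dir / ("%s%d" % (repo.name, revision))
    archive = shutil.make_archive(
        str(base_name), "zip", root_dir=str(repo.parent), base_dir=repo.name
    )
    return Path(archive)
```

For example, backup_repo("result_repo", "backups", 4) would produce backups/result_repo4.zip with the repository folder inside it.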
With the intermediate result saved, it is time to move on: commit 5 has to be made manually. For this:
- update full_checkout to revision 5:
C:\Subversion\full_checkout> svn up -r 5
- see what changes the fifth commit contains
In this case the changes are minimal: the svn:externals property was deleted from the “3rd party” folder and created on the trunk folder. Neither the address of the external repository nor the folder into which the files are downloaded has changed. Therefore our new repository needs no changes at all; we just need to commit something so that the revision numbers stay aligned. To do this, you can create an empty file named Fictive in the trunk, add it to SVN and commit. It is better not to change the commit message (why will become clear later).
After the 5th commit is added, load demo6.dmp:
C:\Subversion> svnadmin load result_repo < demo6.dmp
We make a backup and proceed to the last serious item on our program: redoing commit 7. Update the full_checkout working copy to revision 7 and look at the list of changes:

At first it is not clear why this commit needs to be redone, since nothing was changed in svn:externals. The svn:ignore property was changed on the trunk folder; could that be the reason? Indeed it is. The point is that the SVN dump for revision 7 contains the full set of properties of the trunk folder, svn:externals included:

So, down to business. There are two ways to make the 7th commit: a tricky one and a straightforward one.
The first is to take the dump of revision 7 from the original repository, delete svn:externals from the properties of the trunk folder (open the dump in a text editor, delete the property, and subtract the number of bytes removed from “Prop-content-length” and “Content-length”), and load this corrected dump into the new repository. Note that “svnadmin setrevprop” is of no use here: it changes revision properties such as the author or date, not the properties of folders. In this particular case, editing the dump is the best solution.
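The byte counting in the first way is easy to get wrong by hand, so here is a minimal sketch of it. It assumes the plain “K n / key / V n / value ... PROPS-END” layout shown in the dump fragments above (no “D” deletion entries, as produced with --deltas); the function names are made up for illustration.

```python
from typing import List, Tuple

Prop = Tuple[bytes, bytes]


def _read_item(block: bytes, pos: int, tag: bytes) -> Tuple[bytes, int]:
    """Read one 'K n' or 'V n' item starting at pos; return (data, next position)."""
    eol = block.index(b"\n", pos)
    kind, length = block[pos:eol].split(b" ")
    if kind != tag:
        raise ValueError("expected %r at offset %d" % (tag, pos))
    start = eol + 1
    end = start + int(length)
    return block[start:end], end + 1  # skip the newline after the data


def parse_props(block: bytes) -> List[Prop]:
    """Parse a property block that ends with 'PROPS-END'."""
    props = []
    pos = 0
    while not block.startswith(b"PROPS-END\n", pos):
        key, pos = _read_item(block, pos, b"K")
        value, pos = _read_item(block, pos, b"V")
        props.append((key, value))
    return props


def serialize_props(props: List[Prop]) -> bytes:
    """Write properties back in the same layout, including PROPS-END."""
    body = b"".join(
        b"K %d\n%s\nV %d\n%s\n" % (len(k), k, len(v), v) for k, v in props
    )
    return body + b"PROPS-END\n"


def drop_property(block: bytes, name: bytes) -> Tuple[bytes, int]:
    """Remove one property; return the new block and its length in bytes."""
    kept = [(k, v) for k, v in parse_props(block) if k != name]
    new_block = serialize_props(kept)
    return new_block, len(new_block)
```

The returned length is the new “Prop-content-length”. For a folder node, which has no text content, the same number goes into “Content-length”.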
However, it is more interesting to consider the second way, since for real repositories it is the one you will have to use in most cases. It consists of putting yourself in TortoiseSVN's shoes and making all the changes of the 7th commit by hand.
How accurately we reproduce the actions of the SVN client depends entirely on our patience and attentiveness. It is not necessary to replay each change individually; the main thing is that the sets of files in the original and the new repository match. The one exception is moves of files and folders in SVN: it is desirable to repeat them exactly, so as not to lose the history of the object before the move. By the way, at this stage you can also delete the Fictive file created earlier in the trunk.
After all the changes are made, it is worth checking that the affected projects still compile and the tests pass. However, if you have complete confidence in your work as a stand-in for the Subversion client, you may skip the check.
Next, we commit our changes and enter the home stretch. Load the last dump, demo8.dmp. Update both working copies, result_checkout and full_checkout, to the latest revision and compare them. If everything was done correctly, their sets of files should be identical.
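Comparing two working copies by eye is unreliable once there are more than a handful of files. A small sketch of such a comparison follows. It only compares the sets of relative file paths, ignoring .svn folders, and does not look at file contents; the function names are hypothetical.

```python
import os
from typing import List, Set, Tuple


def file_set(root: str) -> Set[str]:
    """Collect relative paths of all files under root, skipping .svn folders."""
    files = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .svn in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            files.add(rel.replace(os.sep, "/"))
    return files


def compare_checkouts(first: str, second: str) -> Tuple[List[str], List[str]]:
    """Return (files only in first, files only in second)."""
    a, b = file_set(first), file_set(second)
    return sorted(a - b), sorted(b - a)
```

Two empty lists from compare_checkouts("full_checkout", "result_checkout") mean the file sets match.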
Get the dump for the modified repository:
C:\Subversion> svnadmin dump result_repo > without_externals.dmp
This dump is completely free of external dependencies, but some changes are still required. The point is that for the commits made by hand the dump records the time at which we made them, as well as the wrong user name (on Windows, usually the operating system user name). This needs to be corrected.
We open the original and the new dump in a text editor and go through the redone revisions one by one (4, 5 and 7 in our example). It will look something like this:

Now carefully replace the dates in the new dump with the original ones. You could also use the svnadmin utility and its “setrevprop” subcommand for this, but in my opinion it is quicker to do by hand in a text editor.
Replacing the date is a harmless operation, but replacing the author of a commit takes additional effort. As you can see, the values of “Prop-content-length” and “Content-length”, as well as the number above the author's name, differ between the original and the new dump. This is because the length of the author's name differs from the original. Therefore we first change the author of the commit and then update the corresponding values. Notepad++ provides a convenient way to verify that everything was done correctly:

Once the properties of all the redone commits are corrected, the stage of getting rid of svn:externals can be considered complete. However, I recommend checking afterwards that the resulting dump loads correctly into a local repository:
C:\Subversion> svnadmin create test_repo
C:\Subversion> svnadmin load test_repo < without_externals.dmp
This test can automatically detect errors made while modifying the repository. In that case the loading of the dump is interrupted at the location of the error, and we can look for the cause. This usually happens if we have corrupted the contents of a file or forgot to update a file. The error message will report a checksum mismatch.
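Most load failures after replacing the commit author come from miscounted bytes, so the arithmetic is worth spelling out. The sketch below computes by how much “Prop-content-length” and “Content-length” of a revision record change when the author is replaced. Note that the “V n” line itself can get shorter or longer. The name “Administrator” is an invented example of an operating system user; “shibaev” is the author from the log shown earlier.

```python
def prop_entry(key: bytes, value: bytes) -> bytes:
    """One property as stored in a dump: 'K n', key, 'V n', value."""
    return b"K %d\n%s\nV %d\n%s\n" % (len(key), key, len(value), value)


def author_delta(old_author: bytes, new_author: bytes) -> int:
    """Change in byte count of the property block when the author is replaced."""
    old = prop_entry(b"svn:author", old_author)
    new = prop_entry(b"svn:author", new_author)
    return len(new) - len(old)


def adjusted_length(old_length: int, old_author: bytes, new_author: bytes) -> int:
    """New value for Prop-content-length (and Content-length) of the revision."""
    return old_length + author_delta(old_author, new_author)


if __name__ == "__main__":
    # 6 bytes fewer in the name plus 1 byte fewer in "V 13" -> "V 7".
    print(author_delta(b"Administrator", b"shibaev"))
```

Lengths are counted in bytes, so a non-ASCII author name has to be encoded (as UTF-8) before it is measured.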
Moving all commits to the trunk
With the skills gained in the previous step, moving all commits to the trunk is easy. So, we have the dump without_externals.dmp, in which the first two commits are made to the repository root, and the third commit creates the trunk, tags and branches folders and moves all the files there.
Action plan:
- Create a clean repository and, in it, the empty folders branches, tags and trunk.
- Split the dump without_externals.dmp in two: revisions 0 to 2 and revisions 4 to 8. Commit number 3 is thrown out!
- In the first dump (revisions 0 to 2), add the prefix “trunk/” to all paths.
- Load both dumps into the new repository, one after the other.
- Check that no errors were made, and make a full dump of the resulting repository.
- Correct the time and author of the first commit.
Let's go. Create a clean repository (folder final_repo) and check it out (final_checkout). In the working copy, create the 3 folders, add them to the repository and commit:
C:\Subversion> svnadmin create final_repo
C:\Subversion> svn co file:///C:/Subversion/final_repo final_checkout
C:\Subversion> cd final_checkout
C:\Subversion\final_checkout> mkdir branches tags trunk
C:\Subversion\final_checkout> svn add branches tags trunk
C:\Subversion\final_checkout> svn commit -m "Prepared repository structure"
C:\Subversion\final_checkout> cd ..
We split the dump of the repository without external dependencies in two, to drop the unnecessary commit:
C:\Subversion> svnadmin create without_externals_repo
C:\Subversion> svnadmin load without_externals_repo < without_externals.dmp
C:\Subversion> svnadmin dump without_externals_repo -r 0:2 --incremental > final0_2.dmp
C:\Subversion> svnadmin dump without_externals_repo -r 4:8 --incremental > final4_8.dmp
Open the dump final0_2.dmp and replace every occurrence of “-path: ” with “-path: trunk/”. It is important to replace exactly this string (and not just “Node-path: ”), since paths occur in both the “Node-path” and “Node-copyfrom-path” headers.
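A blind search-and-replace can also hit file contents that happen to contain “-path: ”. As an alternative, here is a sketch of a header-aware rewrite: it walks the dump record by record and uses “Content-length” to skip over property and file bodies, so only real headers are touched. It assumes an uncompressed dump whose paths carry no leading slash, it leaves paths stored inside properties such as svn:mergeinfo alone, and the function name is made up.

```python
PATH_HEADERS = (b"Node-path", b"Node-copyfrom-path")


def prefix_node_paths(dump: bytes, prefix: bytes = b"trunk/") -> bytes:
    """Add prefix to Node-path / Node-copyfrom-path headers, leaving bodies intact."""
    out = []
    pos = 0
    content_length = 0
    in_headers = False
    while pos < len(dump):
        eol = dump.find(b"\n", pos)
        if eol == -1:
            eol = len(dump)
        line = dump[pos:eol]
        pos = eol + 1
        if not line:
            out.append(b"\n")
            if in_headers and content_length:
                # The blank line ends a header block: copy the body untouched.
                out.append(dump[pos:pos + content_length])
                pos += content_length
            in_headers = False
            content_length = 0
            continue
        in_headers = True
        name, sep, value = line.partition(b": ")
        if sep and name in PATH_HEADERS:
            line = name + sep + prefix + value
        elif sep and name == b"Content-length":
            content_length = int(value)
        out.append(line + b"\n")
    return b"".join(out)
```

Since the bodies are copied byte for byte, the lengths and checksums recorded in the headers stay valid. The result should still be test-loaded into a scratch repository before it is trusted.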

We load final0_2.dmp and final4_8.dmp into the new repository:
C:\Subversion> svnadmin load final_repo < final0_2.dmp
C:\Subversion> svnadmin load final_repo < final4_8.dmp
Check the SVN log for final_checkout: all commits should operate only on files in the trunk. Get the updated dump:
C:\Subversion> svnadmin dump final_repo > final.dmp
We correct the time and author of revision 1, as well as the time of revision 0. We take the time of revision 0 in without_externals.dmp as the basis and use it for revision 0 in the new dump, then add a few seconds to it and set the result for revision 1.
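Adding a few seconds to a date by hand is simple until the seconds roll over into the next minute. A tiny sketch for shifting an svn:date value follows; the timestamp in the example is made up, and shift_svn_date is a hypothetical helper name.

```python
from datetime import datetime, timedelta

# svn:date values look like 2010-12-09T17:53:09.000000Z (UTC, microseconds).
SVN_DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def shift_svn_date(value: str, seconds: int) -> str:
    """Return the svn:date value moved forward by the given number of seconds."""
    moment = datetime.strptime(value, SVN_DATE_FORMAT)
    return (moment + timedelta(seconds=seconds)).strftime(SVN_DATE_FORMAT)


if __name__ == "__main__":
    # Example value only; take the real one from revision 0 of the dump.
    print(shift_svn_date("2010-12-09T17:53:58.500000Z", 5))
```

Because the format has a fixed width, the shifted value has the same length as the original, so the byte counts in the dump do not need to be touched.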
We check that the resulting dump is in order by loading it into a repository:
C:\Subversion> svnadmin create test_final_repo
C:\Subversion> svnadmin load test_final_repo < final.dmp
If a checksum mismatch error occurs during loading, most likely the string “-path:” occurred somewhere inside the file contents and we replaced it by accident (in real repositories we cannot check every replacement by eye, because there are too many of them). Sometimes the string occurs in file contents in exactly the same form as in the node headers (for example, “Node-path: ...”), so tightening the replacement pattern does not help.
In our example the error does not occur, which means we have obtained a repository fully suitable for migration to Mercurial.
Let's sum up
We reviewed a number of techniques for modifying SVN repositories. Highlights:
- The svnadmin utility is the main tool. It can load a dump into a repository and produce a dump of the desired revisions.
- In the modified repository we can alternate “manual” commits with loading existing dumps. A typical trick: we leave the unwanted commits out of the dumps and redo (or simply skip) them in the new repository.
- Back up intermediate results as often as possible.
- For all routine operations (loading a dump, creating a dump, checkout, backup) it is convenient to use bat or bash scripts.
- When modifying an SVN dump in a text editor, do not forget to update the numeric values that give the size of the corresponding content in bytes.
- If the SVN dump is large (> 700 MB), editing it becomes a problem, since most text editors for Windows cannot open it properly. Notepad++ is no exception. I tried many different options, and only EmEditor helped; it works fine even on a machine with a small amount of RAM.
- If the repository cannot be loaded from the modified dump, we either got the byte counts wrong, forgot to update a file, or corrupted the contents of a file.
- When you have to repeat the changes of a commit yourself, it is useful to recursively delete all the .svn subfolders inside a folder. To do this, you can use a script (for Windows) that recursively deletes all .svn subfolders under the current directory:
FOR /F "tokens=*" %%G IN ('DIR /B /AD /S *.svn*') DO RMDIR /S /Q "%%G"
- Do not modify the commit messages in any way, as this changes the number of bytes in the revision description in the dump. Otherwise updating the name of the commit author becomes slightly more difficult, since you also have to account for the difference in the length of the commit message.
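For those not working on Windows, the .svn cleanup mentioned above can be sketched in Python as well. Unlike the batch one-liner, which matches any folder whose name contains “.svn”, this version removes only folders named exactly .svn; the function name is hypothetical.

```python
import os
import shutil
from typing import List


def remove_svn_dirs(root: str) -> List[str]:
    """Delete every .svn folder under root and return the removed paths."""
    removed = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if ".svn" in dirnames:
            target = os.path.join(dirpath, ".svn")
            shutil.rmtree(target)
            # Do not let os.walk try to descend into the folder just deleted.
            dirnames.remove(".svn")
            removed.append(target)
    return removed
```

On Windows, read-only files inside .svn may make shutil.rmtree fail; in that case the read-only attribute has to be cleared first.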
Conclusion
The Repository came to. Everything was mixed up in his once clear head: commits, logs, svnadmin. The veil of anesthesia had not yet completely lifted, but the Repository already felt that something inside him had changed.
Opening his eyes, the Repository first looked around: the room was spacious, but the interior was completely unfamiliar. In place of the oak chiffonier with commits stood a small shelf, and instead of the boxes of tags, a small sticker was attached to the wall with obscure lines like “6fc1d7a7ae346b0a09be5647e94c1561764b8619”. Surprisingly, though, the overall decor of the room was pleasant; everything was arranged with feeling and according to Feng Shui.
The Repository had barely had time to get comfortable when the doorbell rang. “Hello! Commit delivery” - rattled off a small man in a cap with the inscription “Mercurial”. Before the Repository could open his mouth, the stranger quickly handed him two booklets with the words “hg commit” on the cover and was gone.
The Repository remembered that he had experienced this before. A long time ago, at the very beginning, a stranger had dragged the very first “svn commit” into his apartment. Was history repeating itself? The Repository was seized with delight: very few are lucky enough to return to childhood with all their accumulated experience and knowledge, and it seemed that this was exactly what had happened! He had been wrong to stop believing in Santa Claus; quite possibly this miracle was his handiwork...
The Repository did not know what lay ahead. One thing he understood clearly: a new life had begun for him, interesting, unpredictable, full of adventure. Life on the planet Mercurial.