
As part of my job, I inherited a system with about 15 years of history and a few dozen installations in different organizations. The system itself is relatively small (~25K lines of code, ~1K commits), but release management was a problem:
- there was a main tree in subversion (initially in cvs, of course), where the “party line” was set: large-scale changes were made, new features were added, global bugs were fixed, etc.
- specific installations were created in one of two ways:
  - at best, by svn checkout, later updated via svn update; almost every installation carried local modifications made “live” (at the very least, edited configuration files), and these changes were never committed anywhere; if changes in upstream caused a conflict at the next svn update, the programmer doing the update resolved it “on the spot,” again without any tracking of changes
  - at worst, by svn export, which, of course, was never updated at all and stayed frozen (or at least until management changed its mind) at the state of the export date; in especially neglected cases (late 1990s to early 2000s) this was done because a checkout was physically impossible: the organization had no Internet access, so the archive was simply brought in on a floppy disk and deployed once
In practice, of course, the grateful customers of this system still want support from time to time: bug fixes, and sometimes even global improvements in the core of the system.
After a short discussion, we concluded that continuing to support such a distributed system in svn made no sense, and decided to migrate to git.
Problem number one, moving the master tree from svn to git, was solved fairly simply with the standard git-svn tools.
Problem set number two, how to merge the numerous forks from the different installations into the tree, we decided to tackle “as they come in.” Whenever the next organization woke up, we had to:
- get their fork
- work out which revision it was forked from and which revision it was last updated to (if it is an svn checkout)
- create a new branch for this fork
- try to split the changes into smaller, more or less semantically related pieces and commit them all to this branch
The main snag unexpectedly turned out to be step 2: working out where a given installation was forked from. With an svn checkout you could at least look at the current state of the working copy; with an svn export, guessing was not trivial. After a couple of rounds of semi-manual archaeology on the state of the code, I got bored and decided to automate the search. There was no ready-made solution (git bisect, unfortunately, is no help here), so I wrote a script to do it.
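A minimal Python sketch of this idea, not the original script, might look as follows. The function names (`tree_diff_size`, `export_commit`, `rank_commits`) are hypothetical. The sketch assumes `git` is on the PATH, exports each commit with `git archive`, and counts changed lines with Python's `difflib` instead of an external `diff`, so its numbers will not necessarily match those of any other tool.

```python
#!/usr/bin/env python3
"""Rank the commits of a git repository by how little they differ from a directory."""
import difflib
import io
import subprocess
import sys
import tarfile
import tempfile
from pathlib import Path

# Assumption: version-control metadata should not count as a difference.
SKIP_DIRS = {".git", ".svn", "CVS"}


def list_files(root):
    """Relative paths of all regular files under root, skipping VCS metadata."""
    return {
        p.relative_to(root)
        for p in root.rglob("*")
        if p.is_file() and not SKIP_DIRS.intersection(p.relative_to(root).parts)
    }


def read_lines(path):
    """Lines of a file, or [] if it does not exist; undecodable bytes are replaced."""
    if not path.is_file():
        return []
    return path.read_text(errors="replace").splitlines()


def tree_diff_size(left, right):
    """Count removed plus added lines between two directory trees."""
    left, right = Path(left), Path(right)
    total = 0
    for rel in list_files(left) | list_files(right):
        a, b = read_lines(left / rel), read_lines(right / rel)
        matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag != "equal":
                total += (i2 - i1) + (j2 - j1)
    return total


def export_commit(repo, commit, dest):
    """Unpack the tree of one commit into dest, without touching the working copy."""
    data = subprocess.run(
        ["git", "-C", str(repo), "archive", "--format=tar", commit],
        check=True, capture_output=True,
    ).stdout
    with tarfile.open(fileobj=io.BytesIO(data)) as tar:
        tar.extractall(dest)


def rank_commits(repo, candidate):
    """Return (diff_size, commit) pairs for every commit, smallest diff first."""
    commits = subprocess.run(
        ["git", "-C", str(repo), "rev-list", "--all"],
        check=True, capture_output=True, text=True,
    ).stdout.split()
    scores = []
    for commit in commits:
        with tempfile.TemporaryDirectory() as tmp:
            export_commit(repo, commit, tmp)
            scores.append((tree_diff_size(tmp, candidate), commit))
    return sorted(scores)


if __name__ == "__main__" and len(sys.argv) == 3:
    for size, commit in rank_commits(sys.argv[1], sys.argv[2]):
        print(commit, size)
```

Saved under a name such as `find_fork_point.py` (hypothetical), it would be run as `python3 find_fork_point.py /path/to/repo /path/to/fork`. Exporting every commit in full is the simplest approach rather than the fastest one; for a history of about a thousand commits that should be acceptable.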
The script takes two parameters: (1) the path to the git repository and (2) the path to the fork candidate for which you need to find the attachment point in the project's overall development tree. The script simply computes the size of the diff (in lines) between each commit in the repository and the candidate. With high probability, the commit with the smallest diff is the best place to base the branch on. The output looks like this:
3810315aaa238e32a7106312f9973f1d1f0ea097 651
19b595d87eecc43933ea60d89882319c7ac3f512 835
989cee69664733b773a4a81cc49e2a1a0cdff38a 872
9026dae1154f98018c808b73c7f1c6cd09310dc7 885
802943edf287ad28d5e71a57510400afacb49176 894
c5bd4050fce754e16664e6e1eeb57a4ff3ed06c6 894
dcb70c4a2e9fc0431ceb6154ecd1688189362622 908
...
For this output, the first commit in the list has the smallest diff, so the problem is most likely solved as follows:
$ git branch new-organization 3810315aaa238e32a7106312f9973f1d1f0ea097
$ git checkout new-organization
$ cp -r ../new-organization-fork/* .
... after which you can work through the changes, try to split them into pieces and commit them (perhaps even with --date and --author, if you can figure those out).
I would be glad if this solution proves useful to someone else. Comments and tips on how to do it better are welcome.