
Switching to a DVCS: Mercurial

Why, and for whom?


This article summarizes the main advantages of DVCS.
I have collected the arguments in favor of moving to a DVCS (namely, Mercurial) and will try to explain them accessibly for those who have no practical experience with any DVCS.

We produce software products, services, whatever. The product of our work is code, which passes through a number of stages on its way to end users: someone makes a commit, it lands on the conveyor, it moves along. The organization of the workflow around a version control system can mirror the stages of the development process. Why? You can think of the version control system as a kind of production pipeline; it really does act as a conveyor: code headed for testing first lands in the testing branch, it gets worked over there, and if everything goes well it moves on; if it fails testing, it falls off the conveyor.

Why switch to distributed systems? The joys of the transition


Local work

The first reason cited in every source is the ability to work locally.
Another "first" reason is speed. It may sound funny: "What speed do you need, for goodness' sake, for a commit or an update or whatever other operations you perform?" And it is true that in centralized version control systems you do not commit or update all that often, or otherwise talk to the remote repository much. But there is more than one side to this. First, a DVCS pushes you to work far more intensively with the repository. Second, the difference in speed is striking: you stop thinking about the cost of almost any operation; the speed is not even worth measuring, it is effectively instantaneous.
Local commits

Local commits encourage frequent commits. Why is that good? A deeper history, more granular changes, and it is easier to understand what you or others have done. The local repository is exclusively yours: until you synchronize with the central repository, yours is only loosely tied to the remote one. That lets you do all sorts of things with it; in the worst case, if you have broken something, you can delete it and clone it again from the center. That is an extreme measure, of course, for when you really do not know what you are doing. Local rollbacks, when you need to return to a previous revision, are far less painful than in SVN. Local work is another big plus. In centralized version control systems (SVN among them) a commit is often a responsible affair, a heavily weighted action that one prepares for. People are afraid to commit, and so they commit rarely.

Trust hierarchies

The ability to build hierarchies of trust. These matter for corporate projects, especially for a project with many participants of varying quality. Or say there is an open-source project and a stranger turns up who wants to help you, and perhaps really can. Trust hierarchies let you set up relationships between multiple repositories: you can pull only from the people you choose, and allow pushes only from the people you really trust. (Push is part of the new DVCS vocabulary.) In SVN, a commit lands in the main repository and happens outside your control. In a DVCS, if someone pushes into your local repository, you can take that update, review it from every angle, test it, and only then decide whether it suits your system and whether it is worth pushing on to the integration repository.
On the one hand, hierarchical relationships are good; on the other, they complicate the whole picture, and your life with it. If the hierarchy is not fully thought through and not built properly, you can end up in a situation where achieving what you want requires a whole chain of operations. If, say, you never quite decided who may do what, but the "who" and "what" are technically restricted, you may have to push to one place first, then pull from that place and push to another. Not a healthy exercise, you will agree. So yes, at first glance the point-to-center model, where you are a secondary node, the center is the main one, and all you can do is commit and update, does look simpler.

The “one project per repository” model

Enforcing the "one project per repository" model is another plus that is hard to overestimate. SVN has a common convention: you have one project, with branches, tags and a trunk, and that is where everyone commits. In practice, to put it mildly, it is not always like that. SVN, as a system that can cover a directory tree of any complexity and work roughly the same (some will say equally well, others equally badly) whatever the layout, tempts you to keep one repository for a whole area of activity. Several such areas arise that do not intersect with each other. And that is another problem: I do not know how it is solved in modern SVN, but for a long time there was no sane way to link projects in different repositories in order to do something jointly, that is, roughly speaking, to commit from a project controlled by one repository into another. Many teams end up with a very bulky setup where almost all development, all projects, live in one place, physically stored on one server and in one repository.
In any DVCS this simply will not fly. There is practically no way around it: like it or not, you have to organize one project per repository. This has its pluses and its minuses, but the pluses greatly outweigh. Conceptually it feels like the right decision, and from the technical and security point of view it looks right too. One practical case: you can grant rights to one repository to one group of developers and push rights to another repository to a completely different group, and the groups need not overlap. Some can only read here, others can only read there and write here. The point is clear: separation of rights per project.
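As one illustration of what that per-project rights split might look like: when repositories are served via hgweb, each repository's own .hg/hgrc can carry its access list. This is a hypothetical fragment; the user names are invented:

```ini
# In the billing repository's .hg/hgrc (served through hgweb):
[web]
# only the billing team may push here; everyone may read
allow_push = anna, boris
allow_read = *
```

The other repository would carry a different allow_push list, and the two groups never have to overlap.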

Deployments that pull source code are also safer in many respects. When deployment is done in the most primitive way, the way svn up or svn checkout is used with SVN, here it is safer: it is hard, even impossible, to drag onto a production server, or any "dangerous" server, more than you meant to take. You take only the one repository you are deploying, and remember that "repository" is now a synonym for "project"; in the worst case only that one project suffers. There are further advantages, first of all the absence of nested .svn subdirectories. True, SVN no longer does that either, but plenty of people are still on version 1.6 or earlier.
There is a single top-level .hg directory; these service subdirectories do not sprout in every place, needed or not, and the risk that an evil hand gets into them drops, because they sit in one place and only that one place needs protecting, not every subdirectory.
Since projects are simply separate folders, you can freely copy them from place to place and move them around your file system. This is a known sore spot in SVN: you check out a high-level project and then manipulate things inside it, which causes a pile of problems. Moving directories around inside an SVN-controlled hierarchy is a very bad idea; it can produce a lot of headaches and call for a specialist who understands very well how everything works in SVN before the situation can be untangled.
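For the strictest "take only what you deploy" case, Mercurial can also export a working tree with no VCS metadata at all via `hg archive`. A minimal sketch, with invented paths, assuming `hg` is installed:

```shell
cd "$(mktemp -d)"
hg init project && cd project
echo 'ok' > index.txt
hg add index.txt && hg commit -u alice -m "Initial"
hg archive ../deploy   # a plain tree: no .hg directory to protect
                       # (hg adds a small .hg_archival.txt marker file)
ls ../deploy
```

On the deployment target there is then nothing repository-shaped left to attack or to accidentally modify.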

Reliability of revision delivery

Another advantage of such systems is what gets called "signing of revisions", roughly speaking a reliability hash, which is then used in all operations as a means of controlling data integrity. This is about integrity checking (data integrity control), not about secrecy or defense against attacks. What matters, and what is genuinely practical, is that the system guarantees delivery of undamaged data.

A sane branching model: merging and branches

We have come to the main advantage, the very thing that pushes people to switch to DVCS.
Here is the problem. In systems with a central server (SVN) there is a complexity in branching, and in maintaining those branches, that is freely acknowledged by developers and users alike. "Maintaining" means: once you have made branches you have to pull from them, commit into them, merge them back and forth, do everything imaginable on a sprawling tree. The mechanism is still very far from ideal; it demands that you thoroughly understand what you are doing and why, that you observe a strict sequence of actions, and even then quite mysterious things often happen. One could blame the programmer who let the mysticism in, but the Internet is full of cries from people trying to merge and getting strange results. In practice the number of people actively working with branches in SVN is not large, and those renegades who do sit on SVN and actively use branches avoid merges as best they can.
Merging and branching in a DVCS are unusually simple. And it works; it just works. What works in Subversion can hardly be called a working configuration. This is not because some developers are smarter and others more stupid, or some foresaw things and others did not. The point is that a distributed system is conceptually built around branching and the subsequent merging of those changes. A simple example: if you and a colleague each take a copy of the repository from some place and work with it independently, committing all along and producing your brilliant changes, then when you try to push back you discover that the model, consistent in the simplest case where one revision follows another, no longer looks so linear at the point where you merged. This is especially noticeable if you push now and then and pull revisions now and then; in short, a situation with multiple heads is nothing special here, not some kind of emergency, it is a perfectly normal working state. In other words, if a distributed system did not have sane merging and sane branching, it simply would not work at all. And switching between branches is made extremely simple.

SSH repository access

Access can be full or restricted; that is up to the administrator. You connect over ssh straight to the remote repository you need, and depending on your rights you can push to it and pull from it.

Easy deployment of integration repositories, central repositories and replicas

There is no special difficulty in standing up a remote repository that will accept your pushes and serve your pulls, none at all.

Theoretically better scalability

Some praise DVCS for better scalability: when one server can no longer cope for some reason, it is supposedly easy to attach two, five, ten more. A somewhat sly advantage, because in practice I never ran into a bottleneck where the server was the choke point. Moreover, the requirements placed on the "central" repository (central only by convention, since it is a repository just like yours that everyone happens to access), on access to it and on its performance, are much milder here: clients talk to it far less often.

What do you lose when switching to DVCS


A fly in the ointment: loss of model simplicity

The very first thing you lose, and it is of course a serious loss, is the simplicity of the model. In SVN everything is simple: there is one place where you commit, one place from which you update, and its state is understandable and more or less predictable. Now the model has changed completely: you have a set of independent repositories. Even advanced specialists who have understood everything in theory run head-first into this complicated model in practice; the idea that your repository is not at all like mine, and that my revision numbers have nothing in common with your revision numbers, does not fit easily into one's head.

A conditional minus: you have to read the docs

Mercurial is still too complicated to explain "on fingers". It is relatively simple, simpler than many other distributed systems, but you nevertheless have to read at least a minimum of documentation. It has to be done: you cannot just sit down and start using it to the full. For complete understanding you need to read the documentation.
Really large repositories can also be a problem, simply because a big repository with its whole history can take a long time to download when cloning over a bad channel. Then again, you do that only once.

Ways to branch

The crowning way to do this is clones. The idea: if you have a repository in place A, you can make a clone of it in place B, work with it and commit there; from the moment of cloning the two copies are completely unrelated, and you can work on the different copies in parallel. This is the most primitive kind of branch. Merging later is not hard at all: you pull repository B into repository A and then merge in the standard way. The main advantage of this method is that it is hard to confuse anything: you have two different places, they are independent, and switching from one branch to the other requires a fully conscious action. Getting rid of such a branch is just as easy, because nothing special is required: you can simply forget about it, or delete it at the file-system level, and that is that, the branch is gone. Once you are done with the work, it can be removed, and no trace of the branch remains.
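A "clone as a branch" looks like this in practice. A minimal sketch with invented paths, assuming `hg` is installed (here the main repository has no new commits of its own, so after the pull a plain update suffices; with divergent work you would `hg merge` as described above):

```shell
cd "$(mktemp -d)"
hg init A
( cd A && echo base > base.txt && hg add base.txt && hg commit -u alice -m "Base" )
hg clone A B                # place B: an independent "branch"
( cd B && echo exp > exp.txt && hg add exp.txt && hg commit -u alice -m "Experiment" )
cd A
hg pull ../B && hg update   # take the experiment back into place A
rm -rf ../B                 # or just delete the clone: the "branch" is gone
```

Deleting the directory really is the entire branch-removal procedure; no repository-side bookkeeping is left behind.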

This method has its disadvantages too, and because of them some people hardly use it at all. First, a standalone copy of the repository can be a troublesome thing from the project-management side: your development environment may not enjoy projects jumping from place to place, when switching to a test, development or unstable version means changing the path to the project; in some situations that is plainly inconvenient. The second problem: if the cloned repository must itself be accessible, you have to push twice, once to your main repository and once to this one. The copies are so independent that this already creates certain problems, and supporting both versions, I mean both branches, requires double the gestures. And for a large repository taking a clone can be costly, although there are optimizations here: if the server is far away, you do not need to take the clone from the remote server twice; take one into place A and then clone B locally from A. Mercurial is smart enough not to copy the files for a local clone; it makes hard links instead, which also saves disk space, and above all speed, which for large repositories can be significant.

The second (also conditional) way of branching is bookmarks: you can give some revision a name and start working from that revision. In principle, if you understand the concept of multiple heads, all these methods boil down to one thing: you grow different heads. With bookmarks you simply give one of the heads a meaningful name, say featureA, and work with it. Strictly speaking, you do not even have to name it: if you can remember its ID, you can work with it anonymously, and that is yet another, very similar way of branching, where you name nothing and just update (here "update" means "switch to a head") to a specific revision, to a specific head, and work from there. A bookmark is merely an optimization of that method so that you do not forget what you had in mind.
Now about the merits of bookmark branching. First, it is a very fast and very clear method, much more understandable than making clones, even though for some reason clones are considered the default path. All branches live in a single place; you make no physical copies (clones).

Named branches, mentioned above: practitioners say this is the most correct way. What is called hg branch is, in their opinion, logical, understandable and quite safe. Unlike clones, you do not have to maintain two parallel development trees and somehow accompany both of them, and you do not have to mirror those changes in your environment: your project sits in one place, and from time to time you simply switch it to one branch or another. Switching is elementary, done with the standard update command plus the branch name.
Naturally, in this method, as in the others, switching from one branch to another is very fast. Named branches are global and eternal: the branch name is part of the standard metadata (meta-information) that Mercurial maintains, and the branch information will always be there (unlike in git). If you push your repository, this information is pushed too; it survives any clone, any transfer. In my view this is the right way, and it is even hard to call it a special advantage; this is simply how it should work.
Two shortcomings are usually counted against this method. Some say it is a complex method, not very clear (it is hard to agree with that). The second is that it slightly spoils the concept of warnings and self-monitoring against excessive multi-headedness. When you try to push something that delivers a new head, and that new head is part of your main branch, you probably have something to think about; it depends on your logic and your agreements, of course. But when you create a new branch, the very first push tells you that you will have to use "-f" because a new head is being created. In this case you were deliberately creating a new head, and such a message can be a little surprising and alarming: "why is it complaining, I seem to be doing everything right?"

And that is about it for the shortcomings. From the point of view of merging these branches, everything is nearly identical in all the cases: you need to deliver the information somehow, so with a clone you pull, and in the other cases you run a merge, pointing at the head or the branch you are merging with, and that is all. From that point on everything is the same for every way of branching and merging. I hope I have more or less described the cases where the transition makes sense. People who have switched to DVCS believe the move made sense, enormous sense, and do not regret for a minute that they left SVN and got acquainted with the new and wonderful world of DVCS.

In addition


Rebase

The problem with rebase. Why is it bad? Information is lost. It is one of those things whose results are not obvious at a glance, and it is not immediately clear why one would do it. To avoid a long string of commits of the "almost finished feature 1", "feature 1 is 85% done" variety, this feature lets you turn the work into one commit per feature and push that. The danger of rebase is not that we squash several of our commits into one; that, strictly speaking, is not rebase but a related basic operation. The danger is this: we made 10 commits ourselves, then pulled an update from the repository, where Petya had made 10 commits of his own. If we are honest, the history shows a branch point, Petya's changes, our changes, and a point where at some moment they came together into one source tree. If we rebase, we do the following: we force the version control system to forget that our changes were made in parallel with Petya's, transplant them on top of Petya's changes, and then commit the result. Along the way, this "transplanting" makes changes to the code in order to reconcile the state after Petya and before us. The history is monstrously distorted: looking at such a "flattened" run of commits later, we cannot tell that the work was done in parallel with Petya, or what actually motivated the merge.
So how exactly can history be lost when rebasing? Here is the chance: you rebase a particular branch at a particular revision, while another developer had already branched off from your branch after that point; once you rebase, his starting point is no longer obvious. This is the case of two developers working on a public branch, and for a public branch history must not be rewritten. To avoid such confusion, never rebase public branches; rebase only your own private ones.

The shelve extension

Shelving is a valuable thing. Suppose we are working on a feature, writing code... writing... writing. Suddenly we spot a bug in the code, and we have already made changes. We have several options: we can fix the bug and commit everything together, which is all bad; we can stash our changes somewhere by hand, fix the bug in a separate commit, then restore the changes; or other crooked methods. Mercurial's shelve extension does the following: we wrote part of a feature, suddenly found a bug, ran shelve, and our changes are hidden away; the working copy looks as it did before the changes. We fix the bug and commit it, then run unshelve and our changes come back, now on top of the fixed code. We finish the feature and commit it.

Repository managers

There are repository managers for Mercurial. They let you find a file in a project, or search by comment or by content:

rhodecode — the site has a demo; you can see it in action.
phpHgAdmin
built-in light-weight web server
hGate
All solutions are free.

P.S. Based on Umputun's podcasts and Artur Orlov's report.

Update
Corrected the part of the article which said that the bookmark branching method lacks portability.
In fact, bookmarks are synchronized between repositories automatically.

It was also worth mentioning mercurial-server. The name is a bit confusing: it is not the Mercurial server.
mercurial-server provides an improved management interface over the shared-ssh mechanism that hg-ssh provides.

Source: https://habr.com/ru/post/170339/

