
How development workflow affects task decomposition



One of the most important factors influencing the speed of development and the success of a project launch is the correct decomposition of the product manager's idea into tasks that can actually be programmed. How do you do it right? Take the product manager's scenario for the new feature and start coding immediately? Write acceptance tests first, and then the code that makes them pass? Or shift everything onto the developers' shoulders and let them decide during scrum poker?



Let's think it through and identify the problems that can arise when splitting up tasks, and ways to solve them. This post covers the basic principles of task decomposition when working in a team. My name is Ilya Ageev, and I am the head of QA at Badoo. Today I'll tell you how workflow influences decomposition, how testing and deployment differ for the tasks that result from decomposition, and what rules should be followed so that the development process runs smoothly for all participants.



Why is it important?



Keep in mind that development is not just the act of writing code. When we talk about development, I urge you to look at the whole process, from the formulation of the task to the stable operation of the feature for our users. If you ignore the stages that precede and follow coding, it is very easy to end up in a situation where everyone is busy, hits their KPIs, and receives bonuses, yet the result is deplorable. The business is going under and competitors are strangling it, but everyone is doing a great job.



Why does this happen? It's simple: human psychology makes people look at situations from the point of view of their own comfort. A developer does not always want to think about what will happen to the code after it is written. The problem is solved, and that's good enough. He is rarely interested in what comes next (that is why we IT people work in this industry in the first place: our motivation rests mainly on how interesting the tasks are), and dealing with people involves so much uncertainty. Many developers feel much more comfortable sitting at a computer and concentrating on their own interesting task, blockchains with neural networks. They don't want to be distracted by thoughts about product managers, deadlines, or the users who will later use their ingenious work (or, worse, start criticizing it!).



This is neither bad nor good: we value developers precisely for their thoughtful and competent solutions to technical problems. But a narrow view of problems often halts growth, and not only the growth of specific people but of the company as a whole. After all, the company can grow and its corporate culture can improve only if each employee grows. So it is important for us to get out of the cocoon from time to time and force ourselves to look at problems more broadly in order to stimulate that growth.



And, of course, if a stage as important as decomposition is entrusted to a person who looks at everything exclusively from the point of view of his own convenience, there is a real risk of running into a lot of problems at subsequent stages: when merging the results of his work with the results of others, during code review, testing, deployment to production, and so on.



So, when deciding how to break up a particular task and estimating where to start and where you should end up, it is important to take as many factors as possible into account and not to look at the problem only from your own vantage point. Sometimes, for things to work faster and more efficiently at later stages, you have to do something harder and slower at the stage you are responsible for.



A good example is writing unit tests. Why should I spend my precious time writing tests if we have testers who will test everything afterwards? Because unit tests are needed not only to make coding easier; they are also needed at the subsequent stages. And they are needed like air: with them, integration and regression checking speed up tens or hundreds of times, and the test automation pyramid rests on them. And that is without even counting how they speed up your own work: after touching the code somewhere, you need to make sure you have not accidentally broken something, and one of the fastest ways to do that is to run the unit tests.



Workflow



To somehow formalize the relationships between participants in the process, many teams agree on rules for working together: coding standards, a common workflow in the version control system, a release schedule, and so on.



Needless to say, if you agree on a process without taking the entire life cycle of a feature into account, you can end up with slowdowns and pitfalls later on, especially as the project and the company grow. We haven't forgotten about premature optimization, but if there is a process that works well at different scales, why not use it from the start?



Speaking of development workflow, many Git users immediately (and wrongly) recall some kind of "standard git-flow", consider it ideal and correct, and often adopt it themselves. Even at conferences where I talked about workflow at Badoo, I was asked several times: "Why did you invent your own? Why don't you use standard git-flow?" Let's figure it out.





First of all, when people talk about this flow, they usually mean the well-known branching diagram. It comes from Vincent Driessen's article "A successful Git branching model", which describes a scheme that worked quite successfully on several of his projects (that was back in 2010).



Today, some large players in the code hosting market offer their own flows, criticizing the "standard git-flow" and describing its flaws; they provide their own schemes, recommendations, and techniques.



If you search git-scm.com (googling wouldn't hurt either), you will be surprised to find that there is no recommended (let alone "standard") workflow. That's because Git is essentially a framework for storing versions of code, and how you organize that storage and collaboration is entirely up to you. Always keep in mind that if some flow "took off" on certain projects, that does not mean it will fit you equally well.



Secondly, even within our company different teams have different flows. The flow for PHP server code, the flow for C/C++ and Go daemons, and the flow of the mobile teams all differ. And we did not arrive at this immediately: we tried various options before settling on something concrete. By the way, it is not only the workflow that differs between these teams but also the testing methodologies, the way tasks are set, the releases, and the delivery principle itself: what is delivered to your own servers and what is delivered to end users' computers (smartphones) cannot, by definition, be developed in the same way.



Thirdly, even an accepted workflow is a recommendation rather than an indisputable rule. Business tasks differ, and it is good if you have managed to choose a process that covers 95% of cases. If your current task does not fit into the chosen flow, it makes sense to look at the situation pragmatically: if the rules prevent you from working effectively, to hell with such rules! But be sure to consult your manager before making a final decision, otherwise chaos may ensue. You may simply fail to take into account some important points known to your manager. Or perhaps everything will go like clockwork, and you will manage to change the existing rules in a way that leads to progress and becomes the key to growth for everyone.



If everything is so complicated, and even the flow is not dogma but only a recommendation, then why not use a single branch for everything: master in Git or trunk in SVN? Why complicate things?



To those who look at the problem one-sidedly, the single-branch approach may seem very convenient. Why bother with branches and sweat over stabilizing the code in them when you can write the code, commit (push) it to the common repository, and enjoy life? And indeed, if the team is small, it can be convenient, since it removes the need to merge branches and to organize release branches. However, this approach has one very significant drawback: the code in the common repository may be unstable. Vasya, working on task #1, can easily break the code of other tasks in the common repository by pushing his changes, and until he fixes them or rolls them back, the code cannot be deployed, even if all the other tasks are ready and working.



Of course, you can use tags in the version control system and a code freeze, but the tagging approach is obviously not very different from the branching approach, if only because it complicates the initially simple scheme. And a code freeze certainly does not speed up the work, since it forces all participants to stop development until stabilization and release are done.



So the first rule of good task decomposition is this: tasks should be broken down so that they land in the common repository as logically complete pieces that work on their own and do not break the logic around them.



Feature branches



For all the variety of workflows in our company, they share one feature: they are all based on separate branches for features. This model lets us work independently at different stages and develop different features without interfering with each other. And we can test them and merge them into the common repository only after making sure that they work and do not break anything.



But this approach has its drawbacks too, which stem from the very nature of feature branches. In the end, after the isolation, the result of your work has to be merged into a place common to everyone. At this stage you can run into a lot of problems, from merge conflicts to very long testing and bug fixing. After all, by branching off, you isolate not only the common repository from your changes but also your code from other developers' changes. As a result, when the time comes to merge a task into the common code, even if it is tested and working, the struggle begins, because Vasya and Petya touched the same lines of code in the same files in their branches: a conflict.



Modern version control systems have plenty of handy tools, merge strategies, and so on. But conflicts still cannot be avoided entirely. And the more changes there are, and the more intricate they are, the harder and longer these conflicts are to resolve.



Even more dangerous are conflicts in code logic, when the SCM merges the code without problems (because there are no conflicting lines in the files), but, because development was isolated, some shared methods and functions changed their behavior or were removed altogether. In compiled languages the problem seems less acute, since the compiler validates the code. But the situation where method signatures have not changed while the logic has is still possible. Such problems are hard to detect; they push the happy release further away and force you to retest the code again and again after each merge. And when there are many developers, a lot of code, many files, and many conflicts, everything turns into hell, because while we were fixing and rechecking the code, the main version of the code moved far ahead, and we have to repeat everything all over again. Still don't believe in unit tests? Hehehehe!
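As a rough illustration of such a logic conflict, here is a minimal Python sketch. All names in it (`price_in_cents`, `format_receipt`) are hypothetical and not taken from any real codebase. The idea is that a unit test pins down the behavior of a shared helper, so if another branch changes its logic while keeping the same signature, the merge stays textually clean but the test fails right away.

```python
import unittest


# Hypothetical shared helper. Imagine Vasya's branch changes it to return
# whole currency units instead of cents: the signature stays the same,
# so the SCM merges the change without any textual conflict.
def price_in_cents(amount: float) -> int:
    """Convert a price in currency units to integer cents."""
    return round(amount * 100)


# Hypothetical code from Petya's branch that relies on the helper
# returning cents.
def format_receipt(amount: float) -> str:
    cents = price_in_cents(amount)
    return f"{cents // 100}.{cents % 100:02d}"


class ReceiptContractTest(unittest.TestCase):
    def test_helper_returns_cents(self):
        # Pins down the contract of the shared helper itself.
        self.assertEqual(price_in_cents(12.5), 1250)

    def test_receipt_formatting(self):
        # Fails if the helper's logic silently changes after a merge.
        self.assertEqual(format_receipt(12.5), "12.50")
```

Running the tests after every merge (for example with `python -m unittest`) takes seconds, whereas finding the same problem by manual retesting could take much longer.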



To avoid this, many people try to merge the common work into their branch as often as possible. But if the feature branch is large enough, even following this rule will not help avoid problems, no matter how hard we try, because you receive other people's changes in your code but nobody sees your changes. So you need not only to merge others' code into your branch more often but also to merge your code into the common repository more often.



Hence the second rule of good decomposition: feature branches should contain as few changes as possible so that they get into the common code as soon as possible.



Parallel work



Fine, but how do you work in separate branches if several programmers are working on the same task, split into parts? Or if they need changes in parts of the code shared by different tasks? Both Petya and Vasya use a common method that should work one way for Petya's task and another way for Vasya's. What should they do?



Here a lot depends on your release cycle, because we consider a task complete at the moment it is live in production. After all, only that moment guarantees that the code is stable and working. Provided, of course, that you did not have to roll the changes back from production.



If the release cycle is fast (you deploy to your servers several times a day), it is quite possible to make the features depend on each other by readiness stage. In the example with Petya and Vasya above, we create not two tasks but three. The first one reads "change the common method so that it works in two variants" (or "add a new method for Petya"), and the other two are Vasya's and Petya's tasks, which they can start after the first one is completed, without overlapping or interfering with each other.
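The three-task split above could look roughly like this in code. This is only a sketch under assumptions: the shared method `render_username`, the `Mode` enum, and the two callers are invented for illustration. Task 1 extends the common method so that it supports both variants while keeping the old behavior as the default; tasks 2 and 3 then build on it independently.

```python
from enum import Enum


class Mode(Enum):
    FULL = "full"        # existing behavior, used by Vasya's task
    COMPACT = "compact"  # new variant needed by Petya's task


# Task 1: the common method now works in two variants.
# The default keeps the old behavior, so existing callers are unaffected.
def render_username(first: str, last: str, mode: Mode = Mode.FULL) -> str:
    if mode is Mode.COMPACT:
        return f"{first} {last[:1]}."
    return f"{first} {last}"


# Task 2 (Vasya): uses the original scenario.
def profile_header(first: str, last: str) -> str:
    return f"Profile: {render_username(first, last)}"


# Task 3 (Petya): uses the new scenario, without touching Vasya's code.
def chat_title(first: str, last: str) -> str:
    return f"Chat with {render_username(first, last, Mode.COMPACT)}"
```

Once task 1 is merged and released, the other two branches change different functions and should not conflict with each other.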



If the release cycle does not allow frequent deployment, the approach described above becomes prohibitively expensive, because Vasya and Petya would have to wait days or weeks (and in some development cycles, a year) before they could start working on their tasks.



In this case you can use an intermediate branch that is shared by several developers but is not as stable as the branch deployed to production (master or trunk). In our flow for mobile applications this branch is called Dev; in Vincent Driessen's scheme it is called develop.



It is important to keep in mind that any change to the code, including branch merges and merging shared branches into the stable master, must be tested (remember the conflicts in code and in logic?). So if you conclude that you need a shared code branch, be ready for one more testing stage: after the merge you need to test how the feature integrates with the rest of the code, even if it has already been tested in its own branch.



Here you may object that you could test just once, after the merge. Why test before it, in a separate branch? True, you could. But if the task in the branch does not work or breaks the logic, this broken code will get into the common repository and will not only prevent colleagues from working on their tasks by breaking parts of the product; it can also plant a time bomb if someone decides to build new logic on top of the faulty changes. And when there are dozens of such tasks, it is very hard to find the source of the problem and fix the bugs.



It is also important to understand that even if we use an intermediate development branch, which may not be the most stable, the tasks or their pieces in it should be more or less complete. After all, we need to release at some point. And if the features' code in this branch breaks each other, we will not be able to ship anything, because our product will not work. So after testing how the features integrate, you need to fix the bugs as soon as possible. Otherwise we end up in a situation similar to using one branch for everything.



Hence the third rule of good decomposition: tasks should be divided so that they can be developed and released in parallel.



Feature flags



But what about a situation where a new change in business logic is large? Just programming such a task can take days (weeks, months). Surely we are not going to merge unfinished pieces of features into the common repository?



Oh yes, we are! There is nothing wrong with that. The approach that applies here is feature flags. It is based on introducing switches (or flags) into the code that enable or disable the behavior of a particular feature. By the way, the approach does not depend on your branching model and can be used with any of them.



A simple and clear analogy is a menu item for a new page in an application. While the new page is being developed piece by piece, the menu item is not added. But as soon as everything is finished and put together, we add the menu item. It is the same with a feature flag: we wrap the new logic in a condition that checks the flag and change the behavior of the code depending on it.



The last task in developing a big new feature will then be "enable the feature flag" (or "add the menu item" in the new-page example).
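A minimal sketch of the menu example with a feature flag might look like this. The `FLAGS` dictionary, the flag name `new_page`, and the menu contents are assumptions made for illustration; in a real project the flag would more likely come from configuration or a flag service.

```python
# Hypothetical in-memory flag storage; a real project would likely read
# flags from configuration. The feature is off until it is complete.
FLAGS = {"new_page": False}


def is_enabled(flag: str) -> bool:
    """Return True if the given flag is switched on (unknown flags are off)."""
    return FLAGS.get(flag, False)


def build_menu() -> list:
    """Build the application menu; the new item appears only behind the flag."""
    menu = ["Home", "Messages", "Settings"]
    if is_enabled("new_page"):
        # New logic wrapped in the flag check: unfinished pieces of the page
        # can be merged and deployed without users ever seeing the item.
        menu.insert(2, "New page")
    return menu
```

With this layout, the final "enable the feature flag" task reduces to changing `FLAGS["new_page"]` to `True`.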



The one thing to keep in mind when using feature flags is that feature testing takes longer. After all, the product must be tested twice: with the flag turned on and with it turned off. You can economize here, but you should do so with great care: for example, test only the flag state that is deployed to users. Then, during development (and piece-by-piece deployment), the tasks will not be tested at all and will be tested only when the last task, "enable the feature flag", is verified. But then you must be prepared for the integration of the feature's pieces not to go smoothly once the flag is turned on: bugs introduced at early stages may surface, and finding the source of the problem and fixing the errors can be costly.
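Part of the double testing can be automated. The sketch below assumes a hypothetical `greeting` function guarded by a hypothetical `friendly_greeting` flag and uses `subTest` from Python's standard `unittest` module to check both flag states in one test. It illustrates the idea only and does not replace integration testing of the whole product.

```python
import unittest


def greeting(name: str, flags: dict) -> str:
    """Hypothetical feature: the new wording is guarded by a flag."""
    if flags.get("friendly_greeting", False):
        return f"Hi, {name}!"
    return f"Hello, {name}."


class GreetingFlagTest(unittest.TestCase):
    def test_both_flag_states(self):
        # Expected output for each state of the flag.
        expected = {False: "Hello, Ilya.", True: "Hi, Ilya!"}
        for state, text in expected.items():
            # subTest reports each flag state separately if it fails.
            with self.subTest(friendly_greeting=state):
                flags = {"friendly_greeting": state}
                self.assertEqual(greeting("Ilya", flags), text)
```

This way the old and the new behavior are both checked on every run, so a bug introduced in an early piece of the feature has a chance to show up before the flag is enabled.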



Conclusion



So, when decomposing tasks, it is important to remember three simple rules:



  1. Tasks should be in the form of logically complete pieces of code.
  2. These pieces of code should be small and should fall into the common code as quickly as possible.
  3. These pieces should be developed in parallel and deployed independently of each other.


What could be simpler? By the way, independent deployment is, in my opinion, the most important criterion. One way or another, the other points follow from it.



I wish you good luck in developing new features!




Source: https://habr.com/ru/post/335254/


