They say you should not reinvent the wheel. At first glance this seems obvious: if you have already spent time developing something, why do it again when you can simply reuse the old solution? From every angle it looks like a good deal. But it is not that simple. As an old "gray-haired" programmer, I have watched organizations fall victim to this belief over and over again, investing up front in design and development but never achieving the massive ROI that reuse promised. In fact, I believe that our overly optimistic assessment of the benefits and ease of reuse is one of the most common and dangerous pitfalls in software development.
I think the root of the problem is what Daniel Kahneman formulated as the rule "What you see is all there is". In a nutshell, it describes how people make quick decisions using only the information immediately at hand plus a few basic heuristics. Slowing down to think deliberately takes time and discipline, so instead we tend to substitute simpler problems for the complex ones we do not fully understand.

In the case of reuse, intuition offers a simple and convincing physical-world analogy for software: the "wheel" that should not be reinvented. It is a convenient mental model that we fall back on whenever we make reuse decisions. The problem is that this model of reuse is wrong, or at least depressingly incomplete. Let's see why...
(A brief caveat: I am talking about large-scale reuse, not reuse at the level of methods and functions. I am not disputing the DRY principle at lower levels of granularity. I also mean reuse of services, libraries, and other assets built inside the company, not outside it. I am not recommending that you write your own JS MVC framework! Okay, now back to the originally planned article...)
Intuition
Imagine a system A that contains some logic C inside it. Soon a new system B must be built, and it needs the same basic logic C.

Of course, if we simply extract C from system A, we can use it in system B without re-implementing it. The savings equal the time it took to develop C: it is implemented once instead of once more for B.

In the future, even more systems will need the same common code, so the benefits of extracting and reusing it should grow almost linearly. For each new system that reuses C instead of building its own implementation, we save another amount equal to the time spent implementing C.
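To make the intuition concrete, here is a minimal sketch of that happy path. The class and both "systems" are invented for illustration: the shared logic C lives in one place, and A and B simply call it.

```java
// Hypothetical shared logic "C", extracted into one place.
final class TaxCalculator {
    double calculate(double amount) {
        return amount * 0.2;               // the logic we do not want to write twice
    }
}

// System A already used this logic...
final class SystemA {
    private final TaxCalculator c = new TaxCalculator();
    double totalWithTax(double net) { return net + c.calculate(net); }
}

// ...and the new system B simply reuses the same class instead of re-implementing it.
final class SystemB {
    private final TaxCalculator c = new TaxCalculator();
    double totalWithTax(double net) { return net + c.calculate(net); }
}
```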
Again, the logic is simple and seemingly ironclad: why build several implementations of C when you can build it once and then reuse it? The problem is that the real picture is more complicated, and what looks like an easy springboard to ROI can turn into an expensive straitjacket. Here are a few ways our basic intuition about reuse can fool us...
Reality
The first problem is extraction. Intuition says that C can be popped out of A as cleanly as a building block, nice and easy. The reality of disentangling common code tends to be different: you try to pull one strand of spaghetti out of the bowl and discover that the whole dish is a single noodle. It is usually not quite that bad, of course, but code is full of hidden dependencies and connections, so the original notion of C's scope keeps growing as you start to untangle it. It is almost never as easy as you expected.

In addition, C almost always needs other things in order to work (other libraries, utility functions, and so on). In some cases these are shared dependencies (that is, both A and C need them), in some cases not. Either way, the simple picture of A, B, and C may no longer look so simple. For this example, assume that A, B, and C all rely on a common library L.
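Continuing the hypothetical sketch, here is how that dependency spreads: because C calls into the utility library L, system B, which only wanted C, now carries L as well.

```java
// --- library "L": utility code that C happens to rely on (invented for illustration) ---
final class MoneyUtils {
    static double round(double v) { return Math.round(v * 100) / 100.0; }
}

// --- shared logic "C" depends on L... ---
final class TaxCalculator {
    double calculate(double amount) {
        return MoneyUtils.round(amount * 0.2);
    }
}

// --- ...so system B, which only asked for C, now has L on its classpath too,
//     and must care about L's versions, bugs, and upgrade schedule. ---
final class SystemB {
    private final TaxCalculator c = new TaxCalculator();
    double totalWithTax(double net) { return net + c.calculate(net); }
}
```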
Another problem is change: different users of C will often have slightly different requirements for what it should do. For example, C may contain a function that needs to behave a little differently when called from A than when called from B. The common solution is parameterization: the function takes a parameter that tells it how to behave depending on who is calling. This can work, but it increases C's complexity and clutters its logic, because the code fills up with blocks like "if the caller is A, run this branch of logic."
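A minimal sketch of that caller-aware parameterization, with the enum, the rounding rules, and the class all made up for illustration:

```java
// The hypothetical TaxCalculator from above, now parameterized by its caller.
enum Caller { SYSTEM_A, SYSTEM_B }

final class TaxCalculator {
    double calculate(double amount, Caller caller) {
        double tax = amount * 0.2;
        if (caller == Caller.SYSTEM_A) {
            return Math.floor(tax * 100) / 100.0;   // A wants the result rounded down
        }
        return Math.round(tax * 100) / 100.0;       // B wants ordinary rounding
        // Each new consumer (D, E, ...) tends to add yet another branch here.
    }
}
```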
Even if C really is a perfect fit for A and B, changes will almost certainly be needed once new systems, say D and E, come along. They might be able to use C as is, but more likely C will need some refinement, small or large, for each of them. Again, every new adaptation of C adds complexity, and what used to be easy to understand becomes much harder as C grows into something bigger in order to satisfy the needs of D, E, F, and so on. Which leads to the next problem...
As the complexity grows, it becomes harder for developers to understand what C does and how to use it. For example, a developer on team A may not understand a parameter of some C function because it exists only for systems E and F. In most cases some level of API documentation becomes necessary (perhaps Swagger, Javadoc, or more) to explain inputs and outputs, error conditions, and other expectations or SLAs. And although documentation is a good thing in itself, it brings problems of its own (it has to be kept up to date, for example).
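For instance, the hypothetical caller-aware function from the earlier sketch might end up needing Javadoc like this just so its consumers can call it safely:

```java
/**
 * Calculates the tax owed on the given amount.
 *
 * @param amount non-negative net amount in the caller's base currency
 * @param caller the consuming system; SYSTEM_A rounds the result down,
 *               all other callers use ordinary rounding (added for system B)
 * @return the tax owed
 * @throws IllegalArgumentException if {@code amount} is negative
 */
public double calculate(double amount, Caller caller) {
    if (amount < 0) {
        throw new IllegalArgumentException("amount must be non-negative");
    }
    double tax = amount * 0.2;
    return caller == Caller.SYSTEM_A
            ? Math.floor(tax * 100) / 100.0
            : Math.round(tax * 100) / 100.0;
}
```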
Another consequence of growing complexity is that it becomes harder to maintain quality. C now serves many masters, so its tests must cover many more edge cases. And since many other systems now use C, the impact of any single bug is amplified: it can surface in any or all of them. Often, when making a change in C, it is not enough to test the shared component or service alone; some level of regression testing is also needed for A, B, D, and every other dependent system (whether or not that system actually uses the part of C that changed!).
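Staying with the same invented sketch, even a trivial invariant of C now has to be exercised once per consuming system; a JUnit 5 parameterized test makes that multiplication explicit (and it still says nothing about the regression runs the dependent systems themselves need):

```java
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.EnumSource;

class TaxCalculatorTest {

    @ParameterizedTest
    @EnumSource(Caller.class)                 // one run per consuming system
    void taxIsNeverNegative(Caller caller) {
        double tax = new TaxCalculator().calculate(100.0, caller);
        assertTrue(tax >= 0.0);
    }
}
```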
Again, since we are talking about reuse at a non-trivial scale, C will likely have to be developed and maintained by a separate team, which can lead to a loss of autonomy. Separate teams usually have their own release schedules, and sometimes their own development processes. The obvious consequence is that if team A needs an enhancement in C, it will probably have to go through C's process: a champion from team A must supply the requirements, defend their priority, and help with testing. In other words, team A no longer controls its own destiny with respect to the functionality C implements; it depends on the team that delivers C.
Finally, as C gets updated, different versions appear by definition. Depending on the nature of the reuse, different problems arise. With build-time reuse (a library, for example), the individual systems (A, B, and so on) can stay on the versions that work for them and choose the right moment to upgrade. The downside is that multiple versions of C now exist, and a single bug may have to be fixed in all of them. With runtime reuse (a microservice, for example), C must either support multiple versions of its API in a single instance, or simply upgrade without backward compatibility and thereby force A and B to upgrade as well. Either way, the demands on reliability and on the rigor of the processes and organizations supporting such reuse go up significantly.
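As a rough illustration of the runtime case, here is what that multi-version support often looks like, assuming a Spring-style microservice (the endpoints and payloads are invented):

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// A single running instance of "C" keeps old callers working by exposing
// several API versions side by side.
@RestController
public class TaxController {

    @GetMapping("/v1/tax")                    // original contract, still used by system A
    public double taxV1(@RequestParam double amount) {
        return amount * 0.2;
    }

    @GetMapping("/v2/tax")                    // newer contract introduced for system B
    public TaxResponse taxV2(@RequestParam double amount) {
        return new TaxResponse(amount, amount * 0.2);
    }

    public record TaxResponse(double amount, double tax) {}
}
```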
Conclusion
In sum, my point is not that large-scale reuse should be avoided, but that it is not as simple as intuition suggests. Reuse is genuinely hard, and while its advantages may still outweigh its disadvantages, those disadvantages need to be weighed realistically and discussed up front.
Even if, after careful analysis, large-scale reuse is the right call, you still need to decide how to do it. Experience suggests being careful with the direction of the dependency arrows. Reuse where the "re-user" is in control is almost always easier to build and easier to manage than reuse where the reusable asset calls into your system. In the example above, if C were a library or a microservice, A and B would retain control. In my opinion, this speeds up implementation and reduces the management and coordination burden in the long run.
Turning C into a framework or platform reverses the dependency arrows and complicates control: A and B are now beholden to C. This kind of reuse is not only harder to implement (and to get right), it also creates stronger lock-in (that is, A and B become completely dependent on C). Popular wisdom says that "a library is a tool, a framework is a way of life."
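A compact sketch of that difference in dependency direction, with all classes invented for illustration: with a library, A calls C when it wants to; with a framework, C owns the control flow and calls back into code that A is obliged to supply.

```java
// Library style: A stays in control and calls into C when it chooses.
final class TaxLibrary {                       // "C" as a library
    double calculate(double net) { return net * 0.2; }
}

final class SystemA {
    private final TaxLibrary tax = new TaxLibrary();
    double totalWithTax(double net) {
        return net + tax.calculate(net);       // A decides when and how C runs
    }
}

// Framework style: the arrow is reversed. C owns the flow and calls back into A,
// so A must implement C's interface and live with C's rules and release schedule.
interface TaxPlugin {
    double adjust(double tax);
}

final class TaxFramework {                     // "C" as a framework/platform
    double run(double net, TaxPlugin plugin) {
        double tax = net * 0.2;
        return plugin.adjust(tax);             // the framework decides when A's code runs
    }
}

final class SystemAPlugin implements TaxPlugin {
    @Override
    public double adjust(double tax) {
        return Math.max(tax, 0.0);             // A's logic, executed on C's terms
    }
}
```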
Finally, I would like to hear your opinions on, and experience with, large-scale reuse. When does it work, and when does it fail?