I love to work on greenfield projects. However, looking back over the last 8 years, I have worked almost exclusively on legacy code and technical debt. It took me some time to accept that fact, but now I can extract joy from such often painful tasks. Writing this article is one such joy, and I want to share it with you.
All code is a liability
Labeling some components as technical debt and others as well-designed or clean is a cruel oversimplification. Every line of code eats resources. All of your code is cost, a liability. First we have the cost to write the code. Then we need to run it, causing infrastructure cost. Finally, we have to maintain it.
The difference between technical debt and clean software is a difference in degree, not in kind.
You want your code to generate more value than it costs. Code which does not deliver on this promise gets labelled technical debt because its maintenance costs are too high. But an inefficient component which gobbles up resources, or crashes and frustrates users and engineers alike, is in essence the same as the stable, scalable solution we enjoy working with. Its return on investment is just far lower than we expected. Even that cut-off is arbitrary, because technical debt may well have a positive return on investment. It is a difference of degree, not kind.
There are no technical problems
Much has been written about the technical aspects of technical debt and little about the human role in it. As engineers we look at technical debt as a purely technical issue. We want to refactor or replace the component, because every time we touch the component we pay the interest. By refactoring the component we pay off the principal. Paying off the debt might mean replacing unmaintained dependencies, migrating to a different data structure, switching databases or frameworks, rewriting inefficient algorithms or separating tightly coupled components. It’s all technical. But is it really?
The database is a piece of tech. The choice of database is a decision made by a bunch of humans. The judgement that this choice is ill-fitted is again a human interpretation. The efficiency of an algorithm can be calculated. But if we assume that it takes more effort and time to write a more efficient algorithm, then the development cost can quickly outweigh the savings in resources. Rather than finding the most efficient algorithm, we are better off trying to find a good enough algorithm. Good enough, by its very nature, is a subjective, human interpretation. The same goes for labels like technical debt.
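To make the trade-off concrete, here is a minimal sketch of the break-even arithmetic. The function name and all figures are hypothetical and chosen only for illustration; real costs are rarely this easy to pin down.

```python
def break_even_months(extra_dev_cost: float, monthly_savings: float) -> float:
    """Months until the extra development cost is paid back by resource savings."""
    if monthly_savings <= 0:
        return float("inf")  # the optimisation never pays for itself
    return extra_dev_cost / monthly_savings


# Hypothetical figures: extra engineering effort for the more efficient
# algorithm versus the reduction in the monthly infrastructure bill.
extra_dev_cost = 8000.0   # assumed cost of the extra development effort
monthly_savings = 200.0   # assumed monthly saving in infrastructure cost

months = break_even_months(extra_dev_cost, monthly_savings)
print(f"Break-even after {months:.0f} months")
```

With these assumed numbers the more efficient algorithm needs 40 months to pay for itself. Whether that is acceptable is not something the code can answer. It is a human call.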
The system is never wrong or right, it just is. What changes is our perception, when our expectations don’t match the system’s functioning. So by its nature no technical debt is purely technical, there is always a human component. That means we can’t simply go and look for the root cause in the code, we need to debug the human aspect as well.
Consider the context
Let’s take the simplest example of a terribly incompetent developer who drops all data because they wrote:
if delete_all_data = true
instead of
if delete_all_data == true
There is one technical reason: An assignment operator instead of a comparison. But there are many human aspects to this:
- How did this person get hired?
- Who is responsible for the quality of the product?
- Was the data supposed to be backed up?
- Who reviewed their code?
All of these questions have to be answered not by the developer or the technology but by the organisation. The organisation is designed in a way that lets this technical debt exist. The organisation supplied a hotline for anyone to take out instant loans in the name of the company.
When organisations evolve, technical debt appears
Most forms of technical debt have a less clear origin story than the one above. Your tech debt often grows slowly over time and you won’t recognise it’s an issue until it’s too late. But why is it that while writing code we often don’t see the debt? Because in the current context it’s probably the best solution. Only when circumstances change does our interpretation of the code flip from good enough to technical debt.
A well-known example is the monolithic application. Monoliths appear naturally because they are simply a reflection of the organisation: there are no separate departments or business units. You all sit together in the same room with everyone’s roles somewhat intermingled. There are rarely merge conflicts or release issues because the team is tied closely together, just like the components in the monolith. This is known as Conway’s law:
Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.— Melvin E. Conway
Only once the company grows and the team gets split into multiple teams does the monolith become technical debt. Of course, the technical issues were always present; it’s just the human side which shifted, and with it came new questions:
- How do you coordinate releases?
- How do you split responsibilities?
- Who is responsible for production issues?
Organisational and technological evolution at different paces
The organisation chart is highly elastic. The executive team can change it with an hour-long meeting and one email afterwards. Business units and team borders are abstract and quickly redefined on paper. Our code, however, is less elastic. Even a single-responsibility microservice probably takes longer to adapt and deploy than drafting the re-org email. This creates friction. The organisation chart dictates one way people should organise, but people’s work still revolves around the structure the code mandates.
Think of a team member who has been working on the same part of the code for the last two years. The latest update to the organisational chart transfers her to another team. However, the team which is now in charge of that component at the very least needs a handover. More likely, they will keep asking her questions and adding her to pull request reviews as the go-to expert on the codebase.
Considering less elastic parts of an organisation
If we want to have a well-functioning organisation, we have to consider the less elastic parts of the system. When our organisation grows, we need to make space for the growing pains to heal. We need to make time and room for our slower, less elastic systems to adapt.
Here are three guidelines to make this a reality in your organisation:
1. Make time for transitions
Engineers need time to hand over a code base to another team. The teams require time to adapt the code to suit the new communication paths. The need for such transitions is often clearly visible when multiple teams or business units get restructured. But in fact it arises all the time: when a single person leaves, when product decisions are taken or when new information is discovered.
2. Make your technical debt visible
You can’t fight an enemy you can’t see. So start with visualising your technical debt and then see how you can tackle it. Find the hotspots of your technical debt: the parts which engineers often interact with and which cause them a lot of pain. I wrote about how to measure technical debt before, so I won’t go into more detail here.
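As a rough illustration of what finding hotspots can look like, the sketch below ranks components by how often they are touched multiplied by how painful they are to work with. The scoring formula, the field names and the data are all assumptions made for this example, not an established method.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    changes_last_quarter: int  # how often engineers touched it (assumed input)
    pain_score: int            # 1 (pleasant) to 5 (painful), e.g. from a team survey


def hotspots(components: list[Component], top: int = 3) -> list[tuple[str, int]]:
    """Rank components by change frequency multiplied by pain score."""
    scored = [(c.name, c.changes_last_quarter * c.pain_score) for c in components]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top]


# Hypothetical data for illustration only.
components = [
    Component("billing", changes_last_quarter=40, pain_score=5),
    Component("search", changes_last_quarter=55, pain_score=1),
    Component("legacy-export", changes_last_quarter=2, pain_score=5),
    Component("auth", changes_last_quarter=20, pain_score=3),
]

for name, score in hotspots(components):
    print(f"{name}: {score}")
```

Multiplying the two factors means a painful component that nobody touches ranks low, and so does a frequently changed component that is pleasant to work with. Only the combination of both makes a hotspot.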
3. Implement feedback loops
When your organisation changes, you need to monitor the impact of the change. Establish metrics and feedback loops which report the health of your tech stack and delivery process. Have a close look at metrics like cycle time (how long it takes to get a piece of work from start to finish) and defect rate. There are many more ways in which your code can feed information back to you and your organisation, but it’s important to get started with the basics first.
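Both basic metrics can be computed from very little data. The following is a minimal sketch, assuming each work item records a start date, a finish date and whether it later caused a defect. The record layout and the definition of defect rate used here (share of finished items that led to a defect) are assumptions for this example, and your tracker may define them differently.

```python
from datetime import date
from statistics import median

# Hypothetical work items: (started, finished, caused_defect)
work_items = [
    (date(2024, 3, 1), date(2024, 3, 4), False),
    (date(2024, 3, 2), date(2024, 3, 9), True),
    (date(2024, 3, 5), date(2024, 3, 7), False),
    (date(2024, 3, 6), date(2024, 3, 16), False),
]


def median_cycle_time_days(items) -> float:
    """Median number of days from starting to finishing a work item."""
    return median((finished - started).days for started, finished, _ in items)


def defect_rate(items) -> float:
    """Share of finished work items that led to a defect."""
    return sum(1 for _, _, caused_defect in items if caused_defect) / len(items)


print(f"Median cycle time: {median_cycle_time_days(work_items)} days")
print(f"Defect rate: {defect_rate(work_items):.0%}")
```

Tracking these two numbers before and after a re-org gives you a first, crude signal of whether the change helped or hurt delivery.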
Think beyond the code
You ship your org chart. If you are not aware of this, you are doomed to run into the same issues over and over again.
I hope I have given you a new perspective on technical debt, one that is not solely focused on your code, best practices or design patterns. The code is only one part of your technical debt, and we tend to overemphasise it. By also looking at the human side of technical debt, we can find more efficient ways of dealing with it.