Systems and Team Drift
As an organisation grows, the patterns for architecture and the teams that form around them shift. It's common for this shift to cause a drift between the structure of teams and their communication patterns on one side, and the architecture they support on the other. This is one of the challenges we are facing at MHR.
What is drift?
Drift is the gap between the intended architectural design and the design that actually emerges from how the teams building it are formed and how they communicate.
An example would be a monolithic application built by a small number (3-4) of small teams (4-6 people). The tech stack is simple: a back-end (BE) language, plus HTML, CSS and JS for the front-end (FE). Two teams build the product, each containing FE and BE engineers, alongside separate teams for infra/ops and design.
As the application grows and we create more teams, the decision is made to move the front end onto a modern library like React, Angular, or Vue, as doing so allows you to create a central repository of shared design components. How do we do it?
Do you create a new centralised team of FE engineers? Do you create a new small matrixed team of FE engineers? Do you move the design into the Product Teams and have a matrixed team of Design and FE?
This is where drift starts. For example, you create the new shared library of code and leave the teams as they are because you want them to "own their domain". The org grows even larger, introducing more domains and teams. Communication between all these teams becomes less frequent, and the disconnect between them becomes more significant. These are all natural growth patterns in any org. What do you think happens to the shared library?
You start to see more and more code in it that relates to a single team and domain, a drift from the intent of a centrally shared library. If you have never heard of it, this is Conway's law in practice.
Conway's Law: "Organisations design systems that mirror their own communication structure."
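One rough way to make the shared-library drift described above visible is to count how many components in the library are used by only one team. The sketch below is illustrative only: the component names, team names, and the `USAGE` records are made up, and in practice you would have to derive them from your own import or ownership data.

```python
from collections import defaultdict

# Hypothetical usage records: (shared component, team that uses it).
# These names are invented for illustration, not taken from a real code base.
USAGE = [
    ("Button", "team-a"),
    ("Button", "team-b"),
    ("Button", "team-c"),
    ("DatePicker", "team-a"),
    ("DatePicker", "team-b"),
    ("PayslipTable", "team-a"),
    ("ShiftGrid", "team-c"),
]


def single_team_components(usage):
    """Return the components that only one team uses, sorted by name."""
    teams_by_component = defaultdict(set)
    for component, team in usage:
        teams_by_component[component].add(team)
    return sorted(
        component
        for component, teams in teams_by_component.items()
        if len(teams) == 1
    )


def drift_ratio(usage):
    """Share of 'shared' components that are really single-team code."""
    components = {component for component, _ in usage}
    if not components:
        return 0.0
    return len(single_team_components(usage)) / len(components)


if __name__ == "__main__":
    print(single_team_components(USAGE))
    print(drift_ratio(USAGE))
```

A ratio that creeps upwards over time would suggest the library is turning into a collection of team-specific code rather than a shared one. It is a signal to investigate, not a verdict.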
You can break down communication structure into "team structure", "incentives", and "operating models". But for now, we will focus on the interplay between Teams and Architecture. The other aspects are articles all of their own.
You can picture how much more complicated this gets when you start to think about breaking the monolith into microservices and introducing a modern DevOps model. The drift can be significant, both in how often it occurs and in how large it becomes. It happens naturally because the technology system and the teams building it do not change at the same rate. The path is not linear.
So back to MHR ...
Our Teams and Architectural drift
When the latest product was first created, it was sensibly created out of the engineering group that existed at MHR. As this engineering group built the product, they designed the architecture around the desired best practices for the system, and formed the teams around existing structures and processes. Both were sensible decisions to expedite delivery. However, this created a drift between the desired product architecture and the teams from the outset, and this drift has since grown as both the product and the teams building it have grown. It's time for us to correct this drift so that we can move forward in the direction we want to take.
Analysing the drift
What is the drift at MHR? As mentioned in the opening article in this series, our SDLC is based on the DSDM Framework, and our teams are built around this model. This means each team has a Software Delivery Lead, who acts as both the line manager for the team members and the project manager responsible for delivering the work on schedule. The teams are then made up of a mixture of FE and BE engineers and Quality Analysts. All teams are supported by a central Architecture and Platforms team. The size of each team is proportional to the size and complexity of the product area it is responsible for. All the teams contribute to a number of repositories relating to "services", as they attempted to build a microservice-style architecture. However, the communication and operating patterns of teams growing within the DSDM framework have resulted in a more monolithic architecture than the intended design. The result is a semi-distributed code base with a lot of tight coupling. This is the natural result of the drift between the intended architecture and how the business scaled the teams.
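To give a feel for what "tight coupling" in a semi-distributed code base can look like, here is a minimal sketch that inspects a service dependency map. The service names and the `DEPENDS_ON` map are hypothetical, not our actual services. The sketch flags pairs of services that depend on each other (and so are hard to release independently) and counts how many other services rely on each one.

```python
# Hypothetical dependency map: service -> services it calls or imports from.
# The names are invented for illustration.
DEPENDS_ON = {
    "people": {"payroll", "absence"},
    "payroll": {"people"},
    "absence": {"people", "payroll"},
    "reporting": {"people", "payroll", "absence"},
}


def mutual_dependencies(graph):
    """Return sorted pairs of services that depend on each other."""
    pairs = set()
    for service, deps in graph.items():
        for dep in deps:
            if service in graph.get(dep, set()):
                pairs.add(tuple(sorted((service, dep))))
    return sorted(pairs)


def fan_in(graph):
    """Count, for each service, how many other services depend on it."""
    counts = {service: 0 for service in graph}
    for deps in graph.values():
        for dep in deps:
            counts[dep] = counts.get(dep, 0) + 1
    return counts


if __name__ == "__main__":
    print(mutual_dependencies(DEPENDS_ON))
    print(fan_in(DEPENDS_ON))
```

Mutual dependencies are only one symptom of coupling; shared databases or lock-step releases would not show up in a map like this.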
Steering into the drift
We've noticed a drift happening at MHR; now, how are we going to correct it? This is where many companies get it wrong. The default answer for many companies at this point is to suggest a re-org. If you're unfamiliar with the term (lucky you), it's when you take what you have, rip it apart, and rebuild it into new teams.
The problem with a re-org is that it kills team morale, breaks existing communication structures, and often results in a few people being "out of place" in the new structure. Usually, those left "out of place" are either made redundant or merged into teams with no sense of purpose, often resigning not long after. Re-orgs are complex and dangerous; they're a last resort, not the first tool in your tool belt. That's not what we want to do. Instead, we will correct our drift the way you'd correct drift in a car: steer into it, making minor adjustments until you're back on track.
We're starting by adjusting our team structures to align more closely with our intended architecture. Borrowing from Team Topologies, we want our teams to own vertical slices of the platform and be able to deliver independently (Stream Aligned). To help us do this, we are moving teams away from DSDM to Scrum (as per Part 1) so that they focus on delivering value in their product stream. As we do this, we are changing the team structures. We are re-deploying our Software Delivery Leads into positions as Agile Delivery Managers (Scrum Masters) or Engineering Managers, based on their background and skill sets. Depending on that transition, we will hire to fill the missing ADM or EM. Then we will break down the larger product areas into smaller ones and split each of the larger teams (10+) into two smaller (5-8), closely aligned teams (even sharing the ADM and EM where appropriate).
This aligns our team structures and communication more closely with our desired microservice architecture: small, independent teams that release and communicate change outwards without relying on other teams.
You'll see how all the concepts outlined in Part 1 connect here. To straighten up the direction of travel, the next correction we will need to make is breaking apart our monolithic and highly coupled "microservices". Each new team will begin to work on this, taking the areas they are responsible for and transitioning them into truly independent services, so that we can begin to create the architecture to support our independent team structures. In essence, this gives us independent teams working on independent services in a microservice architecture.
How we do this, and how we identify the right boundaries in our domain so that we transition and split teams correctly, will be a topic for another post, assuming we get it right and don't "over-correct" only to find we need to adjust again later. Let's wait and see.
Lessons to learn
The key things we have identified, and the key lessons I hope you can take away here, are:
- Drift is natural in a growing organisation.
- Correcting drift, as you notice it, is always better than a re-org.
- Conway's law applies, and your team structure affects your architecture.
- You can't course-correct all at once. You need to make small adjustments; otherwise, as when correcting drift whilst driving, you can over-correct and drift in another direction.