Scaling really large products

Cross-functional teams let your product scale quickly, but over time cracks appear and delivery speed can suffer. We share four different approaches that can help teams continue to perform as they scale.

In a previous article, I wrote about how to structure product teams for scale. Each team needs to be autonomous because otherwise the number of dependencies between teams will increase and slow down delivery for everyone.

An example of an e-commerce product separated into Catalogue, Search, Cart, Listings, Fulfillment and Analytics.

But autonomous teams will also run into problems as the product scales.

Large products require a lot of changes, either to support the additional use cases of different customers or simply to manage the load as usage grows. You could increase team sizes, but this has diminishing returns because teams that are too large become difficult to align and coordinate. Another option is to reduce the scope of each team and increase the overall number of teams. This can work up to a point, but eventually a team can no longer deliver a meaningful feature on its own, and you create new delivery dependencies between teams.
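To make the coordination cost concrete, here is a rough sketch that counts the possible one-to-one communication paths in a team of a given size. It assumes every pair of people may need to align with each other, which is a simplification, and the function name is purely illustrative.

```python
def communication_paths(people: int) -> int:
    """Return the number of distinct pairs in a group: n * (n - 1) / 2.

    Assumption: every pair of team members is a potential coordination path.
    """
    if people < 0:
        raise ValueError("people must be non-negative")
    return people * (people - 1) // 2


# Doubling the team size roughly quadruples the number of pairs.
for size in (5, 10, 20):
    print(f"{size:>2} people -> {communication_paths(size):>3} possible pairs")
```

Under this assumption, headcount grows linearly while the number of pairs grows quadratically, which is one way to see why adding people to a team gives diminishing returns.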

Luckily there are a few different techniques that companies can use to continue scaling their products.

Centralised Teams / Centres of Excellence (CoEs)

A common approach is to isolate some skills into a separate centre of excellence, or centralised team. Many companies have centralised research and design teams that offer services to, and support the work of, the autonomous product teams.

Unfortunately, Centres of Excellence are an anti-pattern. According to research in the State of DevOps Report 2019, CoEs are correlated with poor business performance because they become bottlenecks, hinder innovation, and create dependencies that slow down the delivery process.

Team Topologies, a fantastic book about how to structure teams, does describe a style of Centre of Excellence that the authors call Complicated Subsystem teams. These teams are specifically for products where the skill set is so difficult to decentralise that the overhead of collaboration is justified. Their examples include machine learning and legacy mainframe systems that require bespoke technical skill sets.

Research and design activities do not fit this description, so creating separate CoE teams for these functions should be avoided.

Platforms

There are some activities that every team will need to perform. Things like authentication, logging, message handling and reporting are needed by all value streams. There will also be product-specific functionality that is required across teams, such as inventory management, payment processing, recommendation engines and more. These are commodity processes that customers expect, but they are not differentiators for your product.

By moving these elements of the product into a centralised platform, you can free up development effort in all of the value stream teams. This leaves them with more capacity to focus on higher-value, differentiating work.

In addition to the technical platforms, there is also an opportunity to move some design work into a design system platform and some research work into a research participant sourcing platform.

An example of a product supported by a technical platform, design system as well as other internal products like finance and legal.

There are trade-offs, however. Identifying how much work should be abstracted out to the platform is a delicate balancing act. If too much is abstracted out, teams might require continuous platform changes to support their new features, and instead of having autonomous teams you end up with a central bottleneck for all work. Fortunately, there are practices like Wardley Mapping that teams can use to identify the commodity elements of their products.
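One way to watch for this bottleneck is to track how often a delivered feature needed a platform change first. The sketch below is illustrative only: the `Feature` record, the sample data and the 30% warning threshold are all assumptions, not figures from any study, and the team names are borrowed from the e-commerce example earlier in this article.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A delivered feature (hypothetical record for illustration)."""
    name: str
    team: str
    needed_platform_change: bool


def platform_dependency_rate(features: list[Feature]) -> float:
    """Share of features that could not ship without a platform change."""
    if not features:
        return 0.0
    blocked = sum(1 for f in features if f.needed_platform_change)
    return blocked / len(features)


def rate_by_team(features: list[Feature]) -> dict[str, float]:
    """Same rate, broken down per value stream team."""
    grouped: dict[str, list[Feature]] = defaultdict(list)
    for feature in features:
        grouped[feature.team].append(feature)
    return {team: platform_dependency_rate(fs) for team, fs in grouped.items()}


# Made-up sample data for the sketch.
delivered = [
    Feature("Synonym matching", "Search", needed_platform_change=False),
    Feature("Personalised ranking", "Search", needed_platform_change=True),
    Feature("Saved carts", "Cart", needed_platform_change=False),
    Feature("Split payments", "Cart", needed_platform_change=True),
    Feature("Bulk product import", "Catalogue", needed_platform_change=False),
]

WARNING_THRESHOLD = 0.3  # assumed value; pick one that suits your context

overall = platform_dependency_rate(delivered)
if overall > WARNING_THRESHOLD:
    print(f"{overall:.0%} of features needed a platform change: review platform scope")
for team, rate in rate_by_team(delivered).items():
    print(f"{team}: {rate:.0%}")
```

A rising rate suggests that differentiating work has leaked into the platform, while a per-team breakdown shows which teams are most affected and which capabilities might be handed back to them.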

Enabling Teams

Reducing scope is not the only way to increase the performance of a team. Making teams more efficient also increases throughput. Individuals within empowered product teams are often isolated from their peers, so upskilling can be difficult. This is where Enabling Teams come in. Enabling Teams are functionally aligned teams whose remit is to improve the efficiency of each value stream team.

A product supported by ProductOps, ResearchOps, DesignOps and DevEx enabling teams.

Every team is performing research, design and delivery activities. Rather than having every team reinvent the wheel, Enabling Teams can publish best practice guides for different circumstances to help speed teams up. They can also encourage the sharing of knowledge across teams through communities of practice.

Finally, they can define what is expected within their function for each level of seniority and create training curricula to help people upskill to the required competence levels.

Using a combination of these tactics, Enabling Teams can improve the performance of the value stream teams.

Site Reliability Engineers (SREs)

When you empower product teams, one of the core principles is “You build it, you run it”. The idea is that, in the past, developers would throw sub-par code over the wall to operations, who would then be responsible for running it in production. By making the team who wrote the code run it, there is a lot more emphasis on making sure the code is performant, robust and maintainable. However, the overhead of running the code does detract from building new features.

Google had this issue because the scale of their applications meant that teams needed very particular skills to operate their products effectively. They created an SRE role that would behave like the Operations teams of the past, but with a difference: SREs needed to be able to code as well as operate the infrastructure, because they may need to alter the programs that they manage.

Most companies do not need this separation, as most products do not have hundreds of millions of users. While this is a great pattern for offloading work at very high volumes, most companies would be better off investing in deployment automation, instrumentation and monitoring.

Product Separation / Duplication

Long before software development, companies faced the challenge of organisations becoming too large to manage effectively. Their solution was to separate either into business units or even into completely separate companies. In effect, they decided that it was better to duplicate all of the functions of the business to reduce the overhead of coordination.

For example, GE decided to split into multiple companies based on the industries that they were targeting, such as GE Healthcare, GE Energy, Synchrony Financial, Current and more. Other companies, such as Nestlé and Procter & Gamble, are managed through independent geographical business units.

Tech companies have followed a similar approach. Uber operates separate product offerings in different geographies with local autonomy over product development. Revolut have a split between their B2B and B2C product offerings, even with an underlying shared banking platform. And many companies offer product versions specialised for the needs of industry verticals like public service, healthcare, finance and more.

This is the most expensive way of reducing complexity, and it can lead to inconsistencies in user experience across the different geographies, customer segments or industries, so it should be chosen only as a last resort.

Conclusion

The best approach is to start with Enabling Teams to make better use of the knowledge and skill sets that you already have. In parallel, you can cautiously abstract items out into platforms, constantly monitoring to ensure that you have not crossed the boundary where changes that the stream teams want to make require changes in the platform. When you cross that boundary, don’t be afraid to remove some scope from the platform and return it to the teams.

As you continue to scale, you can start looking at more drastic measures like separation into different business units, and even different product lines. There is a lot of efficiency to gain from the earlier approaches, though, before you move into these areas.