How we’ve influenced organizational behavior with discussion, alignment, and tooling.
Note: Article originally featured on Medium and co-written with César Lugo, Kao Felix, and Javier Gómez as part of the Engineering Intelligence Team.
The story is an old one: an engineer sees an opportunity for improvement. Maybe it’s the logs that look wildly different across every service they maintain, making everything harder to debug. Or it could be error handling that is inconsistent: sensible in one service, confusing in another, and missing altogether in yet another. It turns out people are tired of repeating the same comments in code reviews, spending time arguing about form instead of the feature. “We should make it consistent” and “this should be a standard” are common reactions in this scenario.
In a small team with a few dozen repositories, adopting standards might require the work of just a couple of motivated engineers. However, in a growing organization with hundreds of engineers split across dozens of teams, working on hundreds of repositories, with conflicting priorities and communication overhead, motivation alone won’t do the trick.
At Typeform, we have spent more than a year refining an inclusive framework for technical organizational change. The framework covers the whole lifecycle: proposal, discussion, approval, tracking, and accountability.
It starts with a standard proposal
Anyone can submit a standard proposal. Having an open process encourages people with different roles and perspectives to participate: a proposal could come from platform engineers, product engineering leads, infrastructure engineers, or even managers. It also helps surface possible drawbacks, blockers, and improvements, since comments come both from the engineers implementing the changes and from the stakeholders of any potential improvements in quality, velocity, reliability, and so on.
Lifecycle of a proposal:
1. Draft: Collaborative stage where authors work together to bring a proposal to life. They follow a template that guides them in creating a proposal that:
- Is well reasoned.
- Is supported by data.
- Has considered and compared alternatives.
- Has a plan for implementation.
- Has a strategy for measuring adoption.
2. Feedback: The proposal document is ready for async review by domain experts, and anyone who is interested can comment. Observations, support, improvements, and objections are raised at this stage, shaping the final version of the proposal. When enough feedback has been received and all points have been addressed, the proposal moves to the next stage.
3. Final Decision: This stage is a meeting with the final decision maker (by default, our Director of Platform), the author(s) of the proposal, and any specialists in the area, when needed. Pros and cons are weighed, feedback is reviewed, and a decision is made: Moving Forward or Not for Now.
4. Moving Forward:
- The decision is communicated during the Engineering All Hands meeting.
- The documentation describing how to adopt the standard is rolled out.
- Plans and objectives for adoption are set.
- Adoption tracking is implemented and displayed on an open dashboard.
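For the programmatically inclined, the lifecycle can be pictured as a small state machine. The sketch below is purely illustrative (our process lives in documents and meetings, not code), and the assumption that a Not for Now proposal can be reworked into a new draft is ours:

```typescript
// Illustrative model of the proposal lifecycle, not a real system.
type Stage = "Draft" | "Feedback" | "Final Decision" | "Moving Forward" | "Not for Now";

// Allowed transitions between stages.
const transitions: Record<Stage, Stage[]> = {
  "Draft": ["Feedback"],
  "Feedback": ["Final Decision"],
  "Final Decision": ["Moving Forward", "Not for Now"],
  "Moving Forward": [],       // terminal: the standard rolls out
  "Not for Now": ["Draft"],   // assumption: a rejected proposal can be reworked
};

function advance(current: Stage, next: Stage): Stage {
  if (!transitions[current].includes(next)) {
    throw new Error(`Invalid transition: ${current} -> ${next}`);
  }
  return next;
}
```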
Set your adoption metrics
The evaluation of a new standard is defined and implemented in code.
- A dashboard visualizing the adoption rate of all standards over time is open to everyone in the organization.
- Company-wide and team metrics are available.
- The adoption rate is communicated frequently at all-hands meetings and in weekly scheduled notifications.
Our Engineering Intelligence team has put in place a system to evaluate, visualize, and communicate the state of standards adoption over time. This keeps approved ideas from being swept under the rug of day-to-day work and important new features. The dashboard also gives us a way to quantify our accumulated technical debt and maintainability costs, which leads to more data-informed decision-making.
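To make “defined and implemented in code” concrete, here is a minimal sketch of the aggregation that could sit behind such a dashboard. The `CheckResult` shape and helper names are our illustration, not Typeform’s actual system:

```typescript
// One evaluation of one standard against one repository.
interface CheckResult {
  standard: string; // e.g. "github-actions-migration"
  repo: string;
  team: string;
  passed: boolean;
}

// Company-wide adoption rate of a standard: passing repos / evaluated repos.
function adoptionRate(results: CheckResult[], standard: string): number {
  const relevant = results.filter((r) => r.standard === standard);
  if (relevant.length === 0) return 0;
  return (relevant.filter((r) => r.passed).length / relevant.length) * 100;
}

// Per-team breakdown, like the team-level view on the dashboard.
function adoptionByTeam(results: CheckResult[], standard: string): Map<string, number> {
  const byTeam = new Map<string, CheckResult[]>();
  for (const r of results.filter((x) => x.standard === standard)) {
    byTeam.set(r.team, [...(byTeam.get(r.team) ?? []), r]);
  }
  const rates = new Map<string, number>();
  for (const [team, rs] of byTeam) {
    rates.set(team, (rs.filter((r) => r.passed).length / rs.length) * 100);
  }
  return rates;
}
```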
A key part of the success of this evaluation system is that if a team owns a domain, that team also owns the code that evaluates standards in that domain.
For example, one of our Platform Teams proposed and implemented a standard to migrate our repositories from Travis CI to GitHub Actions. The Intelligence team provides the infrastructure and the system for the evaluation to take place, but the Platform team owns the specific code that measures the adoption of this migration, and it also supports other teams in complying with the standard.
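As a hedged sketch of that ownership split, the checker below shows what the Platform team’s code for the migration might look like. The `StandardChecker` interface and the file-based heuristic are assumptions for illustration, not the actual implementation:

```typescript
import { existsSync, readdirSync } from "fs";
import { join } from "path";

// Hypothetical contract a domain team implements when contributing a checker.
interface StandardChecker {
  standard: string;
  check(repoPath: string): boolean;
}

// Owned by the Platform team: a repo complies once Travis CI is gone
// and at least one GitHub Actions workflow exists.
export const githubActionsMigration: StandardChecker = {
  standard: "github-actions-migration",
  check(repoPath: string): boolean {
    const hasTravis = existsSync(join(repoPath, ".travis.yml"));
    const workflowsDir = join(repoPath, ".github", "workflows");
    const hasWorkflows =
      existsSync(workflowsDir) && readdirSync(workflowsDir).length > 0;
    return !hasTravis && hasWorkflows;
  },
};
```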
Report what matters
- Adoption rate objectives are set by management
- Capacity and roadmap planning is done by teams
- Progress is updated in recurring ceremonies
Rolling out a new objective needs to be done with enough time and visibility that teams get a good head start on complying with it. Initially, standards were required as soon as they were approved, but we received push-back from teams with lots of new features and priorities on their plate. So we reached a sweet spot: if a standard is approved in one quarter, all teams are required to comply with it by the end of the next quarter. This has worked well so far: we have a stream of new standards becoming mandatory each quarter, and teams are fully prepared to work on them.
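Expressed as code, the deadline rule above is a short calculation. This is a sketch; the date handling is our own, not part of any real tooling:

```typescript
// A standard approved in quarter Q must be adopted by the end of quarter Q+1.
function complianceDeadline(approvedOn: Date): Date {
  const quarter = Math.floor(approvedOn.getMonth() / 3); // 0 = Q1 ... 3 = Q4
  const firstMonthAfterNextQuarter = (quarter + 2) * 3;  // may overflow into next year
  // Day 0 of a month is the last day of the previous month; Date normalizes overflow.
  return new Date(approvedOn.getFullYear(), firstMonthAfterNextQuarter, 0);
}

// e.g. approved 2023-02-10 (Q1) -> due 2023-06-30 (end of Q2)
// e.g. approved 2023-11-05 (Q4) -> due 2024-03-31 (end of Q1 next year)
console.log(complianceDeadline(new Date(2023, 1, 10)).toDateString()); // Fri Jun 30 2023
```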
Without the technical leadership behind this framework, no amount of effort would move the needle. Our leadership team has committed to reserving capacity, aligning stakeholders, communicating clearly, and setting the stage for teams to report and track their improvement. This allows us to influence the behavior of the engineering organization in a consistent, coherent, and measurable way.
A few bumps along the way
This idea has been around Typeform in many shapes and forms for a while now, but along the way we faced some challenges. Here is a non-exhaustive list of them, which we hope you can avoid:
- Finding the right visualization was challenging. The secret was listening to our users. We started with ugly terminal screens, then moved to ugly tables in Google Spreadsheets, and finally landed on Looker, where we built polished, highly customized dashboards. Now we can visualize adoption on a timeline, filter by team, or drill into specific evaluations for more detail.
- We built a bot to automatically check for standards adoption. At first, we didn’t get many contributions from other teams because the bot was originally written in Python, which isn’t widely used by our engineers. Once we rebuilt the bot in our main language, TypeScript, we suddenly received many more contributions. The teams’ response was so overwhelming that we had to create a process for submitting new checker scripts.
- We tried to get teams to adopt standards just by raising awareness. For a couple of months we kept trying, with very little to show for it. Only when every team’s goals included a certain level of adoption, and teams were asked to report on their progress weekly, did adoption increase dramatically across the whole organization.
- Keeping old codebases up to standard was a lot of work. Through this process, teams became more aware of the deep technical debt they had accumulated. This led to discussions about retiring old services vs. the cost of maintaining them, with real data to inform the conversation.
Results in unexpected places
So does any of this make a difference? So far, we’re seeing good results. Here are a few examples:
- We created a mental framework for tracking adoption. Now everyone who has an idea for a standard also naturally asks: “How are we going to measure the adoption?”
- Product and Platform engineering teams are proactively planning and working on adopting standards.
- Because new ideas create real impact and more work, more people are contributing to and challenging proposals.
- After our most recent framework improvements, we have increased our adoption rate from around 30% to over 90% in 5 months.
- We are currently tracking 18 approved company-wide standards.
- We are starting to share a common language when we talk about maintainability and technical debt, which makes them visible to non-technical stakeholders in the organization.
- By using this framework, we migrated our CI/CD provider from Travis CI to GitHub Actions in a couple of quarters, without scheduling a special migration operation or relying on the typical constant reminders from Platform.
- Security standards are now adopted in a quick and transparent manner for the Security Team, which can even view the percentage of protection in real time.
Recommendations for getting started
In the past, we struggled with the same problem that most of you reading this have: it takes a lot of effort and patience for a standard to be fully adopted across an organization. If you’re looking for a way to make this process scale inside your company, we recommend that you:
- Create an open process for proposing standards. This helps get everyone on board and aligns both managers and engineers around common standards.
- Build a way to check adoption automatically. This lets you focus on the most important actions: driving conversations with teams, understanding their backlogs, and helping them prioritize.
- Be creative. This solution is working great for us at the moment, but it might not transfer exactly into your org. Identify your blockers and bottlenecks, and adapt in ways that will work for your teams.
Conclusions
Producing change is not easy, especially when core product development is competing for resources. But scaling an organization requires that the services and products it offers perform at an elite level, and that can only be achieved with robust quality, performance, and reliability standards. With the proper set of conditions, intentional, inclusive, and coherent change is possible.