Distributed Teams and Agile
Craig Knighton, Marcato Partners, http://marcatopartners.com
Principles and Practices
My career in software development has had many twists and turns, but looking back on it there are definitely a few key moments in which small decisions turned out to have lasting consequences. One such moment was the day back in 2002 when our CTO told me that during the last board meeting we had been asked to develop a plan for taking some portion of our development offshore. Most of my development experience had been working with other engineers that were in the same building, so it was hard for me to imagine what it would be like to create and sustain a distributed development team.
Since that time I have worked with many successful distributed teams with team members from Duluth, Minnesota to Guangzhou, China. My most recent team had team members in Minneapolis, Houston, Denver, and Minsk, Belarus. In the past, I've also worked with teams in Chile and India - while each experience is different, there are many common elements that contribute to the success of the effort. In hindsight, I can now see that our early decision to continue to use Agile methods to manage the flow of work to and from these teams was instrumental to that success, but at the time we were not nearly so confident that it would work.
When you are looking at your first distributed project, the most natural reaction is to try to figure out how to control the outcome through documentation. It's reasonable to assume that communication will be challenging (especially with other countries in different time zones) and that the risks are high, so you use a contract and detailed requirements and design to capture exactly what you want and your expectations for delivery. The road is littered with unsuccessful projects that begin this way, yet the distributed Agile teams we have worked with have all been successful. This seems counter-intuitive, yet the proof is in the results. The goal of this article is to share our experiences and the model for organizing and operating distributed Agile teams that evolved from these efforts, but the main message is much simpler - Agile is not only possible with distributed teams, but in all but a few situations, it is the BEST way to lead a distributed team.
Principles of Success
Instead of diving right into the execution details, let's first step back and look at the principles that should drive our structure and operation. Once we take the time to do this, you should find that you have the understanding needed to fill in the gaps with your own decisions rather than blindly following our specific recipe. Here are the principles that we found led to success - let's take some time to discuss each:
- Divide by feature, not function
- People are the same everywhere
- Individuals and interactions over processes and tools
- Choose team leads carefully
- Treat vendors as partners
- Invest in the distributed lifestyle
Divide by Feature, not Function
If you are looking at an organization chart with a typical reporting structure, then there is a natural tendency to consider distributing along the boundaries of these functional teams (product management, development, testing, technical writing, etc.). Outsourcing vendors will tell you that this will work fine, but while this view of your organization captures the reporting structure, it actually has very little to do with how work flows within your teams and how commitments form and get delivered.
Tempting as that is, we chose instead to divide along the lines of cross-functional feature teams. The most important goal is to group developers and testers together, but we also had great success distributing detailed story/requirements development to the remote team.
People are the Same Everywhere
Another possible structure for distributing work (especially when using outsourcing) is to divide along the boundaries of maintenance and support for existing legacy products versus strategic investment in new products and technologies. Your local team will compete vigorously to retain whatever work they view as interesting or exciting and will gladly let go of the rest. While this approach seems politically expedient, it can lead to frustration throughout the organization when the remote teams lack the expertise and proximity to the support organization needed to provide the expected timely response.
Turnover within a team is a real issue no matter where the people are - the single most significant expense is in building the expertise of a team member in the domain and code base. If you distribute the unappealing work to other locations, then you should assume that you will have retention issues there as well. Instead, assume that every member of your teams, no matter where they are located, needs the same things - interesting and meaningful work and an understanding of the impact their work will have on the company as a whole.
Individuals and Interactions over Processes and Tools
One of the central values in Agile is "Individuals and Interactions over Processes and Tools" - if you look at how this value gets applied in methodologies such as XP or Scrum, what you see is that practices such as co-location, pair programming, and daily scrum meetings all serve to heighten the frequency and quality of team interactions. Given that this is so central to success with Agile, is this a fatal flaw to applying Agile to distributed teams?
In short, no - many of the other principles help to compensate for the separation in space and time. While Agile certainly works best with everyone on the team in the same room, we will show you later in the article what you need to do to extend this same high quality communication rhythm to a distributed team. Rather than being a detriment, staying true to this Agile principle is the key to managing a successful distributed team.
There is a time during team formation when it's a good idea for team members to meet and get to know each other face to face. We chose to travel to Chile and Belarus at the beginning of those relationships to build rapport within the team, and because it was a long term relationship we also alternated having team members travel between the geographies every 6 months or so. We also frequently use web cameras during discussions and sprint reviews just so we can all see each other. We'll discuss this in more detail below when we describe the distributed lifestyle.
Choose Team Leads Carefully
There is nothing more poisonous to any project, distributed or not, than team leaders that are not invested in making the structure work. Project work is always demanding, so if leaders are looking for an excuse to fail, they will find an easy excuse in blaming the remote teams for not doing their part. Instead, what you want to find are people that enjoy the experience of working through others to get work done and who believe they can make it work. You also need leaders like this in each locale, so having fewer groups with larger teams improves your odds that you can find the leadership talent you need.
Treat Vendors as Partners
Assuming that part of your distributed team includes using vendors in other geographies, this organizational divide reinforces basic trust issues that are naturally present in all companies. Informal and open ended agreements are hard to structure inside the boundaries of a company; if you now try to extend that to your vendor relationships, you can just imagine how your legal team will react.
Typical vendor contracts steer you towards building specific statements of work for each deliverable. The more specific the terms, the more likely it is that both sides of the contract will start to require more detail to be captured before the commitment is made, and before you know it you are back to "big upfront design". Instead, look to structure your statements of work around your iterations, with scope and acceptance built into the natural iteration process.
Invest in the Distributed Lifestyle
The career of a software developer has many possible paths - some choose to remain "technical", while others pursue leadership opportunities. Regardless of the path you have in mind, I am reminded of an important economic truth mentioned in "Practices of an Agile Developer":
"Machines and CPU cycles used to be the expensive part; now they are commodity. Developer time is now the scarce - and expensive - resource." (1)
This simple fact has been true for at least 5 years now, and it has changed just about everything. Technical expertise now requires a high degree of specialization in order to justify the cost of you as a resource. You will have to work harder and train longer to achieve a level of proficiency that you can sustain over a long period of time.
This also leads to more mobility - companies may only need your particular specialization for a period of time. If your contribution is "just" project management, product management, coding, or testing, then it is likely that those skills can be acquired somewhere else at a cheaper price. There are, however, several very durable skills that companies will always value: domain expertise, leadership skills, and process development to name a few. Regardless of what you do, if you are able to position yourself as capable of delivering what a company needs in a cost efficient manner, then you'll have all the job security in the world. You just have to be prepared to listen to what they need and deliver it on their terms, not on your own.
In a recent meeting with a venture capitalist, I was surprised to learn that he considered the use of distributed teams in low cost countries (aka offshoring) a done deal. In his words: "I think that there are now enough people that know how to make it work that if I were to start up a new venture, I can assume that you would choose to use offshore resources from the very beginning."
That's a huge change from when this phenomenon first bumped into me about 10 years ago. At that time, it was a board-led dictate that we begin using cheaper offshore resources. It's a simple question of competition - when your competitors are able to spend the same percentage of revenue on development but can engage 2 to 3 times as many resources…I'll let you do the math. Yes, productivities differ, but not in the long run (6-12 months), and while you may be willing to work harder for a while to try to match pace, this also is not sustainable in the long run.
If you, like me, assume that this shift in development demographics is an irreversible reality, then the next step is to imagine how your work will change to accommodate this reality. Here's a description of a typical day in our world.
A Day in the Distributed Life…
When I show up at the office at around 7 am, there are usually one or two people already there. These are the distributed team leads. I walk by their offices and wave but do not interrupt, as this is their quality time. We are working with teams in Eastern Europe and have 2-3 hours of overlap every morning. The remote folks have also adjusted their schedules and tend to arrive later in the morning and to work further into the evening (6 to 7 pm local time is typical).
The leads have their headsets on and are most likely talking to their liaison in the remote team - we use Skype or ooVoo and cheap cameras and headsets. Everyone is set up on Skype and Instant Messenger as well and we always have it running when we are working. Larger team meetings also include multiple cameras at different sites around the world and GoToMeeting or Webex for desktop sharing. With all of these technologies available all of the time, impromptu questions or issues can be quickly dispatched as if the other person is in the cube next to yours.
Routine is also important - regular scrums at the same time of day or larger group meetings on the same time and day really help communication flow.
If today happens to be Sprint review day, then everyone on all of the teams knows that we meet starting at 8:30 am in a virtual conference. Since the entire infrastructure for participating is there and remote access is pervasive, I might decide to sleep in and attend via phone, Webex, and camera. If I'm travelling, then I can still manage my schedule to be somewhere where I can participate.
What do the local team leads and members do? As already mentioned, a significant part of their day is spent in communication. They also review code, make architectural and design decisions, and often write code themselves. Once their remote team is done for the day, they still have half a day for their own use, and depending on their particular interests, they can use it as they see fit. Their job is to ensure that the team is successful - as leaders in a self-directed team, they are entitled to do this any way that works for them.
This is the trade-off I hoped to communicate - companies have to trade off flexibility on their part for flexibility on yours. If you need to be available to your remote team anytime and anywhere, then they need to give you the resources to do that (and they can afford to from the money they are saving). Once you have that infrastructure in place, you can participate at any time or place. You may realize that you only need to be in the office a few days a week - more power to you! You may head home from the office after lunch because you have a school function to attend but then you'll be back on the phone with your team that evening to wrap up a few project deliverables.
It's been a while since I made this transition myself, but I still remember it well. You may think at first that you have to try to do two jobs at once, but you'll quickly figure out that this is not sustainable. Invest your time instead in getting the infrastructure and flexibility you need, build a sustaining rhythm with the remote team, and then make sure you scale back your total hours to a reasonable amount of time. You should end up working a similar number of hours in a day, maybe just not all in a row or at the same place.
I'd like to wrap up by reiterating one final thought - your value in this new world comes not from what you do with your own two hands but from the value you liberate and the leverage you create by enabling 10 hands to work on your behalf. Once you get to see this in action, you will find that it is just as satisfying to see your ideas become reality through your team's efforts AND the dirty secret is that it's a lot less work! You really can ask for something before you go home at the end of the day and find it waiting for you when you get to work the next day. The people I have seen that latched onto this challenge and made it work for them have learned a skill that is valuable in any software organization in the world. Now THAT is job security.
Creating and Scaling the Distributed Agile Organization
As was mentioned earlier, we approached our first experience with distributed development with a fair amount of skepticism. We were concerned about what this meant to how we organize, how we manage delivery schedules, and what we would see for work quality. We chose to partner with an existing business representative in Chile; this team had shown the ability to use our tools to customize our reference application, so we knew that they could do the technical work, but they had no experience with the processes and tools needed to make offshore successful.
Fast forward to today; we've had a great strategic partnership for the last couple of years with an outsourcing firm based in Minsk, Belarus and we are in the process of transitioning from working with them to an even larger team that is part of our new merged company based in Bangalore, India. We are now confident in our use of Agile; the methodology has proven excellent at managing the pace and communication needs when interacting with distributed teams. We've also figured out the structure, roles, and skillsets needed to make it work and we now have roughly two-thirds of our head count offshore. This article will focus on explaining the possible structures, those that we tried along the way, and finally the structure that has proven most productive for us.
We have over the years done a lot of experimenting with different organizational patterns. One possibility is to outsource a "function" such as product management, coding, project management, manual testing, architecture, or automated testing. I've seen organizations conclude that certain functions are less strategic to their organization, so they choose that function for outsourcing. Management literature recognizes this approach as a sound strategy, but to choose which functions are less strategic, you need to make certain value judgments about the work that these functions perform.
We have avoided this option for a couple of reasons - first, it doesn't scale very well. If you decide to only outsource testing, then you are inherently limited to only saving money on that portion of your R&D spending. If we assume a typical 3:1 to 5:1 developer to tester ratio, then developers make up 75% to about 83% of the combined headcount, so this strategy can leave you with no lower cost option for the large majority of your resources. My second reason relates to the process and organization - we are firmly opposed to separating the test resources from the developers in the feature team. We use testers as an "in line" participant that works side by side with the developers to design and test the solution. The natural barriers to accomplishing this kind of cooperation are only reinforced if there is a geographic boundary separating the teams. Since the testers need to belong to the team and sit right next to their lead and fellow developers, this has never been an attractive option.
From here there are numerous details to consider - can requirements, development, and testing be done by remote teams? How will final product acceptance be done? How will we divide the resources? Should some teams be local and others remote, or should every team have some functions local and others remote?
These decisions form a crossroads for your engineering organization. I've seen a lot of energy go into rationalizing and justifying one plan over another, and it is certainly true that any decision made here is subject to strong political influence. Rightly so, as quite literally the future of some people's jobs lies in the balance, either because the job may be eliminated or the responsibilities change to the point that they may not choose to stay.
In the simplest and coldest of terms, every organization is looking to minimize the cost associated with product development and maintenance. There are certainly situations in which the product strategy of the company helps to justify the decision to work only with a local and more expensive team, but while most companies start with this in mind as they race to get their initial product to market, few companies choose to pursue this strategy in the long run.
The net result is that we use a hollow local organization composed only of the following roles:
Senior Product Owner - the product manager that is responsible for the product or product lines; this person works with the director to negotiate release schedules and scope and writes the high-level stories (often epics) that drive the release content. This person may also have technical product owners that are embedded in the feature teams to provide local expertise and the additional detail needed by the team.
Director - this person is responsible for setting the overall rhythm and sustaining the process; this person schedules and moderates the Sprint reviews, estimation meetings, and Scrum of Scrums and works with the product owner(s) to determine the release rhythm, set release scope and schedule, and make sure there is always sufficient product backlog ready for each Sprint.
Feature Team Lead aka ScrumMaster - this person is responsible for the work product and for all communication with the remote feature team. This is usually a senior developer that has also made the transition to a leadership role; they drive the architecture and design, review code, and manage the flow of stories into the team throughout the Sprint. This person is responsible for coordinating all team functions including kickoffs, daily scrums, and retrospectives although they may offload some or all of this responsibility to the remote team leads.
Customer Response Team Lead - this person is responsible for triage and resolution for all immediate reaction and customer driven work that cannot be handled by the normal release rhythm. They may also provide the configuration management function for the team as they sit at the nexus of multiple code branches and are responsible for back- and forward-port activity needed to keep these branches functionally complete.
That's about it. Everything else, including the stabilization lead, is a candidate for outsourcing. In fact, let's dwell for a moment on one role that may be a surprise to you - technical product owner. First, to clarify - this role is responsible for converting epics determined by the senior product owner into digestible stories that are granular enough to fit within the Sprint interval. This person needs a detailed understanding of the existing product and will produce detailed user interface designs and requirements based on the overall direction of the senior product owner. They need to understand the users of the system, but they do NOT need to understand overall market direction or demand.
When we first formed this team structure, the most common complaint during our Retrospectives was that the developers did not have enough access to the product owner. We could have opted to try to convince marketing to spend more money to hire another local product manager, but instead we opted to use development funds to bring on staff a technical product owner on site with the remote team. The difference was phenomenal - not only did we start receiving much better defined stories, but the remote teams had access to a decision maker that also understood the details of the requirements they had defined. The local product owner was part of the daily scrum and would review progress on the definition of detailed stories and answer any questions that had been escalated overnight by the remote team, but over time the remote product owner became confident enough that she could handle these questions without help and just get her decisions confirmed by the local product owner later.
Now let's take a look at this from the point of view of your board or CEO. Assuming that your organization scales somewhere from 15 to 100 FTEs, you are going to see the following distribution of headcount between local and remote resources:
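[Table: distribution of local and remote headcount by Total Size (in FTEs); the table contents were not preserved in this copy]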
As you can see, most of the real leverage comes from scaling the team above 10 people; I have a hard time recommending the use of low-cost distributed teams for product organizations with fewer than 15 people, as you are not getting a lot of leverage from the onshore staff. I've found that offshore cost efficiencies of at least 2.5 to 1 are very reasonable to expect, and in some places you can do even better than that. Using even 2 as a very conservative estimate of the cost multiplier you will experience for remote staff, let's look at the two extremes and summarize the leverage both as a potential cost savings and as increased head count for the same cost. In the case of an original local team of 10 people, you can either reduce your costs by 30% or keep your spending the same and increase your headcount to 16 people. At the other extreme, if you are comparing to a local team of 90, you can either reduce your cost by 40% or increase your headcount to 160 people! Of course, you are most likely going to choose some compromise between cost savings and increased headcount, but either way the business case is compelling.
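The arithmetic behind these two extremes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, costs are measured in units of one local FTE, and the number of people who stay local (4 for the team of 10) is my own inference from the quoted figures rather than a number taken from the original table.

```python
def savings_at_same_headcount(total, local_staff, cost_multiplier=2.0):
    """Fraction of cost saved if `total` people are kept but only
    `local_staff` remain local; the rest cost 1/cost_multiplier each."""
    remote_staff = total - local_staff
    new_cost = local_staff + remote_staff / cost_multiplier
    return (total - new_cost) / total


def headcount_at_same_cost(total, local_staff, cost_multiplier=2.0):
    """Total headcount if spending stays at `total` local-FTE units and
    everything not spent on local staff buys remote staff."""
    remote_budget = total - local_staff
    return local_staff + remote_budget * cost_multiplier


# Team of 10 with an assumed 4 local roles and a cost multiplier of 2
print(savings_at_same_headcount(10, 4))   # 0.3, i.e. a 30% reduction
print(headcount_at_same_cost(10, 4))      # 16.0 people for the same spend
```

With the same multiplier, an assumed local staff of roughly 18 to 20 for the team of 90 gives a saving close to 40% and a same-cost headcount close to 160, which is consistent with the figures quoted above. Raising the multiplier to 2.5 only strengthens the case.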
This analysis assumes the use of an outside contractor organization to provide the resources - part of the efficiency comes from the fact that they are responsible for all aspects of compensation, performance, and benefits. As a result, your much leaner local organization is focused almost exclusively on determining what will be built and architecting, accepting, distributing, and supporting the final work product.
Location, Location, Location….
The most important part of making an offshore relationship work is the employees within your company that assume responsibility for the daily communication, work direction, and architectural guidance of the remote team. As we've described before, these people are allocated to the feature team in about a 6:1 ratio (I think there are diminishing returns at about this point, but depending on the skill of the individual you may be able to push this to 8:1) and they are the conduit for ideas, directions, and decisions flowing in both directions. In short, they are responsible for the quality of the ideas and the work product and are the life blood of a distributed Agile team.
For these people, flexibility is key. To be successful, they will need to find a new lifestyle that is different from the typical 9 to 5 working hours. This is possible, if not mutually rewarding, as long as you let them figure out what is going to work best for them and their team. It's a sure sign of trouble if you see them coming in to talk to the team early in the morning and they are still there at 5 pm every day - you need to encourage them to find a sustainable lifestyle.
Sustainability is essential - when it's new and fun and the project is just getting underway, it's energizing to be doing something different, but this is when the bad habits start. They schedule meetings early in the morning and late at night and think that they also need to be there in the office all day to stay connected with everyone else. In short, this is not possible to sustain - they need to adjust to become something of a remote employee themselves. The closest comparison might be a regional sales force, and just like any company with a remote sales force knows, you need to create regular communication channels and provide the telecommunication infrastructure for them to work anytime, anywhere.
Let's assume you do all of these things right and you now change only the location of the remote team or teams - you would expect the same result, right? I recently experienced watching just that situation play out as we switched from a long term sustainable structure to one with portions of the team on the west coast of the US and other portions in India. Where the leads had been able to maintain a stable communication overlap of 2-3 hours per day in a regular time period, they now had to work hard to juggle schedules on almost a daily basis to find ways to communicate. We would have meetings until midnight and then need to be in the office again by 7 the next day. The team thrashed its communication methods and times constantly to share the inconvenience roughly equally between all involved.
I've heard of other teams that try different alternatives - the offshore provider will say they can solve this problem for you and will likely propose one of two things: they will put one of their employees onsite to be the liaison or they will tell their team in the remote geography that they need to work strange shifts to accommodate their client.
Now, I've been fortunate enough to work with many people in many countries, and so far I have to tell you despite cultural, language, and economic differences, they are all just people. There is no reason to think that these people want to live a lifestyle that you would find intolerable. Instead, in the first alternative you only manage to insert another communication layer involving someone with less knowledge of your product and an increased likelihood of churn because they are not your employees. In the second alternative, you will experience a flight of quality as the most talented people in your remote team look for alternatives that will give them a sustainable lifestyle.
My conclusion? Despite years of wanting to believe that I could make any situation in any geography work, I now believe that the best predictor of long-term success in a distributed team is location. If you can't create stable communication patterns, then it's only a matter of time until the circumstances deteriorate. If you are only in it for the short run (3-6 months), then it probably doesn't matter, but if you want to be able to retain quality employees at both ends of the distributed chain, then you need to make sure that they have at least 2-3 hours of overlap every business day. While other locales may have raw labor cost advantages, geographies such as South America, Eastern Europe, and someday soon Africa will continue to have a structural advantage in providing lower cost, distributed Agile teams.
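A quick way to check a proposed pairing of locations against this 2-3 hour rule is to compute the daily overlap of the two working windows. The sketch below is only an illustration: the function name, the working hours, and the fixed UTC offsets are assumptions chosen for the example, and it deliberately ignores daylight saving changes.

```python
def overlap_hours(start_a, end_a, utc_offset_a, start_b, end_b, utc_offset_b):
    """Daily overlap, in hours, between two local working windows.

    Times are local hours on a 24-hour clock (9.5 means 09:30) and the
    offsets are hours east of UTC. Each window must be shorter than 24 hours.
    """
    # Express both windows on a common UTC timeline.
    a_start, a_end = start_a - utc_offset_a, end_a - utc_offset_a
    b_start, b_end = start_b - utc_offset_b, end_b - utc_offset_b

    total = 0
    # The schedule repeats every day, so also compare against the other
    # team's window on the previous and the following day.
    for day_shift in (-24, 0, 24):
        latest_start = max(a_start, b_start + day_shift)
        earliest_end = min(a_end, b_end + day_shift)
        total += max(0, earliest_end - latest_start)
    return total


# Assumed schedules: US Central 7:00-16:00 (UTC-6), Minsk 10:00-19:00 (UTC+3)
print(overlap_hours(7, 16, -6, 10, 19, 3))        # 3

# Assumed schedules: US Pacific 9:00-17:00 (UTC-8), India 9:30-18:30 (UTC+5:30)
print(overlap_hours(9, 17, -8, 9.5, 18.5, 5.5))   # 0
```

Under these assumed schedules the first pairing gets a stable three-hour window every morning, while the second has no natural overlap at all, so every conversation has to come out of someone's evening.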
Tools and Techniques
If you are new to Agile, then one of the hallmarks of learning Agile is that you are taught to use simple tools and techniques to emphasize the interactions within the team. Daily standup meetings, story cards, task boards, agile estimation - these ceremonies are all designed to bring people together around simple visuals and to simplify the mechanics so that more time is spent as it should be - working together and solving problems as a team.
While it is possible to continue in this way with a distributed team, my experience is that more formality is needed earlier in order to support effective team communication. If I have a team in a significantly different time zone, then usually that team holds their own daily scrum at the beginning of their day and then I will scrum with any other US-based resources and the team lead at a mutually acceptable time later in their day.
You will also find that you want to select and deploy one or more Agile tools designed to support the full application lifecycle needed by your team. While reviewing the pros and cons of each goes beyond the scope of this article, there are several good alternatives from which to choose. At a minimum you will want either a suite or best-of-breed collection that provides the following capabilities:
- Version Control / Configuration Management
- Continuous Build
- Project Management / Tracking including product and sprint burndown charts
- Requirements Management
- Defect Tracking
- Automated Testing
- Unit Testing
I'm currently working with an Agile team where the team members that participate in estimation are in 4 different cities around the world. We started the process by estimating in our heads (or writing it down on something) and then we would randomly pick someone to go first, but we always felt like this approach tended to anchor everyone in whatever the first one or two people said as their estimate, and I was never sure if people were really communicating their first impression.
I was thinking that the ideal tool would be an online application that would allow people to vote and then reveal all of the votes together - and it turns out that such a tool has existed for some time: http://www.planningpoker.com
The site is maintained by Mountain Goat Software and uses their modified Fibonacci sequence. We now run our estimation meetings on that site, and we also use GoToMeeting to share a desktop showing the story detail. We like this as the discussion almost always clarifies assumptions before the vote and it's nice to capture the important parts of that dialogue directly in your story in the system of record.
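The key property of this style of estimation is that nobody sees a vote until everyone has voted, which is what removes the anchoring problem. A minimal sketch of that rule follows; it is not how the site itself is implemented, the class and method names are hypothetical, and the deck shown is the commonly cited modified Fibonacci sequence, included here as an assumption.

```python
# Commonly cited modified Fibonacci deck (an assumption for this sketch)
DECK = (0, 0.5, 1, 2, 3, 5, 8, 13, 20, 40, 100)


class EstimationRound:
    """One round of estimation for a single story."""

    def __init__(self, participants):
        self.participants = set(participants)
        self._votes = {}  # kept private until every participant has voted

    def vote(self, who, points):
        if who not in self.participants:
            raise ValueError(f"{who} is not part of this round")
        if points not in DECK:
            raise ValueError(f"{points} is not a card in the deck")
        self._votes[who] = points  # a later vote replaces an earlier one

    def reveal(self):
        """Return all votes at once, but only when nobody is missing."""
        missing = self.participants - set(self._votes)
        if missing:
            raise RuntimeError(f"still waiting for: {sorted(missing)}")
        return dict(self._votes)


def needs_discussion(votes):
    """True when the revealed estimates differ and should be talked through."""
    return len(set(votes.values())) > 1
```

In use, each person in each city calls `vote` privately, the moderator calls `reveal` once the round is complete, and a spread in the results is the cue to discuss assumptions and vote again.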
There are similar tools available for capturing stories and managing storyboards. Here are a few alternatives to consider if you are interested:
- Digaboard: http://www.digaboard.net
- Pivotal Tracker: http://www.pivotaltracker.com
- Scrumy: http://www.scrumy.com
- More at: http://www.opensourcescrum.com
You will also find online graphical story boards that are either built-in or available as add-ons to each of the mainstream Agile tools.
Agile and Scrum provide a concrete set of practices and ceremonies that support a set of core values designed to improve the likelihood of delivering commercially successful software. Although each group that adopts Agile tends to modify those practices as they see fit, it is hard to know when you cross that line to where the system is no longer working as intended.
After years of using the model described in this article, our conclusion is that distributed Agile does work and that if you spent a day working with one of these teams you would see the same high-functioning, healthy teamwork as you see in any co-located team. Because of this, we are confident that extending Agile to distributed teams can be done in a way that remains true to the values and practices of Agile. We also believe that once you have used Agile to manage a distributed team you will agree that it is a great way to increase the likelihood of success with that team.
(1) Subramaniam, Venkat, and Andy Hunt, Practices of an Agile Developer: Working in the Real World, Pragmatic Bookshelf, 2006, p. 36.
Copyright © Marcato Partners, LLC, All Rights Reserved, August 10, 2010