Stand-ups, which Mike Abbott says became a standard part of his playbook at Twitter, are Silicon Valley-style meetings where everyone usually stands rather than sits and works through a problem or a set of problems, fast. Then everyone disperses, acts and reports back at the end of the day at a second stand-up. Dickerson held the first one on Oct. 24. He would convene them every day, including weekends, in October and November, at 10:00 in the morning and 6:30 in the evening. Each typically ran about 45 minutes ("causing some of us to sit down," Dickerson concedes). An open phone line would connect people working on the website at other locations; in fact, the open line would remain live 24 hours a day so that everyone could immediately talk to the others if an issue suddenly came up.
Dickerson quickly established the rules, which he posted on a wall just outside the control center.
Rule 1: "The war room and the meetings are for solving problems. There are plenty of other venues where people devote their creative energies to shifting blame."
Rule 2: "The ones who should be doing the talking are the people who know the most about an issue, not the ones with the highest rank. If anyone finds themselves sitting passively while managers and executives talk over them with less accurate information, we have gone off the rails, and I would like to know about it." (Explained Dickerson later: "If you can get the managers out of the way, the engineers will want to solve things.")
Rule 3: "We need to stay focused on the most urgent issues, like things that will hurt us in the next 24-48 hours."
The stand-up culture (identify problem, solve problem, try again) was typical of the rescue squad's ethic. They worked stretches of three or four days during which they might have had five or 10 hours of sleep cumulatively, often changing clothes only when they made a shopping trip to the nearby mall. They and the dozens of willing, even eager, engineers they led, who worked for the contractors who had failed so badly to lead them in the run-up to Oct. 1, pounded away on the bugs that Dickerson had demanded they identify every morning, focus on and clear up in time for the evening stand-up. They began to sweep across increasingly big swaths of their punch list.
Well, actually, they hummed along happily for less than three days, until the whole site crashed at 1:20 a.m. on Sunday, Oct. 27, two days after Zients had announced that all would be well by Nov. 30. A switch had failed during maintenance work at a data center. The outage lasted 37 hours, during which Dickerson and his team could do little because they had no website to look at.
Then, two days later, at 4:00 p.m. on Oct. 29, it went down again because of a malfunction in a data-storage unit. This outage lasted 40 hours, including the afternoon of Oct. 30, when HHS Secretary Sebelius testified about the website's troubles before a loaded-for-bear House of Representatives subcommittee, whose majority Republican members flashed images on their tablets and iPhones of the website being down as they questioned her. "In her testimony Ms. Sebelius came across as a hapless official," the New York Times reported. "Those outages were totally demoralizing," says Burt. "We thought we were on our way. We had gotten some momentum but lost it."