DROPBOX MAKES EPIC EXODUS FROM AMAZON

If you’re one of 500 million people who use Dropbox, it’s just a folder on your computer desktop that lets you easily store files on the Internet, send them to others, and synchronize them across your laptop, phone, and tablet. You use this folder, then you forget it. And that’s by design.

Peer behind that folder, however, and you’ll discover an epic feat of engineering.

Dropbox runs atop a sweeping network of machines whose evolution epitomizes the forces that have transformed the heart of the Internet over the past decade. And just recently, this system entered a remarkable new stage of existence.

For the first eight years of its life, you see, Dropbox stored billions and billions of files on behalf of those 500 million computer users. But, well, the San Francisco startup didn’t really store them on its own. Like so many other tech startups in recent years, Dropbox ran its online operation atop what is commonly called “the Amazon cloud,” a hugely popular service run by, yes, that Amazon—the world’s largest online retailer. Amazon’s cloud computing service lets anyone build and operate software without setting up their own hardware. In other words, those billions of files were stored on Amazon’s machines, rather than machines owned and operated by Dropbox.

But not anymore. Over the last two-and-a-half years, Dropbox built its own vast computer network and shifted its service onto a new breed of machines designed by its own engineers, all orchestrated by a software system built by its own programmers with a brand new programming language. Drawing on the experience of Silicon Valley veterans who erected similar technology inside Internet giants like Google and Facebook and Twitter, it has successfully moved about 90 percent of those files onto this new online empire.

It’s a feat of extreme engineering, to be sure. But the significance of this move extends well beyond Dropbox. Rather ironically, it highlights how cloud computing is rapidly transforming the way businesses operate. And at the same time, it reveals some enormous changes that have swept the worldwide hardware market over the last ten years.

Dan Williams, Dropbox Infrastructure Manager
Today, more and more companies are moving onto “the cloud”—not off. By 2020, according to Forrester, cloud computing will be a $191 billion market, with giants like Google and Microsoft challenging Amazon with their own cloud services. Amazon, which declined to comment for this story, just reported $2.41 billion in revenue for its Amazon Web Services division during the fourth quarter of last year, or more than $9.6 billion in annualized sales—and that’s pretty much after Dropbox’s move.

But some companies get so big, it actually makes sense to build their own network with their own custom tech and, yes, abandon the cloud. Amazon and Google and Microsoft can keep cloud prices low, thanks to economies of scale. But they aren’t selling their services at cost. “Nobody is running a cloud business as a charity,” says Dropbox vice president of engineering and ex-Facebooker Aditya Agarwal. “There is some margin somewhere.” If you’re big enough, you can save tremendous amounts of money by cutting out the cloud and all the other fat. Dropbox says it’s now that big.

That said, building a network of this size is a ridiculously difficult task. And it’s certainly not for everyone. “The right answer is to actually not do this yourself,” says Urs Hölzle, the former University of California, Santa Barbara, professor who, as Google employee number eight, oversaw the creation of the company’s global network and now helps run its cloud computing services. Most companies, he explains, lack the size and the sophistication needed to reach those economies of scale. And if a company’s growth stalls, a move like this could come back to haunt it. This point is particularly relevant with Dropbox. In recent months, pundits and investors have turned sour on the San Francisco-based company, saying that its $10 billion valuation is all out of whack and that it’s been slow to attract real business customers.

But Hölzle acknowledges that for some companies, the move still makes sense. And at least for right now, Dropbox is one of those companies. According to chief operating officer Dennis Woodside, the company gets “substantial economic value” by running its own operation. The irony is that in fleeing the cloud, Dropbox is showing why the cloud is so powerful. It too is building infrastructure so that others don’t have to. It too is, well, a cloud company. And in moving onto its own vast network, Dropbox is joining giants like Amazon and Google and Microsoft in pushing the worldwide hardware market—and all of information technology—in an entirely new direction.

The Future of the File

Amazon dominates the primary cloud computing market. And its primary competitors are Google and Microsoft. All three offer services that let businesses and independent coders build and run whatever software they want without setting up their own hardware. And all three bring the leverage you’ll only see in the world’s largest tech companies.

Akhil Gupta, Dropbox VP Infrastructure
At the same time, there’s a growing secondary market centered around Dropbox, its arch-rival Box.com, Salesforce.com, Workday, and others. These companies fit into a different niche—offering pre-built software applications over the Internet. Like the bigger companies, they too deliver tools that businesses and developers can use without setting up their own hardware—the essential appeal of the cloud. “The next major era for this industry is a battle of platforms,” says Aaron Levie, the CEO of Box.com. “What are the next platforms that enterprises are going to build on top of?”

Dropbox wants to be one of them. And so it has taken a big chance on building a cloud of its own. But this won’t be easy. The company will face increasing competition from Amazon and Google and Microsoft as they continue to expand into pre-built software. In fact, these giants are already challenging the likes of Dropbox and Box with their own file-sharing tools. And the file-sharing market will likely be less expansive in the future. The sharing of discrete files—standalone photos and videos and Word docs and spreadsheets—is becoming less important. Files aren’t at the center of how we use our smartphones. And with always-on messaging and collaboration services like Slack, the file is becoming less of a focal point on the desktop as well.

Dropbox knows all this. Its enormously high valuation has made it a target for pundits and investors decrying the rise of the “unicorns.” In recent months, no startup has received more heat than Dropbox, with many questioning its ability to compete in the business world against the giants of the Internet. Judging from extensive conversations with executives at the company, it’s clear that Dropbox very much realizes the world is changing. The question is whether—after all the time, money, and effort it’s spent moving itself onto its own global network—its own changes are in sync with where the world is headed.

Start Making Sense

James Cowling knew the creators of Dropbox from his days at MIT. As a graduate student at the university, he focused on distributed systems—computing systems that run across dozens, hundreds, or even thousands of machines—and he studied with some of the earliest Dropbox employees. That’s how he met Drew Houston, the Dropbox co-founder and CEO. As Dropbox grew, they kept in touch, and here and there, they mulled the hows and whys of a Dropbox that could operate on its own, without Amazon. “It seemed a moonshot,” Cowling says.

In 2012, Cowling says, Google—the Internet’s most moonshot-driven company—offered him a spot on the engineering team that oversees Spanner, the global database that drives so much of the search giant’s online operation. Spanner is probably the largest and most complex single database on Earth—one of the most distributed of distributed systems. But instead, Cowling went to work at Dropbox. “I wanted to build something,” Cowling says. Spanner already existed. The Dropbox empire did not.

For most of its existence, Dropbox ran partly on Amazon and partly off. If a bunch of people shared some files via Dropbox, the company stored the files on Amazon’s Simple Storage Service, or S3, while housing all the metadata related to those files—who they belonged to, who was allowed to download them, and more—on its own machines inside its own data center space.

Working alongside vice president of infrastructure Akhil Gupta, an ex-Googler, and others, Cowling designed a sweeping software system that would allow Dropbox to store hundreds of petabytes of data—enough data to fill hundreds of millions of USB thumb drives—and store it far more efficiently than the company ever did on Amazon S3. They called this system Magic Pocket. “Dropbox was envisioned as a place where you keep all your stuff, it doesn’t get lost, and you can always access it,” Gupta says. “A magic pocket.”

James Cowling, Dropbox Storage Team Lead
In essence, they built their own Amazon S3—except they tailored their software to their own particular technical problems. “We haven’t built a like-for-like replacement,” Agarwal says. “We’ve built something that is customized for us.”

Even while Dropbox was still on Amazon, the online retailer was also starting to act as a competitor to Dropbox, offering its own file-sharing service—an obvious concern for the smaller company, though Amazon’s version of this particular service lacks the user-friendliness and sheer brand recognition of Dropbox’s ubiquitous blue folder. But according to Agarwal, the main reason for moving off the Amazon cloud is raw economics—not politics. “You have to think of these large [tech] players as countries—friendly neighbors, though there might be some skirmishes going on here and there,” he says. “Amazon is many things, but I don’t think their primary thing is being a cloud storage provider like us.”

He’d better hope so. Because Dropbox has truly gone all-in. Yes, it created its own software for its own needs. But it also went a step further. The company tailored its hardware as well. Dropbox designed its own computers.

Too Big to Scale

For years, Internet giants like Google, Facebook, Microsoft, and Amazon have designed their own data center hardware—computer servers, networking switches, and, in some cases, hardware for storing massive amounts of data. These companies had no choice but to build all this stuff: Their online empires grew so large that using traditional gear was just too expensive and too difficult. They needed a new breed of hardware that was cheaper, more streamlined, and more malleable. So they built it, working alongside hardware manufacturers and parts suppliers in Asia and elsewhere.

Today, Google builds more servers than almost anyone on Earth—and it doesn’t even sell servers. Much the same goes for Amazon and Microsoft. And since those companies also run cloud computing services, many other businesses are now running their software on machines forged outside the grip of traditional hardware vendors. This is particularly true after Facebook open sourced the designs for its custom-built gear. Now a bunch of vendors, including Asian manufacturers like Quanta, sell stuff that’s based on Facebook hardware.

Rami Aljamal witnessed this movement firsthand. He built this new breed of streamlined machine inside Twitter and at the new DCS arm of Dell—an effort to recapture some of the market the company lost when companies like Google started designing their own hardware. Now, he designs machines at Dropbox. Like Google and Amazon and Microsoft, Dropbox decided it needed machines that fit its unique needs.

Dropbox stores enormous amounts of data, so it needed machines suited to that task. And that’s what Aljamal and his team built, working out of a lab inside Dropbox headquarters in San Francisco just across from AT&T Park, home of the Giants. They call these machines Diskotech. “The thing we care about the most is the disk,” says Aljamal. “That’s where all the bytes are.” Measuring only one-and-a-half feet by three-and-a-half feet by six inches, each Diskotech box holds as much as a petabyte of data, or a million gigabytes. Just 50 of these machines could store everything human beings have ever written.

Changing the Tires

Cowling and crew started work on the Magic Pocket software in the summer of 2013 and spent about six months building the initial code. But this was a comparatively small step. Once the system was built, they had to make sure it worked. They had to get it onto thousands of machines inside multiple data centers. They had to tailor the software to their new hardware. And, yes, they had to get all that data off of Amazon.

Rami Aljamal, Dropbox Engineering Manager
The whole process took two years. A project like this, needless to say, is a technical challenge. But it’s also a logistical challenge. Moving that much data across the Internet is one thing. Moving that many machines into data centers is another. And they had to do both, as Dropbox continued to serve hundreds of millions of people. “It’s like a moving car,” says Dan Williams, a former Facebook network engineer who oversaw much of the physical expansion, “and you want to be able to change a tire while still driving.” In other words, while making all these changes, Dropbox couldn’t very well shut itself down. It couldn’t tell the hundreds of millions of users who relied on Dropbox that their files were temporarily unavailable. Ironically, one of the best measures of success for this massive undertaking would be that users wouldn’t notice it had happened at all.

Once Cowling and crew built the initial code, they tested it on a network of pretty standard hardware—a kind of shadow version of Dropbox that juggled roughly 20 percent of the data that was housed on Amazon. They vowed to test the code for 180 days without finding a major bug, even hanging a countdown clock on the wall at Dropbox HQ. And when a bug turned up after two months—a bug that could have seen data stored in the wrong place—they reset the clock. In all, the testing took eight months.
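The arithmetic of that countdown clock is easy to check. The sketch below is only an illustration of the reset rule as the story describes it, not anything Dropbox ran; the function name and the choice of day 60 for "after two months" are assumptions made for the example.

```python
BUG_FREE_TARGET_DAYS = 180  # from the article: 180 days without a major bug


def total_test_days(major_bug_days, target=BUG_FREE_TARGET_DAYS):
    """Return the day on which the countdown finally reaches zero.

    major_bug_days: days (counted from the start of testing) on which a
    major bug was found. Each one resets the clock to the full target.
    """
    clock_started = 0
    for day in sorted(major_bug_days):
        # A bug found after the clock has already run out no longer matters.
        if day - clock_started >= target:
            break
        clock_started = day  # reset the countdown
    return clock_started + target


# Hypothetical timeline: one major bug at roughly the two-month mark (day 60).
print(total_test_days([60]))  # 240 days, roughly eight months
print(total_test_days([]))    # 180 days if nothing had turned up
```

One reset at two months plus a clean 180-day run comes to about 240 days, which squares with the eight months of testing.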

Confident the system could run all of Dropbox, the team then moved the code on to more and more systems while copying more and more data from Amazon. Its main contracts with Amazon were set to expire in another six months, and the Dropbox braintrust resolved to complete the move by then, so that the company wouldn’t have to re-up. “There was a very short amount of time to open up the parachute,” Cowling says.

Just getting the bits out of Amazon and into other data centers was an epic task. Digitally moving petabytes of data from one machine to another isn’t exactly on the same scale as downloading a few songs for your laptop. Even the fattest Internet pipes only have so much bandwidth. Transferring four petabytes of data, it turned out, took about a day. “You’re restricted by the speed of light,” Agarwal says.
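To put that figure in network terms, here is a rough back-of-the-envelope sketch. It assumes decimal petabytes (a million gigabytes each) and a perfectly steady transfer rate, which real links never deliver; the 500-petabyte total is a made-up stand-in for "hundreds of petabytes," not a number from Dropbox.

```python
PETABYTE_BYTES = 10**15          # decimal petabyte: a million gigabytes
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_gbps(petabytes_per_day):
    """Average throughput, in gigabits per second, for a daily volume."""
    bits_per_day = petabytes_per_day * PETABYTE_BYTES * 8
    return bits_per_day / SECONDS_PER_DAY / 10**9


def days_to_move(total_petabytes, petabytes_per_day=4):
    """Days needed to copy a data set at a constant daily rate."""
    return total_petabytes / petabytes_per_day


print(round(sustained_gbps(4), 1))  # about 370.4 Gbps, around the clock
print(days_to_move(500))            # 125.0 days for a hypothetical 500 PB
```

Four petabytes a day works out to roughly 370 gigabits per second, sustained day and night, which is why a migration of this size is measured in months.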

Meanwhile, computers must be moved into data centers and set up to receive all those bits. Picture the IT guy in your office trying to set up a new employee’s computer—but on the scale of Dropbox. And all that physical effort came with a time limit. If they couldn’t get the systems into the data centers fast enough, they couldn’t get the data off of Amazon fast enough. The company was installing forty to fifty racks of hardware a day, each rack holding about eight individual machines. At one point, they were slowed by some ill-timed crashes—and not the computer kind. In one twenty-four hour period, two trucks carrying machines to Dropbox data centers in different parts of the country both had accidents.
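For a sense of the pace, the installation figures can be multiplied out. This is a small sketch using only the numbers quoted above; the constant and function names are invented for the example, and "about eight" machines per rack is treated as exactly eight.

```python
MACHINES_PER_RACK = 8      # "about eight" per the article, taken as exact here
RACKS_PER_DAY = (40, 50)   # low and high ends of the reported range


def machines_per_day(racks_per_day, machines_per_rack=MACHINES_PER_RACK):
    """Machines installed in a day at a given rack rate."""
    return racks_per_day * machines_per_rack


low, high = (machines_per_day(racks) for racks in RACKS_PER_DAY)
print(low, high)  # 320 400
```

That is somewhere between 320 and 400 machines racked and wired every day.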

Despite those accidents and everything else, Dropbox made its deadline. And it dropped those contracts with Amazon. The company continues to use the Amazon cloud in Europe—just because the business is growing in a less predictable way in Europe—but Gupta and team had moved ninety percent of all files into Dropbox data centers. And then came the really extreme engineering.

Okay Go

As all that data streamed off the Amazon cloud, hardware engineer Rami Aljamal pow-wowed with a coder named Jamie Turner. Magic Pocket—Dropbox’s version of Amazon’s file-storage system—was still running on run-of-the-mill machines. The next step was to move it onto the company’s custom-built hardware. Aljamal and Turner, an English major turned engineer who is now a veteran of multiple tech startups, joined forces to ensure this new hardware dovetailed with the software. Aljamal and his hardware engineers designed a single machine, Diskotech, that could hold a petabyte of data. But there was a problem. The Magic Pocket software didn’t quite fit this new hardware. So Turner rebuilt Magic Pocket in an entirely different programming language.

Michele Sordal, Dropbox Supply Chain Manager
That may seem odd. Why put the code onto thousands of machines only to change the code and put it onto thousands of other machines? But in the largest Internet data centers, this is just how things work. Machines get old quickly. Parts fail constantly. And then you replace them. You’re always upgrading what you have. First, Dropbox made sure that Magic Pocket ran on ordinary gear—which was hard enough. Then it honed its hardware. Then it had to make sure the two worked well together.

Cowling, Turner, and others originally built Magic Pocket using a new programming language from Google called Go. Here too, Dropbox is riding a much larger trend: languages designed specifically for the new world of massively distributed online systems. Apple has one called Swift, Mozilla makes one called Rust, and there’s an independent one called D. All these languages let coders build software quickly that runs quickly—even when executed across hundreds or thousands of machines.

Jamie Turner, Dropbox Software Engineer
But Go’s “memory footprint”—the amount of computer memory it demands while running Magic Pocket—was too high for the massive storage systems the company was trying to build. Dropbox needed a language that would take up less space in memory, because so much memory would be filled with all those files streaming onto the machine. So, in the middle of this two-and-a-half-year project, they switched to Rust on the Diskotech machines. And that’s what Dropbox is now pushing into its data centers.

Facing the Danger

It is extreme. But now that companies like Google and Amazon and Dropbox have gone through this kind of thing, most others don’t have to. That’s the power of cloud computing. No, Dropbox isn’t Google or Amazon. It doesn’t offer raw computing power and infrastructure that lets coders and businesses build and run any software they like. But it does let individuals and businesses share and store files without setting up dedicated hardware—which, as businesses grow, becomes harder and harder for them to do. Sharing, the company hopes, will become a platform. That’s why Dropbox has created an online text editor and collaboration tool called Dropbox Paper. Outside developers, from Microsoft on down, can plug their own apps into its service as well.

The danger is that as Amazon and Google and Microsoft expand their own services, they will restrict the growth of Dropbox. In that case, the company’s move into its own data centers could become more of a burden than a blessing. Famously, when San Francisco social gaming company Zynga reached its own hypergrowth phase, the company moved off of the cloud and into its own data centers. But then its business imploded, and it was left with infrastructure it didn’t really need. It’s now back on Amazon.

For Dropbox, one advantage is that people like Agarwal and Gupta and Williams and Sordal have all played the game before, and they’ve played it at the companies who play it best. Dan Williams says there’s a buzz that comes from this extreme engineering. “If you’ve experienced anything in your past like a Facebook or a Google, you sort of get addicted to that hypergrowth,” Williams says. “You miss it when you don’t feel it.”

That’s not an empty thing. It’s a buzz that can save a company millions upon millions of dollars. But like any addiction, this one comes with its own perils. It can lead to what those in the Valley call Not Invented Here Syndrome, where companies start building all sorts of new stuff just because they’re intent on building all sorts of new stuff.

Whether it creates the kind of business Dropbox is hoping to build, or it just ends up as a huge engineering high, the company now has its own invention. Dropbox has built its own box. This represents an attitude that began with Google and has gradually spread across Silicon Valley. Google was so successful not just because it built a pretty good Internet search engine, but because it built the underlying technology needed to run that search engine—and so many other services—at an enormous scale. Facebook, which recruited countless ex-Googlers, did much the same. And so did Twitter and its ex-Googlers. And, now, so has Dropbox.

Giant Dreams Require Giant Efforts

To become a giant, you may have to stand on the shoulders of others. But once you become your own giant, you start to feel like you need to build a home that’s just right for you.