Tuesday, 18 October 2011 22:01

How Yahoo Spawned Hadoop, the Future of Big Data

Eric Baldeschwieler, aka Eric14, CEO of Hortonworks

The email went to Eric14. His real name is Eric Baldeschwieler, but no one calls him that. At fourteen letters, Baldeschwieler is a mouthful, and he works in a world where a name takes a backseat to an online handle.

The sender was Rob Bearden, a serial entrepreneur from Atlanta, Georgia, famous for actually making money from open source software. He e-mailed Baldeschwieler because he was looking to build a new company around what is widely regarded as The Next Big Thing in corporate computing. The irony is that Baldeschwieler worked for an outfit few would associate with enterprise technology. And if you listen to the pundits, it wasn’t a technology company at all. He worked for Yahoo.

The two met for dinner at a Vietnamese restaurant in Palo Alto, California, just down the road from Yahoo’s Sunnyvale headquarters. By the end of the meal, they agreed that Yahoo — yes, Yahoo — held the seeds of a company that could reshape the way big businesses operate, and within six months, they convinced the Yahoo board to spin off Baldeschwieler and about 25 other engineers.

Dubbed Hortonworks, the new venture is by no means guaranteed success, but it certainly has its hands on the right technology. Much as the world can’t quite grasp Eric Baldeschwieler’s last name, the pundits are still struggling to wrap their heads around the fact that Yahoo bootstrapped one of the most influential software technologies of the last five years: Hadoop, an open source platform designed to crunch epic amounts of data using an army of dirt-cheap servers.

Today, Hadoop underpins not only Yahoo, but Facebook, Twitter, eBay, and dozens of other high-profile web outfits. It analyzes the vast amounts of data generated by these online operations, but it also pumps data into live public applications, including Facebook’s new messaging services and the “Today” module that serves up news stories on the Yahoo homepage.

Last year, eBay erected a Hadoop cluster spanning 530 servers. Now it’s five times that large, and it helps with everything from analyzing inventory data to building customer profiles using real live online behavior. “We got tremendous value, tremendous value, out of it, so we’ve expanded to 2,500 nodes,” says Bob Page, vice president of analytics at eBay. “Hadoop is an amazing technology stack. We now depend on it to run eBay.”

Thanks to its success on the web, the platform is primed for use in the business world, leading a wave of technologies spilling out of the net’s biggest names and into the corporate data center. “Hadoop is not just a laboratory curiosity,” says Jim Kobelius, an analyst with research outfit Forrester, who spent the past few months interviewing companies about their use of the technology. “It’s actually in use in operational environments today.”

Giants such as IBM, EMC, Oracle, and even Microsoft are pitching Hadoop tools at corporate customers. An all-star startup dubbed Cloudera has sprung up around the technology, counting among its ranks Hadoop’s original developer, Doug Cutting, who once worked for Baldeschwieler at Yahoo. And now, Cloudera and Cutting have big competition from his former boss. Hortonworks officially opened for business in July with Rob Bearden as chief operating officer and Baldeschwieler as CEO.

In today’s internet-driven world, more and more data is hitting big businesses, and it’s hitting them faster. Hadoop is a way of dealing with that data, and Hortonworks aims to take the open source project mainstream. “There’s a change happening, driven by unprecedented volumes and velocities of unstructured data,” Rob Bearden says. “Traditional relational databases and business intelligence software can’t handle this. The thesis is that Hadoop can.”

Open Source Déjà Vu

If you had stumbled onto Rob Bearden and Eric Baldeschwieler as they sat down for dinner that night in Palo Alto, you might have wondered what on earth brought them together. Born and raised in Georgia before graduating with a degree in marketing from Jacksonville State University in Alabama, Bearden is a man who knows how to get a point across with his lilting Southern accent, whereas Baldeschwieler, aka Eric14, is very much the laconic software engineer. He gets his point across with code.

Peter Fenton — a partner with Hortonworks’ chief backer, Silicon Valley VC firm Benchmark Capital — sees the contrast. He describes Bearden as a “true Southern gentleman”, before referring to Baldeschwieler as an “introverted architect.” But for a venture like Hortonworks, Fenton says, Bearden and Baldeschwieler are ideally suited.

“Eric is the ‘editor in chief’, whereas Rob is the ‘publisher,’” Fenton explains. “There’s the person who’s the conceptual authority for the company, who knows where to go next, and then there’s the person who actually builds the business, who builds in rigor and general management and processes. They have to complement each other in such a way that they have to be different.”

Based in Sunnyvale, Hortonworks aims to expand the scope of the open source Hadoop project, adding all the tools the average corporate operation would need, before eventually settling on a way to make money through service and support. Baldeschwieler — who ran the Hadoop team at Yahoo from the very beginning — will oversee the coding. Bearden will hunt down the revenue.

Rob Bearden

Bearden has played the “publisher” role before. Nearly a decade ago, Fenton hired him as president and chief operating officer at JBoss, an Atlanta-based outfit that grew up around the JBoss open source Java application server, where the project’s founder, Marc Fleury, served as CEO. Bearden then moved to another Benchmark-backed, open source outfit, SpringSource, where he worked as COO alongside chief exec and project founder Rod Johnson.

Fleury declined to discuss his time with Bearden, but another JBoss colleague, Joe McGonnell, says that underneath Bearden’s smooth manner is the sort of ruthlessness that makes things happen in the business world. “On the surface, he has a laid-back personality and a Southern charm if you will, but at the same time, he’s very intense,” McGonnell says. “He’s a very passionate guy. He really locks in on the opportunity he’s working on.”

At both JBoss and SpringSource, Bearden got results. In just two years, JBoss reached a $60 million revenue run rate, before it was purchased by Red Hat for at least $350 million in 2006, and SpringSource was scooped up by VMware for approximately $362 million in 2009. He didn’t have the same success with his venture in between those two big sales — the Atlanta-based OpenSpan — but two out of three ain’t bad.

In many ways, Hortonworks is déjà vu all over again. Open source project. Funding from Peter Fenton. Engineer as chief exec. Bearden as the businessman. But his latest venture is also a bit different. The immediate aim is to expand the open source project, not sell subscriptions to a distro based on the existing open source code. The open source code, Bearden says, isn’t yet ready for the enterprise, and Hortonworks aims to change that.

“We believe that for this market to grow and grow fast, everything has to happen in open source,” says Bearden. “We need to make the open source project enterprise viable. Right now, there are less than 50 production sites of Hadoop in the world, but nearly every Fortune 500 company is evaluating it. We have to make sure it’s ready for them to use.”

Bearden is coy about how the company intends to actually make its money, and on the surface, his let’s-just-expand-the-open-source-project pitch seems naively idealistic. But it’s hard to argue with his track record, and those who know him say he has a knack for seeing where a market will eventually go. “There’s a lot of really talented people that have worked with Rob that keep following him around from company to company, and that says a lot about him,” says McGonnell. “They know he’s got a knack for finding companies that have an opportunity to change a market, not companies that can grow at a respectable rate.”

Hortonworks is his most ambitious play yet. The ultimate goal is not to sell the company to a big-name tech outfit. Hortonworks just spun out of a big-name tech outfit. Bearden believes the market for Hadoop is so large that Hortonworks will eventually grow into a public company to rival the likes of Red Hat and VMware. “This is clearly one we build to go public,” Bearden says. “The whole data layer for the enterprise is shifting. That’s the market opportunity for this company.”
