Intel’s Radek Walcyzk, head of PR for the chipmaker’s server division, called Wired today with Intel’s official response to the ARM-based microserver news from Tuesday. In a nutshell, Intel would like the public to know that the microserver phenomenon is indeed real, and that Intel will own it with Xeon, and to a lesser extent with Atom.
Now, you’re probably thinking, isn’t Xeon the exact opposite of the kind of extreme low-power computing envisioned by HP with Project Moonshot? Surely this is just crazy talk from Intel? Maybe, but Walcyzk raised some valid points that are worth airing.
Intel is way way ahead of ARM in this segment
The first thing Walcyzk was keen to emphasize was that Intel has officially been on the microserver bandwagon since 2009, when the company announced support for the idea and began talking about it as a segment. Of course, what the term meant then was lower-power Xeons (Lynnfield, to be specific). It was only this past March that Intel began talking about Atom in the server space, well after SeaMicro had paved the way by launching an Atom-based server product that the startup had been working on since 2007. Intel also came around on the Atom issue well after SuperMicro, SGI, and other system vendors had been shipping Atom-based server products since 2009. But Intel’s lateness to this game aside, what is clear is that the chipmaker is still very far ahead of ARM in this market by every conceivable metric.
When you combine the vast x86-based software ecosystem with the fact that multiple companies have been shipping Atom-based microserver products for over two years now, the nascent ARM-based microserver movement is playing catch-up and will be for quite a while. Certainly there has been quite a bit of hype around ARM in the cloud for the past two years or so, but only now are we seeing prototypes like HP’s Redstone materialize, and real products won’t even ship until next year.
Xeon still wins at Big Data
Related to the point above is the fact that all of the ARM microserver numbers for efficiency gains that are being stuffed into slide decks and fed to the press are based on simulations (often with FPGAs), not on shipping systems. Meanwhile, Intel has had two years to gather data from live cloud workloads running on x86-based production systems, and the company claims that this data suggests something surprising about which chip is the best choice for the majority of these types of power-sensitive, I/O-bound workloads.
Walcyzk told Wired that Intel definitely sees a huge demand for microservers. “This is a category that we think may be up to 6 to 10 percent of the entire x86 market by 2015.” And for two thirds of this growing microserver segment, Intel estimates that Xeon will actually be the best performance per watt per dollar choice, with Atom serving the remaining third.
The idea that Intel’s giant Xeon processors could be a better fit for Hadoop-style, “slow Big Data”, I/O-bound workloads will sound like heresy to anyone who’s watching this space closely right now, but I’m not so quick to dismiss the idea. This is because something has been bothering me about this microserver, “physicalization” trend since it first gained traction two years ago, and that’s the fact that higher levels of die-level integration (i.e. Xeon) should trump board- and chassis-level integration (i.e. a bunch of ARM chips in a rack) every time in terms of cost, efficiency, and performance.
I raised this point two years ago at Ars Technica, and I’ve yet to read a good response to it from the physicalization crowd:
In the final accounting, it still seems that if you’re going to buy, say, 32 cores worth of… processing power, then you’re better off in terms of both cost and wattage with eight quad-core sockets than you are with 32 single-core sockets, no matter how low-power the single-core parts are individually. But if this is true, then why would any vendor use board-level integration to produce a “physicalized” server solution? The likely answer has to do with how processor vendors like Intel price their products.
I ultimately answered the question above by attributing physicalization’s success to the vagaries of Intel’s market segmentation strategy:
In conclusion, Moore’s Law overwhelmingly favors die-level integration, and in theory this should give the price advantage to multicore products. But the real-world pricing structure of [chip] vendors in the multicore era makes it cheaper to buy cores individually. The rationale for physicalization, then, is that it exploits this margin difference in order to pit [Intel's] low-end parts against its high-end parts, in spite of the fact that the vendor’s entire pricing structure depends on the idea that these parts don’t compete with one another.
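The margin argument quoted above can be made concrete with a quick back-of-envelope sketch. The prices and wattages below are hypothetical placeholders I’ve chosen purely for illustration, not actual Intel or ARM list prices; the point is only that a steep enough margin differential lets a pile of low-end sockets undercut a handful of high-end ones on both dollars and watts, even when die-level integration is more efficient in principle.

```python
# Back-of-envelope comparison of die-level vs. board-level integration
# for 32 cores' worth of compute. All prices and wattages here are
# hypothetical, chosen only to illustrate the margin effect; they are
# not real chip prices or TDPs.

def cluster_cost(sockets, price_per_socket, watts_per_socket):
    """Total purchase price and power draw for a homogeneous cluster."""
    return sockets * price_per_socket, sockets * watts_per_socket

# Scenario A: 8 quad-core sockets (high-end, high-margin parts).
quad_price, quad_power = cluster_cost(8, 1200, 95)

# Scenario B: 32 single-core sockets (low-end, low-margin parts).
single_price, single_power = cluster_cost(32, 75, 8)

print(f"8x quad-core:   ${quad_price}, {quad_power} W")
print(f"32x single-core: ${single_price}, {single_power} W")

# With this (invented) pricing, the low-end parts win on both dollars
# and watts: the vendor's segmentation strategy, not physics, is what
# opens the gap that physicalization exploits.
```

Flip the per-socket price of the quad-core part down toward four times the single-core price and the advantage evaporates, which is exactly the lever a “properly priced” many-core Xeon would pull.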
I then concluded that physicalization was a fad, and that a multicore plus virtualization combo would beat it in the long run. In other words, I predicted what Intel now claims its data is showing: that a single, properly priced, low-power, many-core Xeon socket can still beat a fistful of ARM sockets for non-compute-bound, highly parallel workloads.
I have to admit that I’ve recently moved a bit away from this position and toward the point-of-view of the ARM camp on this question. I now think that the optimal architecture for Hadoop is a bunch of RAM banks with a cheap CPU core attached to each bank, which is essentially what Calxeda has in the EnergyCard prototype, and what SeaMicro is selling. So for batch Big Data workloads, the real power consumption is going to be on the storage and I/O side of the machine—i.e., in moving data into RAM and keeping it there while a lightweight core grinds through it.
But the jury is definitely still out on what the best architecture is for these workloads. All we know for sure is that Intel’s current Xeon is definitely not what the Hadoop crowd wants, hence the excitement around ARM and Atom. This doesn’t mean that a future Xeon iteration couldn’t slide into this space comfortably, but it would have to be designed for a much lower set of power and price points than the current Xeon family.
Intel seems to be getting this message, because this past March the chipmaker updated its microserver roadmap to include Xeon parts that range from 45W down to 20W. The company will also have a sub-10W Atom part available next year.
In the long run, I agree with Intel’s Justin Rattner, who said over a year ago that something like the company’s Single Chip Cloud Computer prototype represents the best architecture for these kinds of workloads. Instead of Xeon with virtualization, I could easily see a many-core Atom or ARM cluster-on-a-chip emerging as the best way to tackle batch-oriented Big Data workloads. Until then, though, it’s clear that Intel isn’t going to roll over and let ARM just take over one of the hottest emerging markets for compute power.