This week’s issue of EE Times carries a story, “Pflops here; now what?”, about IBM’s new 1-petaFLOPS supercomputer, the Roadrunner, and how its designers are scrambling to run benchmarks in advance of the annual International Supercomputing Conference (ISC) being held June 17th-20th. It’s an article (dare I say, a puff piece?) about IBM, but it does mention competing supercomputers from Japanese vendors. However, it makes no mention of distributed computing projects like SETI@Home or, more importantly, of the Google computing cluster.
The BOINC projects (which include SETI@Home) average 1.1 petaFLOPS on a sustained basis, day in and day out. Of course, these are specific algorithms tailored to the extremely distributed nature of a system in which individual computer users volunteer their spare cycles for a good cause. So maybe this approach doesn’t count with the real supercomputer folks.
But what about Google? There are several approaches to supercomputing, including vector processors (one instruction applied to many data elements), multiprocessors (typically one OS controlling multiple processing cores), and clusters (multiple processors and multiple OS instances connected by a high-speed network). Over time, all the big machines have migrated to the third approach. Indeed, IBM’s new Roadrunner is made up of 3,240 separate compute modules, each of which is a multiprocessor. Well, that’s exactly what Google has been doing since its inception. While they don’t tout their technology, they did publish a paper in 2003, The Google Cluster Architecture, describing how the early system worked (fewer than 15,000 servers in those days).
Today, Google has perhaps 20 to 100 petaFLOPS of processing power in their distributed computing system. In mid-2006, the New York Times estimated Google had 450,000 interconnected servers in their various server farms. Their capital budget continues to expand, they continue to hire (including for very supercomputer-specific jobs), and they are building a global fiber optic network to better connect their distributed server farms, so it’s reasonable to assume Google has well over 500,000 servers on-line today. None of these machines is more than 3 years old, with an average age nearer 15 months, based on the economics described in the 2003 paper. A new server for late 2007 and early 2008 has dual quad-core Xeon processors at 2.5 GHz or 3 GHz. Intel claims the quad-core Xeon provides 77-81 gigaFLOPS, and today’s servers have two such processors, i.e., roughly 160 GFLOPS per machine. Let’s discount that for Intel hype and for the fact that the average Google server is whatever commercial machines of January 2007 could do: say, 100 GFLOPS. And let’s assume they haven’t added new buildings and new servers and have only 500,000 machines in their cluster. That’s still 50 petaFLOPS.
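To make that back-of-envelope arithmetic explicit, here is a minimal Python sketch of the estimate. The server count and per-server figure are the assumptions stated above, not measured numbers.

```python
# Back-of-envelope estimate of Google's aggregate compute.
# Both inputs are assumptions from the paragraph above, not measured figures.

servers = 500_000          # conservative guess at machines on-line
gflops_per_server = 100    # discounted from Intel's ~160 GFLOPS dual quad-core Xeon claim

total_gflops = servers * gflops_per_server
total_petaflops = total_gflops / 1_000_000   # 1 petaFLOPS = 1,000,000 GFLOPS

print(f"Estimated aggregate: {total_petaflops:.0f} petaFLOPS")  # prints 50 petaFLOPS
```

Even if you halve either assumption, the total stays an order of magnitude above Roadrunner’s 1 petaFLOPS.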
Note that Google also has an A-team of researchers who occasionally publish fascinating glimpses into what’s going on.
I’ve never attended an International Supercomputing Conference — it’s a little out of my field — but I’d be interested to know if there is any public recognition, at the ISC, of what’s going on within the Googleplex. I don’t see any speakers from Google or any mention of Google on the ISC website. Have the supercomputer folks been bypassed and they don’t even know it?