Earth Simulator, a massively parallel implementation of NEC’s SX series of vector supercomputers, was not only the most powerful supercomputer in the world for the past few years, but also the largest vector supercomputer ever built. To oversimplify, vector supercomputers are designed from the ground up for high memory bandwidth, low latency, and very efficient floating point and matrix math. Conventional parallel supercomputers, by contrast, are built from stock RISC/Unix engines, and clusters are built from standard x86 iron that generally – but not always – lacks the high-speed interconnects that vector or parallel Unix boxes have. Of these three architectures competing for the HPC budget money, the x86 clusters are winning – and winning big.

Having said that, the top of the Top 500 list is generally dominated by more exotic designs, and the first quarter of the Blue Gene/L supercomputer that IBM Corp has just completed building in its Rochester, Minnesota factory now ranks as the most powerful supercomputer in the world, with 91.75 teraflops of peak floating point performance and 70.72 teraflops of sustained performance as measured by the Linpack Fortran benchmark test.

It wasn’t all that long ago that the entire Top 500 list was measured in tens of teraflops, and when Lawrence Livermore National Laboratory takes final delivery of Blue Gene/L early next year, this behemoth will have 131,072 customized PowerPC 440 cores running at 700MHz and will deliver over 360 teraflops of peak computing power. Blue Gene/L, as you might have guessed from the name, runs a cut-down version of Linux on its compute nodes and Novell’s SuSE Linux Enterprise Server 9 on its I/O and management nodes.
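Those peak numbers fall straight out of the core counts and the clock speed: each PowerPC 440 core in Blue Gene/L has a double floating point unit that can retire four flops per cycle (two fused multiply-adds). Here is a rough back-of-the-envelope check in Python, assuming that four-flops-per-cycle figure; it is an illustration of the arithmetic, not IBM’s own accounting:

```python
def peak_teraflops(cores: int, clock_ghz: float, flops_per_cycle: int = 4) -> float:
    """Theoretical peak in teraflops: cores x clock (GHz) x flops issued per cycle."""
    return cores * clock_ghz * flops_per_cycle / 1000.0  # gigaflops -> teraflops

# Quarter-size Blue Gene/L as ranked in the November 2004 list
print(peak_teraflops(32_768, 0.7))   # ~91.75 teraflops peak
# Full 131,072-core machine due at Lawrence Livermore next year
print(peak_teraflops(131_072, 0.7))  # ~367 teraflops peak, i.e. "over 360"
```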

The number two system on the Top 500 list is a new machine as well, the Columbia parallel Altix system made by SGI Inc for NASA’s Ames Research Center. Columbia uses 10,160 of Intel Corp’s 1.5GHz Madison Itanium 2 processors and delivers 51.9 teraflops of sustained performance; it, too, is a Linux machine.

Earth Simulator drops to number three on the list. The vector machine was the original supercomputer design, coming first from Control Data in the 1960s and then from Cray Research in the 1970s (both companies had Seymour Cray as their main designer), and while there is still a reasonably large installed base of applications that run on vector supercomputers, as far as the Top 500 list is concerned, vector machines are not a significant factor.

In the November 2004 list, Earth Simulator is one of 21 vector machines made by NEC, Cray, and Hitachi. Earth Simulator, with 5,120 processors, has an aggregate of 35.9 teraflops of sustained processing power, while the remaining 20 vector systems on the list have a combined 32.8 teraflops. While Cray has been enthusiastic about getting vector customers to move to the Cray X1 and the future X1E, you can see why it has productized the Red Storm Linux-Opteron cluster for high-end HPC machines and acquired OctigaBay for midrange and moderately high-end Linux-Opteron clusters. That is where the big money will be, even if the X1E lets Cray keep its installed base happy.

Earth Simulator is a big, hot, expensive machine to operate, and it represents perhaps the last vector supercomputer of its kind. Even if the Japanese government keeps pumping money into NEC to support the SX series of vector machines, which plenty of businesses use (albeit in much smaller configurations), it is highly doubtful that anyone will ever acquire a machine anything like Earth Simulator again.

Coming in at number four on the Top 500 list is another new machine, a cluster built from IBM’s PowerPC 970-based BladeCenter JS20 blade servers for the Barcelona Supercomputer Center in Spain. This machine has 3,564 of IBM’s 2.2GHz PowerPC 970 chips and delivers 20.5 teraflops of sustained computing.

Number five on the list is an existing Linux-Itanium cluster called Thunder, built by California Digital for LLNL and rated at 13.9 teraflops, followed by the System X cluster that Virginia Tech built out of PowerPC 970-based Xserve machines from Apple Computer Inc. Number eight on the list is a prototype of the Blue Gene/L machine that uses 8,192 of IBM’s 500MHz PowerPC 440 cores and is rated at 11.7 teraflops.

Of the 500 systems on the list, 320 are based on Intel Xeon or Itanium processors and another 31 are based on AMD’s Opteron processors, which means chips from Intel and AMD together now account for just over 70% of the systems on the list. That is almost double the number of such systems that were on the list a year ago, which just goes to show that this iron is clearly displacing RISC/Unix architectures. IBM’s Power processors were used in 54 systems in the Top 500, followed by 48 systems using Hewlett-Packard Co’s PA-RISC processors.

By vendor, IBM accounts for 43.2% of the systems and 49.3% of the aggregate 1.127 petaflops of computing power represented in the list. HP is the number two vendor represented on the list, with 34.6% of systems and 21% of the total teraflops of computing power.

By geographic region, the number of machines on the list located in the US has increased to 267, up from 247 a year ago. Europe accounts for 127 systems, followed by Asia (including Japan) with 87 systems. By architecture, 296 systems are clusters, and another 100 are massively parallel processors (MPPs), which have tighter links between their nodes; these include IBM’s Blue Gene/L, SGI’s Columbia, and NEC’s Earth Simulator.

All of the vector machines are technically MPPs. All of the hyperplexed Superdome machines in the Top 500 list are classified as constellation architectures, as are two reasonably large PrimePower HPC 2500 supercomputers made by Fujitsu-Siemens for the National Aerospace Laboratory of Japan and for Kyoto University, as well as the Franklin Wildfire cluster built from Sun Fire 15000 servers for Cambridge University in England.

The Top 500 list is based on the Fortran Linpack benchmark that was created decades ago by Jack Dongarra of the University of Tennessee. The list is compiled by Dongarra along with Hans Meuer of the University of Mannheim in Germany and Erich Strohmaier and Horst Simon of NERSC/Lawrence Berkeley National Laboratory. Dongarra is working on a new benchmark for supercomputers that he hopes will be a better gauge of how they will perform on real-world workloads; many applications do not perform as well on clusters as their Linpack numbers would lead everyone to believe.
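For readers who want a feel for what the Linpack number actually measures, here is a minimal Python sketch of the idea behind it: solve a dense n-by-n system of linear equations and credit the machine with the roughly 2/3·n³ + 2·n² floating point operations that the standard LU-based solve requires. This is only an illustration of the metric, not the actual HPL code that Top 500 submissions run:

```python
import time
import numpy as np

def linpack_like_gflops(n: int = 4096, seed: int = 0) -> float:
    """Crude Linpack-style measurement: solve a dense Ax = b and
    convert the elapsed time into billions of flops per second."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    b = rng.standard_normal(n)

    start = time.perf_counter()
    x = np.linalg.solve(A, b)          # LU factorization plus triangular solves
    elapsed = time.perf_counter() - start

    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2   # operation count credited by Linpack
    assert np.allclose(A @ x, b, atol=1e-6)   # sanity check on the residual
    return flops / elapsed / 1e9

if __name__ == "__main__":
    print(f"{linpack_like_gflops():.1f} gigaflops on this machine")
```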