Blue Gene was conceived in 1999 by IBM Research as a project to test a minimalist, massively parallel supercomputer design. The idea was to combine a very large number of fairly simple dual-core PowerPC processors into an energy-efficient Linux box capable of breaking through the 1 petaflops barrier using as many as 1 million processors.
By 2002, IBM had talked the US Department of Energy into spending $290m to get a slice of the Blue Gene box, now known as Blue Gene/L and installed at Lawrence Livermore National Lab, as well as to build the ASCI Purple parallel Power-AIX supercomputer. Both machines are involved in managing America’s nuclear weapons stockpile and in designing a new generation of warheads, which must be designed and tested virtually rather than physically because of the Nuclear Test Ban Treaty. The current iteration of Blue Gene/L is based on 131,072 PowerPC 440 cores (65,536 dual-core chips) running at 700 MHz and has a Linpack Fortran benchmark rating of 280.6 teraflops.
With the second-generation Blue Gene/P, IBM is moving to a faster 850 MHz clock speed on the PowerPC processor cores and also upgrading from dual-core PowerPC 440s to quad-core PowerPC 450s. IBM expects the peak aggregate computing power of Blue Gene/P to be around 3 petaflops, with sustained performance in the range of 1 petaflops. The Blue Gene/P design puts 4,096 processor cores in a rack, and the full configuration will need 216 racks and 884,736 cores to reach that peak rating of 3 petaflops. IBM is also adding symmetric multiprocessing capability to the Blue Gene/P boards, but it is unclear how far that SMP will scale.
IBM puts four PowerPC 450 cores in a single chip package, which has 8 MB of L3 cache memory, double that of the prior machine. Each Blue Gene/P system board has 32 sockets, and each socket takes one of these quad-core modules (QCMs), giving 128 cores per board. Each board can support 2 GB of main memory (four times that of prior boards), and a rack of Blue Gene/P holds 32 of these boards. That gives 4,096 cores per rack. Each Blue Gene/P QCM can do 13.6 gigaflops, which gives 13.93 teraflops per rack. Add up 216 of these racks with a high-speed optical network and you hit 3 petaflops.
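The rack arithmetic above is easy to check with a few lines of Python. This is only a sketch built from the figures quoted in this article; the function name and the dictionary keys are made up for illustration, and the conversion uses decimal units (1,000 gigaflops per teraflops).

```python
# Figures quoted in the article; names are illustrative, not IBM's.
CORES_PER_QCM = 4         # quad-core module
SOCKETS_PER_BOARD = 32    # one QCM per socket
BOARDS_PER_RACK = 32
RACKS_FULL_CONFIG = 216
GFLOPS_PER_QCM = 13.6     # peak rating per QCM


def blue_gene_p_totals(racks=RACKS_FULL_CONFIG):
    """Return core counts and peak ratings for a given number of racks."""
    cores_per_board = CORES_PER_QCM * SOCKETS_PER_BOARD
    cores_per_rack = cores_per_board * BOARDS_PER_RACK
    qcms_per_rack = SOCKETS_PER_BOARD * BOARDS_PER_RACK
    # Decimal units assumed: 1 teraflops = 1,000 gigaflops.
    tflops_per_rack = qcms_per_rack * GFLOPS_PER_QCM / 1000
    return {
        "cores_per_board": cores_per_board,
        "cores_per_rack": cores_per_rack,
        "total_cores": cores_per_rack * racks,
        "tflops_per_rack": tflops_per_rack,
        "peak_pflops": tflops_per_rack * racks / 1000,
    }


if __name__ == "__main__":
    for name, value in blue_gene_p_totals().items():
        print(f"{name}: {value}")
```

Calling `blue_gene_p_totals(racks=72)` gives the same breakdown for a smaller, hypothetical configuration.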
Linux applications written for Blue Gene/L will run on Blue Gene/P. Importantly, given the amount of electricity supercomputers consume and the heat they generate, Big Blue is claiming that Blue Gene/P will be a factor of seven more power-efficient than other supercomputers. This comparison presumably excludes Blue Gene/L, which is itself very energy efficient compared to x64 clusters or parallel vector supers.
The Department of Energy is footing the bill for the first Blue Gene/P machine, and the agency will install it at its Argonne National Laboratory later in 2007. The Max Planck Society and Forschungszentrum Jülich in Germany are each planning to install Blue Gene/P machines later this year, and Stony Brook University and Brookhaven National Laboratory in New York and the Daresbury Laboratory in Cheshire, England, have also placed orders for Blue Gene/P machines.
IBM is not providing pricing information on Blue Gene/P. The initial Blue Gene/L machine came from a $200m investment by IBM in its research arm back in 1999, and then Uncle Sam paid $100m for the first box in 2002, which eventually reached a peak performance of 360 teraflops. That worked out to $278 per gigaflops for Blue Gene/L. If you assume some price/performance improvements, then a 1 petaflops Blue Gene/P should set you back about $150m or so.
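For anyone who wants to redo the price/performance sums, here is a rough Python sketch using only the dollar and teraflops figures quoted above. The helper name is invented for illustration, and the Blue Gene/P price is this article's guess, not an IBM list price. Note too that the Blue Gene/L figure is a peak rating, while the article does not say whether the 1 petaflops Blue Gene/P figure is peak or sustained.

```python
def dollars_per_gigaflops(price_usd, teraflops):
    """Price divided by performance, using 1 teraflops = 1,000 gigaflops."""
    return price_usd / (teraflops * 1000)


# Blue Gene/L: $100m for a box that reached 360 teraflops peak.
bgl_cost = dollars_per_gigaflops(100e6, 360)

# Blue Gene/P: the article's estimate of about $150m for 1 petaflops.
bgp_cost = dollars_per_gigaflops(150e6, 1000)

# Fraction of the Blue Gene/L price per gigaflops implied by that estimate.
improvement = bgp_cost / bgl_cost

if __name__ == "__main__":
    print(f"Blue Gene/L: ${bgl_cost:.0f} per gigaflops")
    print(f"Blue Gene/P (estimate): ${bgp_cost:.0f} per gigaflops")
    print(f"Blue Gene/P cost as a fraction of Blue Gene/L: {improvement:.2f}")
```

At the Blue Gene/L rate, a petaflops would cost about $278m, so the $150m guess assumes the price per gigaflops falls to a little over half of what it was.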