NCube Corp yesterday launched the third generation of its massively parallel line, the nCube 3, claiming it to be the industry’s first practical TeraFLOPS-class machine and saying that a large configuration costs as little as $40,000 per GigaFLOPS. The first systems, aimed at the scientific and technical markets, are due to ship in the second quarter of next year. Entry-level nCube 3 systems cost about $500,000.
The hypercube machine, previewed in detail all of two years ago (CI No 1,936), uses the fourth iteration of the company’s custom processor, a 3m-transistor, 0.5 micron CMOS part that is described as integrating all the components for parallel computing on a single chip. The chip, made for nCube by Hewlett-Packard Co, is clocked at 50MHz and is rated by the company at 100 MFLOPS; it can directly address up to 1Gb of main memory and has an 800M-byte per second memory interface. It has a translation lookaside buffer for virtual memory support, 64-bit data paths, and on-board communications and input-output processors.
The basic building blocks of the system are the Processor Module, the I/O Module, and the Disk Module, which can be mixed and matched to create the desired configuration. The Processor Module contains up to 512 nodes, for up to 50 GFLOPS performance, and takes up to 32Gb of memory; it occupies nine square feet, and up to 20 can be linked to create the 1.0 TeraFLOPS configuration of 10,240 nodes. The I/O Module comes with up to 128 discrete input-output channels of 44M-bytes per second bandwidth each, 5.6G-bytes per second aggregate, and up to 10 can be linked in a single system. The Disk Module can take up to 120 1Gb, 2Gb or 4Gb 3.5-inch fast and wide SCSI drives and offers hot-swappable disks, error monitoring, and redundant power supplies; a system can have up to 20 Disk Modules.
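The headline figures can be cross-checked against the per-node and per-channel ratings quoted above. The short Python sketch below does only that arithmetic. It assumes decimal units (1,000 MFLOPS to the GFLOPS, 1,000M-bytes to the G-byte), and the function names are illustrative rather than anything nCube publishes. The implied price applies the quoted $40,000 per GigaFLOPS rate to the largest configuration, which is an assumption and not a list price.

```python
# Figures quoted in the article.
MFLOPS_PER_NODE = 100
NODES_PER_PROCESSOR_MODULE = 512
MAX_PROCESSOR_MODULES = 20
CHANNELS_PER_IO_MODULE = 128
MBYTES_PER_SEC_PER_CHANNEL = 44
DOLLARS_PER_GFLOPS = 40_000  # "as little as", for a large configuration


def module_gflops() -> float:
    """Peak rating of one fully populated Processor Module, in GFLOPS."""
    return NODES_PER_PROCESSOR_MODULE * MFLOPS_PER_NODE / 1000


def max_nodes() -> int:
    """Node count with the maximum number of Processor Modules linked."""
    return NODES_PER_PROCESSOR_MODULE * MAX_PROCESSOR_MODULES


def system_tflops() -> float:
    """Peak rating of the largest configuration, in TFLOPS."""
    return max_nodes() * MFLOPS_PER_NODE / 1_000_000


def io_module_gbytes_per_sec() -> float:
    """Aggregate bandwidth of one fully populated I/O Module, in G-bytes/s."""
    return CHANNELS_PER_IO_MODULE * MBYTES_PER_SEC_PER_CHANNEL / 1000


def implied_max_system_price() -> int:
    """Assumption: the quoted per-GFLOPS rate applied to the largest system."""
    total_gflops = max_nodes() * MFLOPS_PER_NODE // 1000
    return total_gflops * DOLLARS_PER_GFLOPS


if __name__ == "__main__":
    print(f"Processor Module: {module_gflops():.1f} GFLOPS")
    print(f"Largest system:   {max_nodes():,} nodes, {system_tflops():.3f} TFLOPS")
    print(f"I/O Module:       {io_module_gbytes_per_sec():.2f} G-bytes/s")
    print(f"Implied price:    ${implied_max_system_price():,}")
```

On these numbers a full Processor Module works out to 51.2 GFLOPS and the 10,240-node system to 1.024 TeraFLOPS, so the article's "up to 50 GFLOPS" and "1.0 TeraFLOPS" are rounded figures. The 128 channels give 5.632G-bytes per second, in line with the 5.6 quoted.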
In the hypercube topology, an adaptive routing mechanism increases interprocessor bandwidth by automatically seeking the most efficient path between nodes and skipping any failed nodes; unused input-output channels can become hypercube interconnections. The Foster City, California company’s Parallel Software Environment for program development and execution contains the nCX microkernel, which runs on every processor and input-output node and takes under 512Kb of memory while providing low-overhead messaging and integral message acknowledgement. A Posix agent is also available. NCube Languages compilers are offered for C, C++, Fortran 77, and High Performance Fortran.
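For readers unfamiliar with hypercube routing, the idea of finding the shortest path while skipping failed nodes can be illustrated in a few lines of Python. This is a generic textbook sketch, not nCube's hardware mechanism, whose details the company has not described here. It assumes nodes are numbered with d-bit integers so that neighbours differ in exactly one bit, and it uses a plain breadth-first search; the function names and the `failed` parameter are hypothetical.

```python
from collections import deque


def neighbors(node: int, dim: int) -> list[int]:
    """Nodes one hop away: flip each of the dim address bits in turn."""
    return [node ^ (1 << bit) for bit in range(dim)]


def route(src: int, dst: int, dim: int, failed=frozenset()):
    """Shortest path from src to dst in a dim-dimensional hypercube.

    Nodes listed in `failed` are never visited. Returns the path as a
    list of node numbers, or None if dst cannot be reached.
    """
    if src in failed or dst in failed:
        return None
    parent = {src: None}          # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:   # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in neighbors(node, dim):
            if nxt not in failed and nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


if __name__ == "__main__":
    print(route(0b000, 0b011, dim=3))                 # no failures
    print(route(0b000, 0b011, dim=3, failed={1, 2}))  # forced detour
```

With no failures, the hop count equals the number of address bits in which source and destination differ, so nodes 0 and 3 in a three-dimensional cube are two hops apart. With nodes 1 and 2 marked as failed, the same pair is still reachable, but only over a four-hop detour.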