The European Commission is convinced that, in this age of cost-conscious, time-conscious, customer-focused business, the use of high performance parallel computing will bring huge benefits and a competitive edge to European business. However, in order to convince people that this is the case, a Catch 22 situation had to be addressed. While parallel computer hardware is available, there is very little software to exploit its potential. Application developers argue that there is no end-user market for parallel code, and end users feel there are no applications to run on parallel architectures. So to break this stalemate, the European Commission established Europort, an initiative to produce commercially usable parallel code, and to demonstrate the benefits of parallel computer technology for industry. Europort, which is nearing the end of its two-year life span, is part of the Community’s Esprit III research program. Its goal was to take 38 real-world industrial applications, and to parallelize them in a portable way, enabling them to run on anything from a cluster of workstations to a massively parallel system.
Head-on crash
Each project within Europort involved the code owners, parallel software experts and end users. The aim was to produce robust, portable, parallel applications with immediately obvious industrial advantage, and to benchmark the resulting software on at least two different parallel systems. The project was split into Europort 1, which deals with fluid dynamics and structural mechanics, and Europort 2, managed by Smith System Engineering Ltd, which includes 24 code sets from a range of applications including computational chemistry, databases, oil and gas, electromagnetics, radiotherapy, animation and drug design. With the two-year project nearing completion, participants have been reporting the results of their experiences and benchmark tests, and most of those reporting seem to agree that the effort involved in parallelizing code is worth it.
The CAMAS-Link consortium’s code owner is Engineering Systems International Ltd, author of the PAM-CRASH car crash simulation software. German car makers Volkswagen Audi and BMW were the project’s end users. Engineering Systems’ project manager Guy Lonsdale said the car industry is demanding ever more complex crash simulations, to be used in the design stage of new models. Typical simulations include a head-on crash at 20 miles per hour, and a 40% offset crash at 10 and 30 miles per hour. The PAM-CRASH code was migrated to distributed-memory computers using portable message-passing libraries, to make it completely machine independent. The goal was to see whether parallel computing could provide a significant cost-performance advantage over both mainframe sequential supercomputers and symmetric multiprocessing systems. The Europort mission was not to compare different manufacturers’ computers, but to show the benefits of writing software that truly exploits the power of parallel computing.
CAMAS-Link ran each of the crash simulation models on a Cray Research Inc Y-MP single-processor supercomputer, and on one of three parallel systems: a Meiko Scientific Ltd CS2, a Parsytec GmbH GC PowerPlus, or an IBM Corp SP2. The 30-mile-per-hour offset crash ran with an elapsed time of 96,930 seconds on the single Cray processor, 39,380 seconds on a 16-processor IBM RS/6000 SP2 and 31,730 seconds on an SP2 with 24 processors.
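To put those elapsed times in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration using the three figures quoted above; the variable and function names are invented for the example. The ratios compare different machines, so they indicate how much sooner a result arrives rather than how efficiently any one machine uses its processors. On these figures the 16-processor SP2 finished in roughly two-fifths of the Cray's time and the 24-processor SP2 in roughly a third.

```python
# Elapsed times in seconds for the 30-mile-per-hour offset crash,
# as quoted in the article. The labels are illustrative only.
elapsed_seconds = {
    "Cray Y-MP, 1 processor": 96_930,
    "IBM SP2, 16 processors": 39_380,
    "IBM SP2, 24 processors": 31_730,
}


def time_ratio(baseline_seconds, other_seconds):
    """Return how many times shorter the other run was than the baseline."""
    return baseline_seconds / other_seconds


baseline = elapsed_seconds["Cray Y-MP, 1 processor"]

# Ratio of the Cray elapsed time to each system's elapsed time.
ratios = {
    name: time_ratio(baseline, seconds)
    for name, seconds in elapsed_seconds.items()
}

# Gain from moving from 16 to 24 processors on the SP2 itself.
sp2_gain = time_ratio(
    elapsed_seconds["IBM SP2, 16 processors"],
    elapsed_seconds["IBM SP2, 24 processors"],
)

for name, seconds in elapsed_seconds.items():
    print(f"{name}: {seconds / 3600:.1f} hours, "
          f"{ratios[name]:.2f} times the Cray run's pace")
print(f"SP2, 24 versus 16 processors: {sp2_gain:.2f} times faster")
```

The last figure is worth noting: on the quoted numbers, adding half as many processors again to the SP2 cut the elapsed time by about a fifth, not a third.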
By Joanne Wallen
These results were obtained before the systems were optimized for the parallel code, and the consortium was very impressed with them. Another consortium delighted with the results of parallelization is the Paramation consortium, which parallelized the Animo animation software from UK company Cambridge Animation Systems Ltd. Animo provides paint, trace and rendering capabilities, tasks which traditionally take thousands of hours to do by hand, and which when computerized are still highly processor-intensive. Cambridge Animation used its existing network of personal computers running NeXTstep, and converted Animo to run in parallel mode using a modified version of Parallel Virtual Machine message passing, to enable the rendering software to run in parallel on a cluster of workstations. The consortium also benchmarked its software on Silicon Graphics Inc workstations running Irix, and a Digital Equipment Corp Alpha machine with four nodes. Animo ran between 2.8 and four times faster in parallel than on single processors. Cambridge Animation project consultant Peter Stansfield said that since it parallelized the software, sales have greatly increased, and the company has an order for 200 licenses from Time Warner Inc’s Warner Feature Films, and more than 100 licenses from Steven Spielberg’s DreamWorks SKG studio (CI No 2,708). He says that parallel rendering is a feature with enormous benefits to animators, bringing lead times down from months to weeks. This, in turn, he says, will enable animation studios to turn out far more topical content than has previously been possible.
It is perhaps little surprise that such industrial applications, involving heavy mathematical computation, benefit from running in parallel. The type of application involved in Europort typically lends itself to being split into discrete tasks, which have minimal need to talk to each other while processing. However, even the CAMAS-Link project ran into difficulties when attempting to calculate contact impact, that is, which parts of the car will strike which other parts during a crash. The parts of the car were split across different processors. In order to calculate which parts would strike each other, the system has to initiate a global search, which can cause load imbalances on the processors and a communications bottleneck. While parallel processing may well suit certain types of applications, it is not necessarily the right answer for everyone. In the massively parallel versus symmetric multiprocessing debate, most people seem to agree that it is another case of horses for courses.
Even the vendors of such systems admit that massively parallel working is suited to many scientific and industrial applications, but that day-to-day commercial so-called ‘mission critical’ applications such as accounting systems, order processing, financial and banking systems are not as well suited. Since most transaction-based systems run processes that need to interact with others, they do not lend themselves as easily to being split up into discrete tasks that will run in parallel on multiple processors.
People are expensive
In addition, how many commercial companies would be prepared to invest the time and money needed to parallelize their existing code? Sequent Computer Systems Inc says the answer is very few. It is cheaper to build a massively parallel machine than a symmetric multiprocessor, says Sequent’s UK product marketing manager Steve Wanless, but today hardware is relatively cheap and people are expensive. He says that few firms can justify the cost of the human resources involved in writing parallel code, or in converting existing code to run in parallel. For this reason, both Sequent and Data General Corp have developed NUMA, Non-Uniform Memory Architecture, which brings a degree of distributed memory to symmetric multiprocessing systems. NUMA, the companies claim, will address the fact that symmetric multiprocessing systems do not scale up in the way massively parallel systems do, and may well bridge the gap between the two, without any changes to existing application code.
Since the European Commission partially funded the Europort companies to convert their code, it is perhaps understandable that most companies were very pleased with the results. There seems to be little doubt that high performance parallel computing can provide significant performance benefits to computational chemistry, fluid dynamics calculations, structural mechanics applications and the like. But whether companies not funded by the European Commission will think that the benefits outweigh the costs of writing parallel code remains to be seen.