Fault-tolerant computing is not easy to implement, and the problem becomes considerably tougher when you abandon the controlled environment provided by a proprietary architecture in favour of a standard, open systems approach. So far the market has been dominated by vendors such as Tandem Computers and Stratus Computer that have implemented fault tolerance within a proprietary, closed architecture. In the 1980s, a whole string of newcomers has opened things up somewhat by combining their own architectures with varying degrees of fault tolerance on Unix-based systems – Tolerant Systems, Parallel Computers and Computer Consoles are examples, while Stratus also offers a Unix implementation. But a project under way within the Esprit programme of collaboration between European companies and researchers is designed to open up the area completely by allowing replicated general-purpose computer systems – regardless of architecture and operating system, and not in themselves fault-tolerant configurations – to be linked in an Open Systems Interconnection-conformant network to provide a fault-tolerant distributed system.
The extraordinarily ambitious project is called Delta-4, for open Dependable Distributed computer systems architecture. In addition to defining the required architecture, it will result in the development of specific configurations as demonstrations of the architecture in practice – and predictably, the project has settled on Unix as a common denominator for the demonstration systems. Dependability is a word that is coming into use in technospeak as a term that embraces more than just fault tolerance: it also covers areas such as fault avoidance, security and safety.
The Stratus Continuous Processing systems and Tandem NonStop machines are not only based on proprietary architectures but are also US-designed; Delta-4 aims to develop and market European expertise in the area, which is undoubtedly there, in the words of Dave Drackley, senior product manager for the project at Ferranti. Any project that is attempting not only to break new ground but to do so by combining the efforts of a large number of partners in different countries is up against every technical and organisational problem in the book. An indication of the progress made so far, and of the faith in the project, is that Delta-4 has just picked up UKP5m in second-phase funding from the EEC Commission following the demonstration in April of a reliable file server.
A further demonstration is due at the Esprit conference in Brussels at the end of September; this will involve Unix machines from the project’s industrial partners Bull SA and Ferranti Computer Systems Ltd, linked by an IEEE 802.5 token ring, according to Drackley. The research bodies Inesc of Portugal, IEI of Italy, LAAS (Laboratoire d’Automatique et d’Analyse des Systèmes) of France, and the Fraunhofer Institute-IITB and First-GMD of Germany are also participating in the Delta-4 project. And if all goes well, it is hoped that future Esprit funding will support the implementation of a demonstration system at the gigantic BASF factory in Ludwigshafen, West Germany.
Open Distributed Processing Group
The project involves the development of a computational model for a network where the interactions between processes active on different nodes in the network can be validated, to enable replicated systems to detect failures. The project members will be providing input to the recently established Open Distributed Processing group at the International Standards Organisation, and the project is likely to result in extensions to Open Systems Interconnection protocols to support fault tolerance in a network. The current phase of Delta-4 incorporates not only the results of the first phase but also a parallel project, Concordia, which was led by the Microelectronics Advanced Research Institute, MARI, the Newcastle-based supplier of the Newcastle Connection Unix networking software; other partners are Jeumont-Schneider of France with Telettra and the University of Bologna in Italy. Concordia focussed on passive replicated services as opposed to the active replication – systems operating in parallel and continually comparing results – of the first phase of Delta-4; the combined project is intended to support both approaches.
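The difference between the two replication styles can be illustrated with a minimal Python sketch. It is not taken from Delta-4 or Concordia: the function `active_replicated_call`, the class `PassiveReplicatedCounter` and the majority-vote rule are all hypothetical names and simplifications chosen for illustration, and real systems would run replicas on separate network nodes rather than in one process.

```python
from collections import Counter


def active_replicated_call(replicas, request):
    """Active replication (illustrative): every replica handles the
    request in parallel, and the results are compared by majority vote."""
    results = [replica(request) for replica in replicas]
    winner, votes = Counter(results).most_common(1)[0]
    if votes <= len(replicas) // 2:
        raise RuntimeError("no majority among replicas")
    # Replicas that disagree with the majority are suspected faulty.
    suspects = [i for i, r in enumerate(results) if r != winner]
    return winner, suspects


class PassiveReplicatedCounter:
    """Passive replication (illustrative): only the primary does the
    work; a standby backup holds the last checkpoint and takes over
    when the primary fails."""

    def __init__(self):
        self.primary = 0       # state held by the working primary
        self.checkpoint = 0    # last state shipped to the backup
        self.failovers = 0

    def add(self, amount):
        self.primary += amount
        self.checkpoint = self.primary  # checkpoint after each update
        return self.primary

    def fail_primary(self):
        # The backup is promoted and resumes from the last checkpoint.
        self.primary = self.checkpoint
        self.failovers += 1


if __name__ == "__main__":
    good = lambda x: x * 2
    faulty = lambda x: x * 2 + 1  # simulated faulty replica
    print(active_replicated_call([good, faulty, good], 21))
```

The trade-off the sketch hints at is that active replication pays for every request several times over but can detect a wrong answer immediately, while passive replication is cheaper in normal running but relies on checkpoints and a takeover step after a failure.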