By the end of October, Sun Microsystems Inc will begin to deliver production versions of the Full Moon clustering technologies it has been talking about for more than a year. They will enable customers to link up to four 64-processor Ultra Enterprise 10000 servers together for reliability and high availability (CI No 3,118). In addition to running its own applications, each node in the cluster – which could be any Sun server – can also keep copies of mission-critical applications running on other nodes, so that if one node fails the application can be restarted on another node. Sun says the primary aim of Full Moon is to increase application uptime (high availability), not to increase the number of processors that can be physically strung together (scaling), which it says is the primary aim of Microsoft Corp’s NT Cluster Server (Wolfpack) software. Sun claims 80% of its clustering sites use the technology for high availability and only 20% for scaling. The company’s mantra is SMP for scalability – up to 64 processors in the case of the high-end Ultra Enterprise 10000 Starfire – not ccNUMA distributed shared memory architectures, although Sun has previously told Computergram it will use Full Moon as one of the jumping-off points for its future Serengeti distributed SMP environment (CI No 3,252).
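
The failover idea described above, in which a node holds copies of applications that run elsewhere and restarts them if their home node fails, can be illustrated with a small sketch. This is not Sun's Cluster 2.0 software or any part of its interface; the `Node` class, the `fail_over` function and the node and application names are all hypothetical, chosen only to show the restart-on-a-surviving-node logic.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node; not a Sun Cluster data structure."""
    name: str
    alive: bool = True
    running: set = field(default_factory=set)   # applications active on this node
    standby: set = field(default_factory=set)   # applications it keeps copies of


def fail_over(nodes, failed_name):
    """Mark a node as failed and restart its applications on surviving nodes.

    Returns a dict mapping each restarted application to its new node.
    Applications with no standby copy on a live node are not restarted.
    """
    failed = next(n for n in nodes if n.name == failed_name)
    failed.alive = False
    moved = {}
    for app in sorted(failed.running):
        # Pick the first surviving node that keeps a copy of this application.
        target = next((n for n in nodes if n.alive and app in n.standby), None)
        if target is not None:
            target.standby.discard(app)
            target.running.add(app)
            moved[app] = target.name
    failed.running.clear()
    return moved


# Two nodes, each running one application and holding a copy of the other's.
nodes = [
    Node("node-a", running={"billing"}, standby={"orders"}),
    Node("node-b", running={"orders"}, standby={"billing"}),
]
result = fail_over(nodes, "node-a")
```

In this sketch the surviving node ends up running both applications, which matches the uptime-first goal Sun describes: the cluster keeps the application available rather than adding processing capacity.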

Finer grain

The physical interconnect, or switch, is OEMed from Dolphin Interconnect Solutions Inc and transfers data at 100Mb per second with a latency of four microseconds over the SCI Scalable Coherent Interconnect protocol. It’s called Cluster Channel and uses an SBus card slot on the server. A PCI bus version is due in a few months. Sun says its Cluster 2.0 software, which runs on top of a copy of the Solaris operating system present on each node, offers finer-grain failover than its previous offerings – now at the application and domain level – as well as failover and parallel database support in the same cluster, support for the Veritas VxVM volume manager and the dynamic addition of nodes. Sun is also offering a version of the BEA Systems Inc Tuxedo OLTP monitor customized for Full Moon. Cluster 2.0 requires Oracle 8, Oracle Parallel Server or Informix XPS shared-nothing parallel database products. It supports Ultra SCSI I/O, with support for Sun’s FC-AL fibre channel arbitrated loop network storage products due shortly. Sun also supports failover over slower Fast Ethernet and FDDI links, using first-generation Solstice HA software. Sun says that it will work with its Solaris x86 OEMs, such as NCR Corp, to make the technology available on clusters of Intel Corp servers. The Cluster 2.0 software and high-availability application agents alone cost from $4,000 per node up to $66,000 per node at the high end. The interconnect costs extra.

Cluster-aware

Sun says that over time it will integrate Full Moon technologies more fully into the operating system, à la Wolfpack. Java/browser-based cluster monitoring is due mid-1998. The key global cluster file system, which will make file services available across nodes, is due in Solaris at the end of 1998, along with global networking, global name services, Java-based cluster controls and support for eight nodes. A full single system image will follow, plus global process management, Java-based single system image management and a cluster-aware version of Solaris, by the end of 1999. In short, there’s still a long way to go. Sun claims the majority of its Solaris operating system R&D investment is going into Full Moon.