The challenge in all this is that no two Unix vendors test exactly comparable machines on the exact same suite of benchmark tests – and this is absolutely intentional on the part of vendors. They get the benefits of bragging about performance on industry standard benchmarks, but they don’t have to show how different configurations of many machines in the product line perform on a variety of tests.

If I were benevolent dictator of the world – there’s a scary thought – I would create a consumer law that says all servers have to be run through a suite of benchmark tests before they can be sold on Earth, and that list prices for all components of all servers have to be publicly available so people can do their own price/performance analysis across a wide variety of workloads. But there is no such benevolent dictator–and the malevolent pseudo-dictators of their countries have other things on their minds besides server performance–so we have to make do with the benchmark test results that the server makers do, to their credit, spend big bucks to run.

Fujitsu-Siemens was first up when it announced that a 128-way PrimePower 2500 server was able to handle 21,000 users on the Sales & Distribution (SD) test for SAP’s ERP suite, running in two-tier mode. (That means the SAP database and SAP application servers are running on the same box; in three-tier mode, the application servers are pulled off the database box, and the database server can therefore support more users.)

That PrimePower 2500 server was configured with 2.08GHz Sparc64 V processors, which have 256KB of L1 cache and 4MB of L2 cache, as well as 512GB of main memory and 452GB of disk capacity. The PrimePower 2500 handled just under 6.35 million SAP dialog steps per hour, and ran at 98% of total processing capacity with an average response time of 1.91 seconds.

The machine was running Sun Microsystems’ Solaris 9 operating system and Oracle Corp’s 9i database, both of which are one generation back. Presumably, when Fujitsu-Siemens’ tuning experts are finished with Solaris 10 and Oracle 10g, the company will be able to boost performance further with the same iron, since both Sun and Oracle are claiming that the new software has significant performance enhancements. (Then again, vendors always say this, and then sell us a more powerful computer than we should need given all of this software performance enhancement.)

Fujitsu-Siemens posted a pretty decent result on that 128-way. For comparison, in April 2003, a 128-way PrimePower 2500 with 1.3GHz Sparc64 V processors (with 2MB L2 caches) and 512GB of main memory posted a result of 13,000 SD users on the SAP two-tier benchmark test running Solaris 8 and Oracle 9i. I was guessing that with 1.89GHz processors, Fujitsu-Siemens might deliver 16,000 to 18,000 SD users; the PrimePowers did a little better than that, possibly because of the move to Solaris 9, but more likely because the larger L2 cache and the slightly faster clocks (2.08GHz is about 10% more oomph than 1.89GHz) have a big effect on big workloads like SAP software. (It is hard to make even educated guesses, which is why I want the tests to be mandatory for all servers.)
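To put some numbers behind that guessing game, here is a minimal sketch of the crudest possible estimate: scale the April 2003 result by clock speed alone. The function name and the linear-scaling assumption are mine, not anything Fujitsu-Siemens or SAP publishes, and real workloads rarely scale this cleanly.

```python
# Naive estimate: assume SD users grow linearly with clock speed.
# This is an idealization for illustration, not a vendor sizing method.

BASELINE_USERS = 13_000   # 128-way PrimePower 2500, April 2003
BASELINE_GHZ = 1.3        # Sparc64 V clock in that test
MEASURED_USERS = 21_000   # new 128-way result at 2.08GHz


def scale_by_clock(users, old_ghz, new_ghz):
    """Scale a user count by the ratio of clock speeds (hypothetical helper)."""
    return users * new_ghz / old_ghz


est_189 = scale_by_clock(BASELINE_USERS, BASELINE_GHZ, 1.89)
est_208 = scale_by_clock(BASELINE_USERS, BASELINE_GHZ, 2.08)

print(f"Clock-scaled estimate at 1.89GHz: {est_189:,.0f} users")
print(f"Clock-scaled estimate at 2.08GHz: {est_208:,.0f} users")
print(f"Measured result versus 2.08GHz estimate: {MEASURED_USERS / est_208:.3f}x")
```

Clock scaling alone lands at roughly 18,900 users for 1.89GHz and 20,800 users for 2.08GHz, so the measured 21,000 users sits slightly above the linear estimate. That is consistent with the cache and software changes contributing something, though this arithmetic cannot say how much.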

Last fall, IBM was showing off a 64-way p5 595 running its AIX V5.3 Unix and its DB2 8.2 database, which was able to process over 6 million dialog steps per hour with an average response time of 1.92 seconds. That worked out to 20,000 SD users in the two-tier test on a 64-way machine running at 97% of peak capacity. That p5 server was equipped with dual-core Power5 processors running at 1.9GHz; every two cores share a 1.92MB L2 cache and a 36MB L3 cache. Those Power5 chips support simultaneous multithreading with two threads per core, which the Sparc64 Vs do not, so both the Fujitsu-Siemens and IBM boxes were running a total of 128 hardware threads for the software on the systems to play with.

Last summer, Sun Microsystems supported 10,175 users on its top-end Enterprise 25000 server running Solaris 9 and Oracle 9i. That machine was configured with 144 UltraSparc-IV processor cores running at 1.2GHz, however. That high core count is why, in part, Sun had to partner with Fujitsu-Siemens to deliver more powerful SMP servers with fewer processors. The other reason is that Sun can’t deliver a 2GHz or faster UltraSparc-IV chip. If Sun could do that, a 144-core Sun E25K could break the 20,000 SD user barrier.

Just to remind everyone how far Windows has to go on the two-tier SD test: a 32-way Itanium server from NEC Corp, configured with 128GB of main memory, 931GB of disk capacity, and 32 of Intel’s 1.6GHz Itanium 2 processors with 9MB of cache, was able to support 5,210 SD users. That setup processed 1.572 million dialog steps per hour (dialogs are portions of SAP transactions) at an average response time of 1.93 seconds. This server was running Microsoft’s Windows Server 2003 Datacenter Edition and SQL Server 2000 database, and it ran at 94% of peak processor performance. That is a little less than twice as much performance as the 16-way Windows-Itanium servers that have been tested by Hewlett-Packard Co, Bull SA, and Unisys Corp using 1.5GHz Itanium 2 chips, and a little more than twice as much as the 16-way xSeries 445 servers using 32-bit Xeon MP processors from IBM running the same Windows stack.
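Since these boxes have wildly different core counts, one rough way to line them up is SD users per core. The sketch below uses only the user and core counts quoted above; the one assumption of mine is that each of the 32 Itanium 2 processors in the NEC machine counts as a single core. It ignores clock speed, cache, memory, and software stack, so treat it as a back-of-the-envelope normalization rather than a fair fight.

```python
# SD two-tier results quoted in this article: (system, SD users, cores).
# Assumption: each Itanium 2 processor in the NEC box is one core.
RESULTS = [
    ("Fujitsu-Siemens PrimePower 2500", 21_000, 128),
    ("IBM p5 595", 20_000, 64),
    ("Sun Enterprise 25000", 10_175, 144),
    ("NEC Itanium 2 server", 5_210, 32),
]


def users_per_core(users, cores):
    """Divide supported SD users by the number of processor cores."""
    return users / cores


per_core = {name: users_per_core(users, cores) for name, users, cores in RESULTS}

# Print the systems from most to fewest users per core.
for name, value in sorted(per_core.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<34} {value:7.1f} SD users per core")
```

On this measure the Power5 machine comes out at about 312 users per core, the Sparc64 V and Itanium 2 machines both land in the low 160s, and the UltraSparc-IV box trails at about 71. Counted per hardware thread instead, the IBM and Fujitsu-Siemens machines are nearly level, since both expose 128 threads.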

IBM came out swinging, saying that its eServer p5 boxes not only demonstrated great performance running its own DB2 database on AIX, but that they could set records running Oracle 10g, too. On the TPC-C online transaction processing benchmark test, a 32-processor (that’s 64 threads) p5 595 server using the top-end 1.9GHz Power5 dual-core chips could process a little more than 1.6 million transactions per minute (TPM). That is almost exactly linear scaling up from the 16-way p5 570 machine that IBM tested last September running the same AIX 5L V5.3 operating system and Oracle 10g database on the TPC-C test.

The latest 32-way p5 595 had 32 1.9GHz Power5 chips, 16 of the 36MB L3 cache units for the servers, and 512GB of main memory. The machine was configured with SATA disk controllers and drives (which reduced the cost a lot on 112.9TB of storage), and cost $15.2m at list price (including hardware, software, and three years of support) and $8.4m after a 45% discount. Thus, the 32-way p5 595 box running Oracle yielded a price/performance of $5.27 per TPM.
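The price/performance arithmetic is simple enough to check at home. Here is a minimal sketch using the rounded figures quoted above; the throughput of 1.6 million TPM is my stand-in for "a little more than 1.6 million," since the exact number is not given here, so the output only lands near the published $5.27 per TPM rather than on it.

```python
# Rounded inputs from this article. TPM is an approximation (assumption).
LIST_PRICE = 15_200_000   # dollars: hardware, software, three years of support
DISCOUNT = 0.45           # 45% discount off list
TPM = 1_600_000           # stand-in for "a little more than 1.6 million"


def price_performance(list_price, discount, tpm):
    """Return (discounted price, dollars per TPM) for a tested system."""
    net_price = list_price * (1 - discount)
    return net_price, net_price / tpm


net, dollars_per_tpm = price_performance(LIST_PRICE, DISCOUNT, TPM)
print(f"Discounted price: ${net:,.0f}")
print(f"Price/performance: ${dollars_per_tpm:.2f} per TPM")
```

With these rounded inputs the discounted price works out to about $8.36m and the price/performance to a bit over $5.20 per TPM. The gap to the published figure comes from rounding in the numbers quoted here, which is one more reason to want the full list prices in the open.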

The undisputed record holder on the TPC-C test is the 64-way p5 595 using 1.9GHz Power5 cores, which is able to process 3,210,541 TPM, more than three times the work that the prior-generation 32-way Regatta pSeries 690 could handle running earlier versions of the AIX operating system and the Oracle and DB2 databases. The entire high-end p5 server line (p5 570 and p5 595) delivers nearly linear scalability at a little more than $5 per TPM after all the discounting shenanigans are done, including 4, 8, 16, 32, and 64 processor machines running alternately DB2 and Oracle databases. (IBM doesn’t want to encourage too much direct comparison, after all.)

HP can do a little more than 1 million TPM on its Itanium-based Integrity servers running either HP-UX or Linux, and it has shied away from top-end SAP SD comparisons in the two-tier test. Fujitsu-Siemens is also a little shy about TPC-C, and its most recent big iron test is on a 64-way PrimePower 2500 that could do just under 600,000 TPM using 1.3GHz Sparc64 V processors and running Solaris 8 and an early edition of Oracle 10g. This machine was tested in April 2004. And if the PrimePower box could scale to 128-way well on the TPC-C test, you can bet Fujitsu would have put that configuration in the field.

Using the faster 2.08GHz Sparc64 V chips, a 64-way box should do more than 1 million TPM on the test, and a 128-way could do as much as 65% more work than that if the memory and I/O can scale well on the TPC-C workload on the PrimePower machines. That puts Fujitsu-Siemens in the ballpark of around 1.7 million to 1.8 million TPM with 128 threads, which is about half of what IBM is showing with 128 threads on the p5 595.
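For those who want to follow the guesswork, here is a hedged sketch of that extrapolation. The 600,000 TPM baseline is rounded up from "just under 600,000," the 1 million TPM figure for a 64-way is my round-number floor for "more than 1 million," and the 65% scaling factor is the best case mentioned above. None of these are measured results for the faster chips.

```python
# Extrapolating PrimePower TPC-C throughput. All estimates are guesses.
MEASURED_64WAY_TPM = 600_000   # "just under 600,000" at 1.3GHz, rounded up
IBM_64WAY_TPM = 3_210_541      # p5 595 record with 128 threads

# Step 1: scale the 1.3GHz result by clock speed alone.
clock_scaled_64way = MEASURED_64WAY_TPM * 2.08 / 1.3

# Step 2: take 1 million TPM as a floor for the 64-way (assumption),
# then apply the best-case 65% gain from doubling to 128-way.
ASSUMED_64WAY_TPM = 1_000_000
est_128way = ASSUMED_64WAY_TPM * 1.65

print(f"64-way, clock scaling only: {clock_scaled_64way:,.0f} TPM")
print(f"128-way, best-case floor:   {est_128way:,.0f} TPM")

# Step 3: compare the 1.7 million to 1.8 million ballpark with IBM's record.
for guess in (1_700_000, 1_800_000):
    print(f"{guess:,} TPM is {guess / IBM_64WAY_TPM:.0%} of the p5 595 result")
```

Clock scaling alone gets a 64-way to roughly 960,000 TPM, so crossing 1 million leans on the bigger L2 cache and newer software doing some of the work. Either end of the 128-way ballpark comes out a bit over half of IBM's number.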

On a final note, IBM is talking up the idea that DB2 and the p5 server line have won the IBM-designated triple crown in benchmarking, making IBM the first vendor to top the TPC-C and SAP SD three-tier online benchmarks and the TPC-H data warehousing benchmark all at the same time. IBM says that this has never happened before, and the company is probably right, but it would take a very long time and a lot of sifting through mountains of old benchmark data to prove it. Moreover, while IBM is boasting that a 32-way Power5 server can support 50.9 million dialog steps per hour on the SD three-tier test (running AIX V5.3 and Oracle 10g), if you make the two-tier test part of the triple crown, IBM only won it for a while, since Fujitsu-Siemens just snuck by it on that test.

There ought to be a rule–any vendor that does a three-tier SD test has to do a two-tier test on the same core iron. But that would probably make far too much sense, now wouldn’t it?