When relational database management systems started to encroach on the commercial database space in the mid-1980s, which was then dominated by players like Cullinet, critics were quick to point out that one of the flaws in the new technology was performance. While allowing the relational revisionists their claimed advantage in flexibility and friendliness to user-framed enquiries over the earlier network or hierarchical approaches – mainly due to front-end query languages like Berkeley's Quel or IBM's SQL – these observers pointed out that relational databases would be poor choices for real heavy-duty OLTP (on-line transaction processing) applications. Those were in fact the ones that customers wanted, rather than the neat, but ultimately non-mission-critical, decision support systems it looked as though relational was really being herded into.

The main problem: architecturally, the very act of splitting data into tuples, or rows and columns, carries a cost, meaning that under the hood a relational system is simply not as economically implementable as other kinds of databases (of which there are many other options, incidentally). This was pointed out by Cullinet, and later by the short-lived pure-play object database crowd, to little or no avail. The relational people's hour had come, and they stormed the Winter Palace of the database world, carrying all before them. But there was still this nagging performance problem, which hadn't gone away (how could it, since it was so endemic?). It was partly solved both by huge development efforts in software (particularly cutting down on I/O bottlenecks and building better query optimization engines) and by related improvements in the hardware itself. This reporter remembers in particular hearing that with commercial MPP (massively parallel processing) machines in the early 1990s the true platform for relational was finally available.
By Gary Flood
Now we have the curious twist of yet another true platform for relational, with last week's official launch of Mountain View, California-based TimesTen Performance Software Inc's TimesTen Main-Memory Data Manager (CI No 3,340). The latest is that, yup, those nay-sayers were right all along, and relational is 'bad' for data-sensitive applications – according to TimesTen, anyway – and the best answer really is a combination of software and hardware jiggery-pokery. Only this time the hardware part comes through ultra-cheap memory, allowing larger database images to be physically resident at any one time, a situation that will only get 'better' as more commercial 64-bit Unix systems come on stream this year and beyond.

The software part sounds a bit like the principle behind RISC computing: cut it down to the bone and it will go faster. TimesTen says it can offer magnitudes greater performance through short cuts it can afford to make and merchant databases can't. Much of the work that is done by an rdbms is done under the assumption that data is primarily disk-resident. Optimization algorithms, buffer pool management, and indexed retrieval techniques are all weighted towards this fundamental assumption, says the company on its rather impressive launch web site. TimesTen, on the other hand, knows the data resides in main memory and can therefore take more direct routes to it, reducing code-path length and simplifying both algorithm and structure. For example, query optimization algorithms are different for disk-based than for main memory-based systems, [assuming] that data is primarily resident on disk or cached… If that is the assumption taken, then a disk-based optimizer will not produce the optimal plan for data that is primarily resident – or fully resident – in main memory. Since merchant databases always have to make these kinds of worst-case assumptions, TimesTen argues, its offering has to be faster.

These special considerations are claimed to cut by a factor of ten the number of instructions required to execute a transaction, while the product is also said to offer useful (read: required) database features such as row-level locking, support for multi-threaded applications, T-Trees (a variant of B-Trees) and hash indexes, large object fields, group commits, a cost-based optimizer, automatic recovery, and full support for the database ACID properties (which, as we all know, class, stands for atomicity, consistency, isolation, and durability). Plus, the memory footprint of the TimesTen product itself is less than 3 megabytes, thanks to the compact way the product implements these algorithms.
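To make the index point a little more concrete: a T-Tree holds many keys per node, like a B-Tree, but strings the nodes together as a binary tree of in-memory pointers rather than disk pages. What follows is a minimal lookup sketch in C, purely for illustration – the node layout, key type, node size and function names are our own assumptions, not anything TimesTen has published.

/* Minimal T-Tree lookup sketch (illustrative only, not TimesTen's code).
 * Each node holds a sorted array of keys plus direct pointers to rows in
 * main memory; nodes form a binary tree, so a search first finds the one
 * node whose [min, max] range could contain the key, then binary-searches
 * inside that node -- no disk pages or buffer pool involved. */
#include <stddef.h>

#define TT_MAX_KEYS 8            /* keys per node: an arbitrary choice */

typedef struct ttree_node {
    int   keys[TT_MAX_KEYS];     /* sorted keys held in this node     */
    void *rows[TT_MAX_KEYS];     /* pointers to in-memory rows        */
    int   nkeys;                 /* how many slots are in use         */
    struct ttree_node *left;     /* subtree with keys below keys[0]   */
    struct ttree_node *right;    /* subtree with keys above the max   */
} ttree_node;

/* Return the row pointer for key, or NULL if it is not indexed. */
void *ttree_lookup(const ttree_node *n, int key)
{
    while (n != NULL) {
        if (n->nkeys == 0)
            return NULL;
        if (key < n->keys[0]) {                    /* below this node's range */
            n = n->left;
        } else if (key > n->keys[n->nkeys - 1]) {  /* above this node's range */
            n = n->right;
        } else {
            /* key falls within this node's range: binary search inside it */
            int lo = 0, hi = n->nkeys - 1;
            while (lo <= hi) {
                int mid = (lo + hi) / 2;
                if (n->keys[mid] == key)
                    return n->rows[mid];
                if (n->keys[mid] < key)
                    lo = mid + 1;
                else
                    hi = mid - 1;
            }
            return NULL;                           /* in range, not present */
        }
    }
    return NULL;
}

Every hop here is a pointer dereference rather than a page fetch through a buffer pool, which is essentially the code-path saving the company is describing.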
Database design street cred
And for what it's worth – which, though it may sound cynical, we think is not too much – the company makes great play of the fact that on the industry-standard (and standardly controversial, as all benchmarks really are) Wisconsin database benchmark, TimesTen 2.0 completed performance tests up to 22 times faster than a traditional rdbms, even though this well-known product had its entire database cached in memory – the best possible scenario for maximum performance, as TimesTen itself notes. Not that we're arguing – these guys have a lot of database design street cred, after all. The founders have more degrees between them than a thermometer, the whole project is a spin-off from HP Labs, which tends not to employ idiots, and there is theoretical work behind this from AT&T Labs, two IBM research centers, Princeton and Stanford.

And then there's the fact that, surprising as it may seem, main memory is a standard solution anyway, and this product itself is basically a rewrite of an already available commercial product from HP, OpenCall IM. The OpenCall IM telecommunications platform was developed at HP's Grenoble, France campus, under the aegis of a project called Smallbase (which is why the TimesTen product is called version 2.0, as a nod to its Gallic predecessor). TimesTen added row-level locking and the SQL interface, as well as majorly tinkering with the thing, it seems. This telecoms product is also the prototype for one of the three customer bases TimesTen sees as appropriate to sell to. These are telecoms, for so-called intelligent networking, or the quick routing of customer information through several database records; financial services (doing the right work on a transaction before the price changes); and what its VP of marketing Tim Shetler, before June 1997 vice president of product management at Informix Software Inc, calls enterprise ISVs, like one of its beta customers, Prism Solutions Inc, which will use TimesTen to speed up loading data transformed with its product into a data warehouse.

It turns out that the first two communities are already incredibly familiar with main-memory database systems – it's just that in the past they've been hard-wired, 100% bespoke, and without a query language (back to one of the first stated advantages of the rdbms!). This should come as no surprise – we've always known memory is better. Memory is 100,000 times faster than searching for data on magnetic disk. It always has been. What's new is that memory is now inexpensive and plentiful and widespread use of 64-bit systems is on the horizon. We are within two years of a fundamental change in the composition of computer systems, which will demand a change in the composition of data management architectures, said Shetler on joining last year. What TimesTen is really bringing to the table, under all the talk of magnitudes better performance and real-time relational, is the chance to get this hitherto boutique software into a more commercially viable form. And quickly, from TimesTen's perspective – before the big guys spot that, with trailers for cheap memory and 64-bit computing now previewing, hardware has once again stepped in to help relational out. HP and Sun are very likely to be offering systems in a couple of years that will have a tremendous amount of physical memory and the additional option of very cheap extra memory. We'll have our market share built up by the time these [merchant database] guys arrive to try and take advantage of that, says Shetler.
64-bit malarkey
A lot, it turns out, is riding on this emergent marketplace for TimesTen – all this 64-bit malarkey. For at the moment, the amount of data that can practically be stored entirely within main memory using TimesTen is only between 1.5 and 2.5Gbytes, the company says, before confidently predicting that when 64-bit versions of major operating systems are released commercially (in the middle to latter half of 1998) this limit will be eliminated and data stores of any real-world size will be achievable. There is even the limit that on certain Unix hardware a 32-bit address space allows any application no more than 2Gbytes anyway. Furthermore, due to issues surrounding data store resizing and shared memory management… our current recommendation is that a TimesTen data store not exceed 1Gbyte in size. In other words, don't try to build a data warehouse using the product; and though this size covers the majority of applications, one wonders if it might not be a constraint for some of its more ambitious potential customers.

TimesTen does have a lot of market potential, though its apparent uniqueness turns out to be less impressive on closer inspection. But Shetler sees this as no issue – if we were the only company doing this we'd have a problem, since by definition a market that consists of only one company is not a market. What strikes us most about its well-managed debut, at the end of the day, is not the idea of whizzy 64-bit-driven databases – since by definition (Moore's Law?) hardware always gets bigger and faster – but the fact that, despite all the hype and PR and controversy, relational's critics in retrospect seem to have had a lot more strength to their arguments than they were given credit for at the time. What, we wonder, will be the ultimate true platform for this technology – a $1 palm-sized supercomputer?
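As for where that 2Gbyte ceiling comes from: a 32-bit pointer can distinguish at most 2 to the power of 32 bytes, and the Unix systems of the day typically keep roughly half of that range for the operating system, leaving around 2Gbytes for the application and its data store. The short C sketch below is purely illustrative of that arithmetic – the half-for-the-kernel split is a typical figure for the period, not anything TimesTen has specified.

/* Illustrative only: how pointer width bounds an in-memory data store.
 * A 32-bit process can address at most 2^32 bytes, and in practice the
 * operating system reserves a chunk of that for itself, which is roughly
 * where the ~2Gbyte application ceiling mentioned above comes from.
 * A 64-bit process raises the theoretical ceiling to 2^64 bytes, so
 * physical memory (and budget), not address space, becomes the limit. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t addr_32 = (uint64_t)1 << 32;   /* bytes addressable with 32-bit pointers */
    uint64_t user_32 = addr_32 / 2;         /* typical user-space share: about half   */

    printf("32-bit address space : %llu bytes (%.1f Gbytes)\n",
           (unsigned long long)addr_32, addr_32 / (1024.0 * 1024 * 1024));
    printf("typical user portion : %llu bytes (%.1f Gbytes)\n",
           (unsigned long long)user_32, user_32 / (1024.0 * 1024 * 1024));
    printf("64-bit address space : 2^64 bytes (about 16 exabytes)\n");
    return 0;
}

On that arithmetic, the 1.5 to 2.5Gbyte figures TimesTen quotes look like a direct consequence of the 32-bit platforms it currently runs on – which is exactly why the 64-bit operating system releases matter so much to its pitch.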