When it comes to databases in the 1980s, it’s a relational world, even in the top-end territory dominated by Big Mother. That is the popular view – until you listen to any of the devoted disciples of Ted Codd and Chris Date, whereupon the picture dissolves and everything suddenly seems a great deal more complicated. Peter White reports.
Codd and Date have become almost synonymous with relational database technology, so when someone is brave enough to claim to be a co-worker from the same database camp within IBM as the two gurus, and can back it up, you listen to what she says. Shaku Atre is the latest such consultant to emerge under this banner, and she backs up her credentials with some hard talking, sprinkled with quips like ‘When IBM says SF it stands for Science Fiction, not systems facility’. The two most important areas of criticism she focussed on when in London the other day related, naturally, to IBM’s DB2 top-end relational offering, and came with the confidence of someone who has spent time working on what IBM has in the pipeline. She clearly understands that IBM has a long way to go, but she has seen the work first hand, and has an idea of when it could be with users at the earliest.
All relational systems should have referential integrity, which means that where two pieces of data are related, such as one being dependent on the other, then if you delete one from the system, the other should vanish automatically. For instance, if a customer number has a specific customer name associated with it, then you shouldn’t be able to delete just one of them. DB2 doesn’t have referential integrity, and, she glibly points out, neither do the products from any of the other vendors. Which means you are asking for trouble if you don’t monitor updates very carefully, because databases will become corrupted all the time. ‘But we’re used to that’, say all the VSAM programmers, ‘it’s no problem. It just means there’s no point having a relational database’.
Referential integrity should clearly be high on IBM’s agenda, as should the Outer Join, which is the ability, when two tables are joined on a field of related data, to keep the rows of one table that have no match in the other. She wants the user to have the ability to override the optimiser on DB2, which sorts out in what order any particular table is going to be searched. ‘It’s general-purpose and sometimes the people who put the data there know better how to get at it so it should be optional’, she insists. And she expects a directory which can store logical views to be added, ‘but then again maybe that’s what DBRAD, announced last month, is: it wasn’t called that when I worked there’. That indeed is what DBRAD appears to be – see report in CI No 687. DB2 users will also be wanting an on-line performance monitor – at present there is only a batch monitor.
All of these facilities could be with the DB2 user soon, if IBM decides that its software is mature enough and the time is right. Half of these could be with us in one year, the rest in two. The shock suggestion she made was that R*, IBM’s distributed relational database, could also be out inside two years.
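For readers who want to see the two features Ms Atre puts at the top of her list in concrete terms, here is a minimal sketch of referential integrity and an outer join. It makes no claim about DB2 or about IBM's plans: it assumes a present-day SQLite engine driven from Python's standard sqlite3 module purely as a stand-in, and the customer and orders tables, their columns and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in; this is not DB2.
conn = sqlite3.connect(":memory:")
# SQLite leaves foreign-key enforcement off unless it is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: every order must point at an existing customer,
# and deleting a customer removes that customer's orders as well.
conn.executescript("""
    CREATE TABLE customer (
        customer_no INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_no    INTEGER PRIMARY KEY,
        customer_no INTEGER NOT NULL
            REFERENCES customer(customer_no) ON DELETE CASCADE
    );
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Bolt');
    INSERT INTO orders VALUES (100, 1);
""")

# Referential integrity, part one: an order for a customer that does
# not exist is refused by the database itself.
try:
    conn.execute("INSERT INTO orders VALUES (101, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Outer join: every customer is listed, including the one with no
# orders, whose order number comes back as NULL (None in Python).
rows = conn.execute("""
    SELECT c.name, o.order_no
    FROM customer AS c
    LEFT OUTER JOIN orders AS o ON o.customer_no = c.customer_no
    ORDER BY c.customer_no
""").fetchall()

# Referential integrity, part two: deleting the customer makes the
# dependent order vanish automatically.
conn.execute("DELETE FROM customer WHERE customer_no = 1")
remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

print(rejected, rows, remaining)
```

The cascade rule is only one possible policy. Declaring the reference without ON DELETE CASCADE would instead make the engine refuse to delete a customer who still has orders, which is closer to the point that you shouldn’t be able to delete just one of a related pair.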
An R* release inside two years would suggest that IBM is putting a lot more behind the project than it has previously. Companies are now rushing to commit to introducing distributed databases, but until the past year the problems created by scattering bits of a database all over the place on machines sitting in widely dispersed locations were regarded as well-nigh insuperable, so much so that the game was considered not to be worth the candle. Distributed relational databases are still acknowledged to be extremely tricky, and Ms Atre carries a list of 30 key features that none of the products on the market have, all of which they need before the products can be said to be really helpful. A couple of key points are critical. When one part of a distributed relational database sends data to another part, it should first ask if the other node is ready to receive it, then send it, then ask if it arrived safely. This is called a two ph