With the data warehousing market forecast to be worth in the order of $9,000m by 1997, every vendor and integrator worth its salt is clamouring for a few crumbs. And not just vendors, but countless analysts, market researchers and consulting outfits too. Following its recent investigation into so-called shared-nothing architectures, UK research company ButlerBloor Ltd has filled a new tome called Data Warehousing: Strategies & Technologies with its thoughts on the matter. The report concludes that despite the hype and the relative immaturity of the technology and the market, data warehousing is the most significant trend to emerge in information technology in a decade. Moreover, information technology, it believes, has come of age. While traditional approaches have served the administrative and production side of business well, data warehousing promises not just efficiency, but a flexibility and responsiveness geared towards gaining business advantage rather than simply automating tasks. The report takes to task some of the misconceptions about data warehousing, outlines a general approach to implementing data warehouse strategies and considers some of the technology currently available. It attributes the hype surrounding data warehousing in large part to the established database vendors, which have been re-invigorated by its advent after an apparent loss of influence in recent years to the tools community. Data warehousing, of course, brings databases back to centre stage. As a result, vendors have rushed headlong to embrace it.

Off-the-peg

Oracle Corp, IBM Corp, Software AG, Computer Associates International Inc, Informix Software Inc and Sybase Inc are expected to lead the market next year, and each has rewritten or added to its code base to improve its warehousing standing. They have day-one data warehousers such as Prism Solutions Inc, VMark Software Inc and Red Brick Systems Inc waiting to pounce, though ButlerBloor believes these companies will probably capture only small market shares, being small fish in a big pond. And as befits the ecology of such an environment, the smaller are likely to be swallowed by the bigger. Probably only IBM is able to provide a total system from its own product set, the report notes: most others require partnerships for best-of-breed technologies. Many have also allied with niche suppliers to develop pre-packaged offerings. The one-stop offerings now available are capable, although the report recommends mix’n’match approaches that combine warehouse management software, an optimised parallel database and data access tools plus parallel hardware. In this light, data warehousers need not concern themselves with trying to drum up a total system, it concludes. Many so-called total system providers have in any case to call on third parties for help with replication, data bridging and other specialised techniques. It does, however, commend off-the-peg providers for their high level of integration and consulting skills. For those looking to get warehoused, it suggests: think big, start small. Technologies evaluated include decision support tools, middleware and the database. Decision support tools are proliferating but are relatively immature, it advises. Newcomers such as Data Analyser from Harvard, Massachusetts-based Attar Software Inc and the data mining tool KnowledgeSeeker from Toronto, Canada-based Angoss Software Inc get a nod, though Cognos Inc’s PowerPlay software gets top marks. Software AG’s Esperant comes a close second. Esperant’s support of heterogeneous joins is cheered, along with its ease of use (no need for knowledge of SQL or database structure), while its security and administration are weak. MicroStrategy Inc’s DSS Agent and Oracle’s Discoverer/2000 come bottom of the pile. Discoverer/2000 has key drill-down and graphical reporting functionality but little feature support.

By Ray Hegarty

Others considered were Planning Sciences Corp’s Gentium, Informix New Era, Sybase PowerBuilder, Business Objects and IBM Visualizer. Connecting data sources is best done by middleware. In general, Information Builders Inc’s EDA/SQL gets ButlerBloor’s vote, although the report omitted it from the product ranking, which includes IBM CICS and MQSeries, Sybase Enterprise Connect, Software AG Entire, HyperStar from VMark Software, MitemView from Mitem Corp, OpenLink Software ODBC Drivers, PeerLogic Pipes, Sequelink from TechGnosis, Top End from AT&T Global Information Solutions and Novell Inc Tuxedo. Of those ranked, Enterprise Connect from Sybase came top of the class overall, with strong third party support plus accommodation of Open Database Connectivity, DB-Library, Distributed Relational Database Architecture and all types of relational and flat file database topologies. Its downside, however, is its lack of support for a global directory service, although Sybase said it had plans to provide support in future releases. The report gave Tuxedo runner-up status for its high performance, strong development tool support, standards observance, strong market share and wide distribution and vendor alliances, but expressed reservations about its limited support outside Unix. MQSeries and CICS were considered only average, while bottom of the product ranking were Praxis International Inc’s Omnireplicator and VMark’s HyperStar, which was criticised for the limited number of databases supported – Oracle, Informix, Sybase, Gupta, uniVerse and Ingres – for client support limited to Windows and Unix, and for needing greater marketing exposure in the face of increasing competition. The new breed of databases, such as Red Brick’s Warehouse VPT, combines high performance, query optimisation and multiple dimensions, and is, ButlerBloor says, particularly suited to data warehouse applications.
The report evaluates IBM’s DB2 family, Computer Associates’ CA-OpenIngres/Replicator, Informix DSA, Sysdeco Mimer, Oracle7 release 7.2, Red Brick VPT and Velocis from Raima Software. Evaluation of architecture, data types, data structures, indexing, performance tuning, concurrency control, distributed database, management and interoperability left DB2 and Oracle on top of the report’s list, followed closely by Sybase. ButlerBloor says IBM’s often confusing and conflicting marketing message – providing varying images for the same product family – handicaps DB2. It does not believe a unique instance of the database for each system best serves customer requirements. Oracle7 supports the broadest range of applications, including data warehousing, distributed database and transaction processing. Oracle offers improved ad hoc query optimisation and stronger performance for transaction processing applications, the report concludes. Less impressive was its limited support for complex data types and third party databases.

Bridging data

Bottom of the pile came Essbase and Red Brick VPT. Databases for the warehouse such as AT&T’s Teradata have delivered very high capacity parallel database technologies, where database management systems such as Oracle and Sybase run in parallel configurations. Other high capacity databases such as OmniWarehouse from Praxis International are also up to the job. Specialist hardware suppliers such as White Cross Systems Ltd of Bracknell, Berkshire, offer a highly cost-efficient route to data warehousing, the report concludes. Bridging data from one location to another, even correcting and validating it en route, is dominated by three companies: Prism Solutions Inc of Sunnyvale, California, with Prism Warehouse Manager, Carleton Corp with Passport, and ETI Corp with Extract. It is in the process of updating the warehouse that most of the operational problems occur, and as a result nearly every data warehouse offering incorporates at least one of these three vendors’ systems. These processes include data retrieval; consolidation, where various data types are merged into a master set; scrubbing, where data is cleaned by removing inaccuracies; summarising, in order to obtain a reasonable response time from any query; and the updating of the repository so that metadata stays current and consistent. The report concludes that it’s safest to buy a recognised offering from a single vendor for non-technical applications where there are few complicated relationships to manage. Integration is likely to be smooth, and the overall price negotiable. Where the technical requirement is greater, the report advises users to spend some time watching the market and choose the products that provide the best fit for the organisation. If the resource is available, go for ‘best of breed’ and reap the benefits.
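
The warehouse update steps named above (retrieval, consolidation, scrubbing, summarising and the metadata update) can be pictured as a simple pipeline. The Python sketch below is purely illustrative and does not represent how Prism Warehouse Manager, Passport or Extract work. The two source systems, their field names, the sample rows and every function name are assumptions invented for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical operational extracts; layouts and values are invented for illustration.
ORDERS_SYSTEM = [
    {"cust": "A1", "region": "north", "amount": "120.50"},
    {"cust": "B2", "region": "South ", "amount": "80"},
    {"cust": "C3", "region": "north", "amount": "n/a"},  # unusable amount
]
BILLING_SYSTEM = [
    {"customer_id": "D4", "area": "SOUTH", "value": 200.0},
]


def retrieve():
    """Retrieval: pull the raw rows from each source system."""
    return ORDERS_SYSTEM, BILLING_SYSTEM


def consolidate(orders, billing):
    """Consolidation: map differing source layouts onto one master layout."""
    master = [
        {"customer": r["cust"], "region": r["region"], "amount": r["amount"]}
        for r in orders
    ]
    master += [
        {"customer": r["customer_id"], "region": r["area"], "amount": r["value"]}
        for r in billing
    ]
    return master


def scrub(rows):
    """Scrubbing: normalise region names and drop rows with unusable amounts."""
    clean = []
    for r in rows:
        try:
            amount = float(r["amount"])
        except (TypeError, ValueError):
            continue  # reject the row rather than load an inaccurate value
        clean.append(
            {
                "customer": r["customer"],
                "region": r["region"].strip().lower(),
                "amount": amount,
            }
        )
    return clean


def summarise(rows):
    """Summarising: pre-aggregate by region so queries need not scan detail rows."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["region"]] += r["amount"]
    return dict(totals)


def update_metadata(repository, rows_in, rows_loaded, load_date):
    """Metadata update: record what was loaded and when."""
    repository["last_load"] = {
        "date": load_date.isoformat(),
        "rows_in": rows_in,
        "rows_loaded": rows_loaded,
        "rows_rejected": rows_in - rows_loaded,
    }
    return repository


def refresh_warehouse(load_date):
    """Run the five steps in order and return the summary table and metadata."""
    orders, billing = retrieve()
    master = consolidate(orders, billing)
    clean = scrub(master)
    summary = summarise(clean)
    metadata = update_metadata({}, len(master), len(clean), load_date)
    return summary, metadata
```

Calling `refresh_warehouse(date(1996, 1, 31))` returns the regional totals together with a small metadata record of rows read, loaded and rejected. In a real warehouse each stage would be far more involved, which is the report's point about where the operational problems arise.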