Over the last 20 years, data manipulation tools have become more and more advanced, but this aid to people’s analytical capabilities has not been matched by companies throwing open their data to employee scrutiny. Now, however, more and more companies are considering making their information accessible and meaningful to a select group of staff who could use the information to give their company a competitive edge over rivals. And the medium many are investigating to facilitate this is a data warehouse. The concept is fairly simple: a data warehouse is a subject-oriented database containing historical data that has been modelled and fine-tuned for analysis and decision making. It is separate and distinct from the operational database and is capable of storing data from disparate sources. At least that’s the classical (if something that’s only a decade old can be called classical) definition, but as more and more people jump on the bandwagon, so the definition is mutating to suit each company’s particular spin.
Blur
AT&T Corp, which gained the founding fathers of commercial data warehousing when its subsidiary NCR Corp bought Teradata Corp in 1992, has begun to blur the distinction between an operational database and a warehouse database with its Enterprise Information Factory. Oracle Corp, which has been attempting to dominate the data warehouse market since it failed to buy Lotus Development Corp and had to abandon its ambition to go up against Microsoft Corp for dominance of the desktop, is bending the definition to suit its purposes. It has even gone to the lengths of buying Information Resources Inc for $100m to get its hands on high-end data analysis tools and, with not a hint of modesty, the company reckons that now it’s in the data warehouse business, warehouses are here for real. But it has some fairly unusual ideas: Oracle wants to extend access to the warehouse to all levels of users. This flies in the face of the idea that the warehouse is the company’s propeller heads’ tool. Even more controversially, Oracle sees nothing wrong in allowing this multitude of end users to amend the data in the warehouse, potentially giving rise to a cacophony of unintelligible information. This amend-as-you-read philosophy is anathema to the traditional view that the records in the warehouse are non-volatile. When data is transferred into a warehouse from a legacy system, operational database or external source, it’s stripped of transactional data, unified if it comes from different sources, and may be annotated to make more sense or to provide pertinent information to the end user. Once that has been done, that is it: any changes that might need to be made have to be done by a data administrator so that there is an audit trail. But whatever definition one has for the data warehouse, its use is the same. Is such a concept right for everyone?
By Maya Anaokar
The early innovators of data warehousing all shared similar characteristics. They were in highly competitive fields; all had high transaction counts, even if each transaction was often not worth very much; and in every business, costs had reached a base level under which it would be suicidal to go, and so services to the customer had to be improved in order to attract business. This resulted in telecommunications firms using warehouses to offer cost segmentation; retailers introducing customer loyalty schemes; banks using them to gain a complete picture of their customers, rather than viewing them as a mortgage payer or a credit card holder; and insurers working on risk assessment. They were also the types of company that could afford to build a warehouse, which is a bespoke system involving products from numerous companies, and is expensive. Although prices have dropped and there are myriad products with which to build a data warehouse, it’s still a complicated and expensive procedure and is unlikely to become an off-the-shelf offering any time soon. The movement of a company’s historical data, its unification and its transformation can be a long and arduous physical process. Although a departmental database could be established by a few Cobol programmers writing some simple scripts to move the stuff over from legacy systems, a large warehouse, being created from numerous sources, would take programmers months to establish. That’s why companies that provide tools that automatically transfer, transform and scrub data – like Evolutionary Technologies Inc, Carleton Inc and Prism Solutions Inc – have all been sought after as partners by database vendors wanting to tell potential customers they have best-of-breed packages with which to build warehouses. And like most things associated with warehousing, these tools do not come cheap: Evolutionary reckons an average sale is $250,000.
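To give a flavour of what transferring, transforming and scrubbing data involves, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the two source layouts, the field names, and the `from_billing`, `from_cards` and `load_warehouse` helpers. None of it is taken from the vendors' products named above, which deal with far messier sources on a far larger scale.

```python
from datetime import date

# Hypothetical extracts from two operational systems with different layouts.
BILLING_ROWS = [
    {"CUST_NO": "0017", "AMT": "12.50", "TXN_ID": "B-901", "DT": "1995-03-01"},
    {"CUST_NO": "0017", "AMT": "7.25", "TXN_ID": "B-902", "DT": "1995-03-09"},
    {"CUST_NO": "0042", "AMT": "bad", "TXN_ID": "B-903", "DT": "1995-03-10"},
]
CARD_ROWS = [
    {"customer": 17, "spend_pence": 4000, "auth_code": "X1", "when": "1995-03-15"},
]


def from_billing(row):
    """Map a billing record onto the common layout; TXN_ID is dropped."""
    return {
        "customer_id": int(row["CUST_NO"]),
        "amount": float(row["AMT"]),  # raises ValueError on dirty data
        "day": date.fromisoformat(row["DT"]),
        "source": "billing",
    }


def from_cards(row):
    """Map a card record onto the common layout; auth_code is dropped."""
    return {
        "customer_id": int(row["customer"]),
        "amount": row["spend_pence"] / 100,  # unify units: pence to pounds
        "day": date.fromisoformat(row["when"]),
        "source": "cards",
    }


def load_warehouse(billing_rows, card_rows):
    """Return (warehouse, rejected).

    warehouse maps (customer_id, "YYYY-MM") to a monthly summary;
    rejected holds source rows that failed scrubbing.
    """
    warehouse = {}
    rejected = []
    for mapper, rows in ((from_billing, billing_rows), (from_cards, card_rows)):
        for row in rows:
            try:
                record = mapper(row)
            except (KeyError, ValueError):
                # Scrub step: set dirty rows aside rather than loading them.
                rejected.append(row)
                continue
            key = (record["customer_id"], record["day"].strftime("%Y-%m"))
            summary = warehouse.setdefault(key, {"total": 0.0, "sources": set()})
            summary["total"] += record["amount"]
            # Annotation: note which systems contributed to the figure.
            summary["sources"].add(record["source"])
    return warehouse, rejected


warehouse, rejected = load_warehouse(BILLING_ROWS, CARD_ROWS)
```

In this sketch the customer who appears in both systems ends up with a single monthly figure drawn from both, while the billing row with an unreadable amount is set aside for an administrator to look at instead of being loaded. A real warehouse load has to do the same for every source, every field and every quirk of the legacy data, which is where the months of programmer time go.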
Who drives?
If a company decides it’s worth the cost and effort, the next hurdle usually turns out to be office politics. Who owns the data that’s going to populate the warehouse; what’s to be done if the data from the operational database is wrong; who will have access to the warehouse; who will be the developer and who will administer it? This, more than the disparate nature of the hardware and software involved, is likely to keep warehouses bespoke products rather than off-the-shelf kit. The consensus about who drives warehouse development within an organisation is that it should be the end users, so that the warehouse can be designed around the functions that it will be expected to fulfil and users can identify which bits of the existing database they will require. But even if the push for a data warehouse comes from the business end, a company’s information systems department is going to have to supply an administrator. Here, finally, is an in-house job for the technical guys, who have seen all their work farmed out to facilities management companies over the last few years: few companies installing a warehouse will be able to leave its management to outsiders, given that it’s supposed to contain highly valuable information.