Data warehousing: It’s not about data, it’s about measuring performance

Susan Andre2003

Lawrence Corr of DecisionOne Consulting visits SA again this September to run data warehouse design training, hosted by Sagent Technology. Here he discusses the importance of key performance indicators (KPIs) to data warehouse design.

Many data warehousing projects that go 'off the rails' do so because they lack any serious attempt to understand how data will be used prior to designing and loading the warehouse. Even 'well designed' dimensional data warehouses that separate all data into descriptive dimensions and numeric facts or measures can fail to connect with high-level user requirements and strategic objectives. Often this is because analysis and design concentrates on the physical detailed data storage and relegates the definition of high-level business measures to individual analytical applications or reports.

A serious requirement of a data warehouse should be to allow business users to easily drill down from high-level key performance indicators (KPI) to detailed operational measures on precise segments of the business, market or customer base.

KPIs offer a rapid assessment of the current state of the organisation and answer the 'How are we doing?' questions. Good KPIs are designed to make it obvious when a significant measure of company performance has changed. They represent the jump off point for analysis and can act as a guide to where and when to start looking into the detail.

The detailed operational measures or facts that data warehouses are now so good at storing can unlock the hidden customer behaviour or reasons behind the KPI figures, answering the 'Why is this happening?' questions. This is especially true if they can be used in conjunction with precise constraints on a rich set of descriptive dimensions.

However, KPIs without the detail and the detailed atomic facts without the KPIs don't work well. KPIs sourced from outside the data warehouse which can't be drilled upon have the terrible effect of teasing the business users, who get told something is wrong but can't get down to the reason or to an actionable set of data. This was the fate of executive information systems in the late eighties and early nineties.

A data warehouse that delivers only detailed facts and dimensions can potentially answer most questions but often fails to speak the business user's language and quickly loses its link to the strategic business aims. Users don't get the guidance to know why and when to formulate these wonderful new questions. What they invariably get instead is a stack of starter reports (those that certain individuals have shouted loudest for) that might summarise the information in interesting ways. How many users venture into analysis of data beyond their current areas? How many regularly view or run reports that they didn't have any hand in specifying or building? How do you know which report to run when? With the analytical potential we have today, users don't need a taller stack of reports they need analytical applications. Rather than just being a new name for reports, these analytical apps need to tie reports and charts together in meaningful ways that offer guided analytical workflow and allow users to concentrate on only what is important by setting exception criteria on KPIs at different levels. To explain this, we need to understand the difference between facts, measures and KPIs, and see why they are missed out of the data warehouse design.

Facts are very close to the raw numeric captured in each transaction by the operational systems. Typically fact values exist in the millions and are therefore not easy to ignore. Ideally they are additive quantities, counts or sums of money. Somewhat less ideally, they can also be the semi-additive balances or counts or non-additive measures of intensity that are an important measure of a business process. They are atomic, detail level data stored in a fact table. Often operational in nature, facts are spoken of and well understood by IT.

Measures might well be another name for facts but I believe they are best described as facts summarised or aggregated to common level of summarisation suitable for comparison. They are typically what business users ask for, eg daily store revenue totals or monthly claim figures. Measures are typically calculated in a report query but may also be physically stored in an OLAP cube or aggregate fact table for efficient access and further aggregation.

KPIs are measures expressed as self-contained ratios or percentages such as revenue growth or customer churn. Their important feature is users can gain understanding from viewing a single figure without having knowledge of previous values or performing further analysis in conjunction with other measures. They embody all the necessary comparison logic. They are what users really need. If a system initially delivers reports containing only straightforward measures or facts at different levels of summarisation, users will spend their time in the comparison-intensive ad hoc construct of KPIs or even more time trying to assimilate the raw data.

KPIs have broad similarities across different industries. This is because most commercial organisations are intent on measuring the same core figures of revenue, cost, profit, margin by comparison and relating them to their operational measures.

KPIs define these important comparisons, many of which fall in a few basic types:

  • Time comparisons – comparison of same measure over consecutive and equivalent time periods such as day of week, day of month, month, season.
  • Target, budget, forecast comparisons – comparison of measure against plan figure.
  • Competition comparisons – market share.
  • Ratios – relationships between different measures.

To provide the basis for active analytical applications, KPI definitions can also include rules based on thresholds and actions. Thresholds define how a KPI should be tested, eg: what if the KPI reaches X or drops below Y – this might assume limits are known and that change is unusual. Or what if KPI increases or decreases by an absolute amount or percentage – where limits are less known, change is expected but acceleration or deceleration needs to be monitored.

If such thresholds are reached, some action needs to be performed ranging from passive notification such as dashboard traffic lighting, or report highlighting to more active forms of notification such as e-mail report delivery or SMS text messaging. Some actions may try to explain the KPI change via reports charts, data visualisation, OLAP, data mining or respond by closed loop and feedback to operational systems.

Because KPIs are typically ratios or percentages, they are not additive. They also don't exist in large quantities; few people would think of drilling down to millions of KPI figures. As such, they get missed by the data architects and even the dimensional modellers but by understanding what measures and KPIs are most valuable, you can accelerate the requirements definition process and prioritise important requirements. This helps the data warehouse team control the scope; they don't have to pull in all data elements from all data sources and then decide what is important.

Lawrence Corr (lcorr@decisionone.co.uk) is a global authority on data warehousing and specialises in dimensional design. He has taught data warehousing classes in Europe and SA for Kimball University and now reviews data warehouse designs for clients worldwide through DecisionOne Consulting.

Sagent

Sagent is a leading global provider of enterprise business intelligence solutions. Sagent's suite of enterprise business intelligence solutions enables companies to measurably impact their business by implementing highly successful customer relationship and financial management initiatives.

Through Sagent's powerful enabling technologies, organisations can easily and rapidly turn company data into relevant information that can be used for effective decision-making, analysis and reporting. Information can be extracted from multiple sources (internal and external), optimised for decision support and delivered in a customised format for Web-based or client applications.

More than 1 500 companies have selected Sagent software to enhance customer retention, cross-sell/up-sell, improve customer service, increase efficiencies of marketing campaigns, streamline business operations, analyse financials and reduce costs.

Customers include AT&T, BP Amoco, Boeing Employees Credit Union, Bristol Meyers, British Telecom, California State Automobile Association, Citibank, GPU Energy, Kinecta Federal Credit Union, Johnson & Johnson (UK), Kemper National Insurance, Provident Central Credit Union, Safeway, Siemens, Telkom, Credit Guarantee and Gensec.