For the seventh time running, Lawrence Corr, a popular and highly respected authority on data warehousing, returns to South Africa this September to present a master class series on Data Warehouse Design and Development. Joining him again in South Africa is his equally respected industry colleague, information integration specialist Joe Caserta, who co-authored with Ralph Kimball the first in-depth book on data warehouse ETL design and development, The Data Warehouse ETL Toolkit.
This course offers two unique design workshops from international data warehousing experts Lawrence Corr and Joe Caserta:
- Dimensional Modelling, Analysis and Design Workshop (part 1)
and
- ETL Architecture and Design Workshop (part 2)
Dates: Part One 3rd to 5th September 2007
Part Two 5th to 7th September 2007
Venue: Johannesburg - Protea Hotel Balalaika Sandton
Maude Street
Sandown
http://www.balalaika.co.za/location.htm
Why Dimensional Analysis and Design?
Dimensional modelling is the proven technique for developing understandable, high-performance data warehouses and data marts. Dimensional analysis and design closes the gap between business requirements and traditional dimensional modelling. The rigorous and practical use of dimensional techniques throughout analysis and design improves productivity and communication between business users and IT by better supporting incremental development and more fully capturing analytical requirements.
Why ETL Architecture and Design?
Extract, Transform and Load (ETL) is the vital process by which data warehouses take disparate and disordered transactional data and present it in a cohesive, orderly way for intelligent decision making. Unfortunately, ETL is widely recognised as consuming more time and money than any other aspect of data warehousing, and it demands high levels of ongoing maintenance. By precisely designing and building reusable processes to extract, clean, conform and deliver dimensional data, the ETL team provides the foundation for a successful, reduced-cost data warehouse implementation.
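As a rough illustration of the extract, clean, conform and deliver steps named above, here is a minimal Python sketch. It is not taken from the course material: the source rows, field names and the country mapping are all hypothetical, and a real ETL system would read from databases or files rather than in-memory lists.

```python
# Hypothetical raw rows from two source systems with inconsistent formats.
SOURCE_A = [{"cust": " Alice ", "country": "ZA", "amount": "100.50"}]
SOURCE_B = [
    {"cust": "BOB", "country": "South Africa", "amount": "20"},
    {"cust": "", "country": "ZA", "amount": "5"},  # fails the quality check
]

# Assumed conformance map: every source spelling maps to one standard code.
COUNTRY_CODES = {"ZA": "ZA", "South Africa": "ZA"}


def extract():
    """Extract: pull rows from every source, tagging where each came from."""
    for name, rows in (("A", SOURCE_A), ("B", SOURCE_B)):
        for row in rows:
            yield {**row, "source": name}


def clean(rows):
    """Clean: trim and normalise names, drop rows with no customer name."""
    for row in rows:
        customer = row["cust"].strip()
        if customer:
            yield {**row, "cust": customer.title()}


def conform(rows):
    """Conform: map source-specific values onto shared codes and types."""
    for row in rows:
        yield {
            "customer": row["cust"],
            "country_code": COUNTRY_CODES[row["country"]],
            "amount": float(row["amount"]),
            "source": row["source"],
        }


def deliver(rows):
    """Deliver: materialise the rows that are ready to load."""
    return list(rows)


fact_rows = deliver(conform(clean(extract())))
```

Each stage is a small reusable function, so a new source only needs a new extract step and, where necessary, extra entries in the conformance map.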
Audience
This course is appropriate for anyone involved or interested in learning the latest techniques for planning, designing and managing dimensional data warehouses and ETL processes.
Beginner, intermediate and experienced data warehouse practitioners, data architects, DBAs and ETL designers and developers will benefit from this course.
Part 1: Dimensional Modelling, Analysis and Design Workshop
Duration: 2.5 days
Instructor: Lawrence Corr
Date and Time:
03 September 2007 08h00 - 16h00
04 September 2007 08h00 - 16h00
05 September 2007 08h00 - 12h00
Overview
This workshop covers introductory to advanced dimensional modelling techniques focusing on the real-world design challenges typical of large, complex multinational data warehouse projects. Topics are taught through a combination of lecture, instructor-led examples, case studies and extensive team exercises that review existing dimensional designs, model dimensional solutions to common business problems and plan the design of entire data warehouses.
Objectives
Upon completion students will be able to:
- Participate in rapid incremental data warehouse design
- Establish clear and concise communication with business users
- Maximise the usability and performance of their data warehouse designs
- Find supporting articles and template documents on the web
Contents
Dimensional Modelling Fundamentals:
- Modelling for measurement – the case for dimensional modelling
- Fundamentals of stars, snowflakes, facts and dimensions
- Slowly changing dimensions – accurately reflecting history, supporting current, historically correct and alternative views
- Common dimensional modelling techniques – Time, multi-role and degenerate dimensions, surrogate keys, value chains and other common multi-star design patterns
- The Data Warehouse Bus Architecture of conformed dimensions and facts, how data marts can enable incremental data warehouse development
Dimensional Analysis:
- Gathering Analytical Requirements – asking the right questions
- Identifying business events and processes that must be measured
- Identifying business dimensions by classification (Who, What, Where, When and Why or People, Things, Places, Timestamps and Reasons)
- Identifying and documenting the relationships between business events and dimensions – the dimensional matrix (reloaded)
- Identifying Key Performance Indicators (KPIs) and Metrics – aggregates, comparisons and exceptions
- Defining granular facts – additive, semi-additive and non-additive measures
- Identifying and classifying dimensional attributes and hierarchies – completeness checks
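To illustrate the distinction between additive and semi-additive measures mentioned in the list above, here is a small Python sketch. The banking facts and column layout are invented for the example and are not part of the workshop material. Deposits are additive across every dimension, whereas account balances can be summed across accounts but not across days.

```python
from collections import defaultdict

# Hypothetical daily facts: (day, account, deposit_amount, closing_balance)
facts = [
    ("2007-09-03", "A", 100.0, 100.0),
    ("2007-09-03", "B", 50.0, 50.0),
    ("2007-09-04", "A", 20.0, 120.0),
    ("2007-09-04", "B", 0.0, 50.0),
]

# Additive measure: deposits can be summed over accounts and over days.
total_deposits = sum(deposit for _day, _account, deposit, _balance in facts)

# Semi-additive measure: balances may be summed across accounts per day...
balance_by_day = defaultdict(float)
for day, _account, _deposit, balance in facts:
    balance_by_day[day] += balance

# ...but across time we take the last value or an average, never a sum.
closing_balance = balance_by_day[max(balance_by_day)]
average_daily_balance = sum(balance_by_day.values()) / len(balance_by_day)
```

Summing the balance column over both days would give 320, a figure with no business meaning, which is why the time dimension needs a different aggregation rule.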
Advanced Dimensional Design:
- Rules for combining and separating dimensions
- Flexible date handling, ad-hoc ranges and multiple simultaneous events
- Dealing with Very Large Dimensions – B2C customers
- Hot-swappable and ultra-hot-swappable dimensions
- Multi-valued dimensions – allocation problems, impact and correctly weighted analysis
- Bitmap dimensions – supporting complex combination constraints
- Advanced slowly changing dimension techniques – multiple alternate realities – ‘as is’, ‘as was’ and ‘as at’ reporting
- Variable-depth hierarchies – organization structures, bill of materials recursive relationships, dynamic hierarchies, and generic hierarchy maps
- CRM measures – using recency, frequency and intensity
- Multinational support – multiple languages, currencies and units of measure
Part 2: ETL Architecture and Design Workshop
Duration: 2.5 days
Instructor: Joe Caserta
Date and Time:
05 September 2007 13h00 - 16h00
06 September 2007 08h00 - 16h00
07 September 2007 08h00 - 16h00
Overview
This workshop offers an in-depth understanding of extract, transform and load (ETL) techniques essential for building dimensional data warehouses. It focuses on proven methods and best practices to successfully implement, manage and maintain the most challenging task of any data warehouse project – the ETL.
Objectives
This workshop teaches the practical, detailed steps of data warehouse ETL, including extracting, cleaning, conforming and delivering data and its associated metadata. Students will learn the design and development, architecture, operations and management aspects of scheduled and real-time ETL. Upon completion, students will be able to design each of the steps and sub-steps required to successfully obtain, prepare and publish data in a dimensional data warehouse.
Contents
Functional Practices:
- Planning and designing your ETL system
- Choosing the appropriate architecture
- Managing the implementation
- Managing the day-to-day operations
- Building the development/test/production suite of ETL processes
- Building a data cleaning subsystem
- Understanding the tradeoffs of various staging data structures, including flat files, normalized schemas, XML, and dimensional schemas
- Analyzing and extracting source data
- Creating the logical data mapping
- Structuring the data into dimensional schemas for the most effective delivery to end users
- Conforming heterogeneous data from multiple sources into standardized dimension tables and fact tables
- Building ETL modules for handling the three distinct types of slowly changing dimensions (SCDs)
- Building ETL modules for multi-valued dimensions and hierarchical dimensions
- Running high-performance surrogate key pipelines
- Loading the three fundamental fact table grains – transaction, periodic snapshot and accumulating snapshot
- Handling late arriving dimensions and facts
- Optimizing ETL processes to fit into highly constrained load windows
- Structuring and presenting metadata
- Converting batch and file-oriented processes into continuously streaming real-time ETL systems
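As one illustration of the slowly changing dimension and surrogate key topics in the list above, here is a minimal Python sketch of a Type 2 change, where history is preserved by expiring the current row and adding a new version. It is an assumption-laden example rather than the workshop's own method: the row layout, the `apply_scd2` helper and the open-ended date marker are all hypothetical.

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # assumed marker for "still current"


def apply_scd2(dimension, natural_key, attributes, effective):
    """Apply a Type 2 change and return the surrogate key now in force."""
    current = next(
        (row for row in dimension
         if row["natural_key"] == natural_key and row["is_current"]),
        None,
    )
    if current is not None:
        if current["attributes"] == attributes:
            return current["surrogate_key"]  # nothing changed, keep the row
        # Expire the existing version instead of overwriting it.
        current["valid_to"] = effective
        current["is_current"] = False

    # Surrogate keys are plain integers, independent of the source key.
    new_key = max((row["surrogate_key"] for row in dimension), default=0) + 1
    dimension.append({
        "surrogate_key": new_key,
        "natural_key": natural_key,
        "attributes": dict(attributes),
        "valid_from": effective,
        "valid_to": HIGH_DATE,
        "is_current": True,
    })
    return new_key


customer_dim = []
k1 = apply_scd2(customer_dim, "C001", {"city": "Durban"}, date(2007, 1, 1))
k2 = apply_scd2(customer_dim, "C001", {"city": "Johannesburg"}, date(2007, 9, 3))
k3 = apply_scd2(customer_dim, "C001", {"city": "Johannesburg"}, date(2007, 9, 7))
```

Facts loaded before the move keep pointing at the first surrogate key, so historical reports still show the customer in Durban, while new facts pick up the second key.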
In addition to the course material, all students attending the ETL workshop will receive a copy of Ralph Kimball and Joe Caserta’s book, The Data Warehouse ETL Toolkit (Wiley, 2004).
Lawrence Corr is a leading data warehouse design specialist and highly experienced educator. He has taught data warehouse design courses across Europe, North America and Africa. In the dimensional modelling workshop Lawrence covers the latest techniques for aligning data warehouse design with business analytical requirements.
Joe Caserta is the co-author with Ralph Kimball of the first in-depth book on data warehouse ETL design and development, The Data Warehouse ETL Toolkit. He is an influential data warehousing veteran whose expertise is shaped by years of industry experience. In the ETL workshop Joe shares his extensive knowledge and enthusiasm for designing, deploying, and managing the data warehouse ETL process.
Costs
Full Master Class Part One and Part Two
For delegates attending the full five-day master class:
- R 13 000 per delegate
- R 12 600 per delegate for more than 3 delegates from one company, booked simultaneously
SPECIAL - Early Bird Discount
R 12 700 per delegate – if booked and paid before 23 July 2007
Master Class Part One or Part Two:
For delegates wanting to attend either Master Class One or Two for 2.5 days:
- R 7 400 per delegate
- R 7 000 per delegate for more than 3 delegates from one company, booked simultaneously
SPECIAL - Early Bird Discount
R 7 100 per delegate – if booked and paid before 23 July 2007
Refreshments, lunch on each day and course materials are included.