A Data Warehouse Technical Architecture_v3.0


  • Enterprise Data Warehouse: A Technical Perspective
    Tony Dalwood
    Information Architecture & Management
    University of South Australia

  • IT Structure
    - ISTS: Information Strategy & Technology Services
    - Information Strategy
    - Corporate Information Systems
    - E-Business
    - Information Architecture & Management
    - Technical Services
    - Customer Services
    - Network Services
    - Systems Infrastructure

  • Information Architecture & Management (IAM)
    - Merger of DBA team & Information Integration team in Feb 2006
    - IAM manages:
        - Corporate System Databases (3 DBAs)
        - Operational Data Store Management
        - Middle Tier Apps
        - Student Portal (myUniSA)
        - Staff Portal (UniSAinfo)
        - UniSAinfo Reporting
        - EDW

  • Project Governance
    - Steering Group
        - Includes Directors of ISTS, Planning and Assurance Services (PAS), Student & Academic Services (SAS), Finance
    - Sponsors Group
        - Director of Planning & Assurance Services
        - Dep. Director Information Strategy
        - Business Project Manager
        - Technical Project Manager
    - Reference Group
        - Senior Officers from PAS, HR, Research, SAS, Finance

  • Project Governance
    - Project Team
        - Business Project Manager (PAS)
        - Technical Project Manager (ISTS)
        - Design Architect/Dev Team Leader (ISTS)
        - Business Analyst (x1.5) (PAS)
        - Data Quality Manager (0.5) (PAS)
        - Developers (x3 variant) (ISTS)

  • EDW Project Milestones
    - Aug 2004: Business Case submitted by Planning & Assurance Services (PAS) and ISTS to extend current reporting environment to an EDW ($150K)
    - Feb 2005: Project Commenced
    - Feb-July 2005: Data Gathering Workshops
    - Sep-Dec 2005: Technical Research & Proof of Concept (0.5 IT Resource)
    - Jan-Feb 2006: External Consultancy (1 IT Resource)
    - May 2006: First Star Schema complete (Research Publications) (4 IT Resources)
    - July 2006: Three more Star Schemas complete (Research Income, AVCC Data, Research Staff Supervision) (4 IT Resources)
    - August 2006: First Soft Production Release (2.5 IT Resources)
    - Beyond: Student Data & Finance Data (min 2 IT Resources)
    - NB: IT Resource not including part time Tech Project Manager

  • Project Goals

  • By-Products of an EDW Project
    - Data Discovery
        - What data do we have
        - How data is used and maintained
        - What is the quality of the data
        - How data can be utilised by more of the organisation
    - Enhanced Collaboration
        - Intra and Inter communication between business units, system owners and IT

  • Technical Project Plan

    - Warehousing Research
    - Proof of Concept exercise
    - External Assistance
    - Implementation of an Architecture
    - Development Standards & Procedures
    - Build & Implementation of Stage 1
    - Review

  • Proof of Concept

    - Validate Warehouse research findings
    - Proof of Concept covered the following topics:
        - Project methodology
        - Technical architecture
        - Design methodology
        - ETL methodology
        - MetaData options
        - Data Quality approach
        - Security implementation options

  • Project Methodology

  • Technical Architecture
    - Inputs into Architecture:
        - Business Goals
        - Existing Reporting Environments
        - Technology
        - Time
        - $$
        - Resources/Skills

  • Data Flow Architecture

    [Diagram: data flow from source systems to the EDW. Labels: Source Systems, External Files, Replicated Data Table, 24 Hour Snapshot Schema, Previous Snapshot Schema, Previous Snapshot Table, Diff Process, Deltas, ODS, Staging Tables, Transform & Load, Target EDW Tables, EDW]

  • Design Methodology
    - Dimensional Modelling chosen as the design philosophy
        - Star Schemas/Snowflakes
        - Facts
        - Dimensions
        - Measures
        - Bridges
    - History Retention for Slowly Changing Dimensions
        - Warehouse records are versioned, i.e. never deleted or overwritten
        - Views to identify current records
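The versioning rule above (records are never deleted or overwritten, and a view identifies the current ones) can be illustrated with a small sketch. This is not the UniSA schema: the table `dim_staff`, its columns, and the view `dim_staff_current` are hypothetical names, and SQLite stands in for Oracle purely so the example runs anywhere.

```python
import sqlite3

# Hypothetical dimension table: every change adds a new version row;
# existing rows are only end-dated, never deleted or overwritten.
SCHEMA = """
CREATE TABLE dim_staff (
    staff_key   INTEGER PRIMARY KEY,  -- surrogate key, one per version
    staff_id    TEXT NOT NULL,        -- business key from the source system
    school      TEXT NOT NULL,
    version     INTEGER NOT NULL,
    valid_from  TEXT NOT NULL,
    valid_to    TEXT                  -- NULL marks the current version
);
CREATE VIEW dim_staff_current AS
    SELECT * FROM dim_staff WHERE valid_to IS NULL;
"""

def build_demo():
    """Create an in-memory database holding two versions of one staff member."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO dim_staff (staff_id, school, version, valid_from, valid_to) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            ("S001", "Computing", 1, "2005-01-01", "2006-03-01"),
            ("S001", "Mathematics", 2, "2006-03-01", None),
            ("S002", "Nursing", 1, "2005-06-01", None),
        ],
    )
    return conn

conn = build_demo()
# The view hides history; the base table still holds every version.
current = conn.execute(
    "SELECT staff_id, school, version FROM dim_staff_current ORDER BY staff_id"
).fetchall()
print(current)
```

Reports that need only the latest state query the view, while history-aware reports query the base table and filter on the validity dates.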

  • Transformation of Design - Source

  • Transformation of Design - Target

  • ETL Methodology
    - Scripts vs Tool decision
    - Tool chosen for the following reasons:
        - Already licensed for Oracle Internet Developer Suite, which includes Oracle Warehouse Builder
        - Oracle Database environment
        - Oracle technical skills
        - Visibility of Development Environment
        - Auto technical MetaData generation
        - Auto and accessible code generation using PL/SQL
        - Ability to include custom code
        - Integration with Oracle database and related Oracle technology
        - Framework for Beginners
        - Difficult to evaluate other products without expertise
    - Smarts & effort go into Modelling and Design; ETL should be a no-brainer

  • MetaData
    - Data about Data
    - Oracle Warehouse Builder provides technical metadata
    - Business MetaData facility currently restricted to documentation and Cognos catalogs
    - Evaluation of MetaData methods to be reviewed at the completion of Stage 1 development

  • Data Quality
    - Pre-ETL
        - Technical profile to ensure physical design has mapped appropriate data elements
        - Business profile of source data to identify data attributes, e.g. data type, patterns, nulls, min, max, outliers
    - ETL
        - Transform to conformed data sets
        - Foreign Key checks
        - Reporting of anomalies
    - Post-ETL
        - Final Business profile to validate transformations of data
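A business profile of a source column, as described above, might look roughly like the following. This is a minimal sketch and not the output of the Datiris profiler or Oracle Warehouse Builder; `pattern_of` and `profile_column` are hypothetical helpers, and the sample values are made up. It assumes column values arrive as strings, with `None` or an empty string counted as null.

```python
import re
from collections import Counter

def pattern_of(value: str) -> str:
    """Reduce a value to its shape: digits become 9, letters become A."""
    shape = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", shape)

def profile_column(values):
    """Return row count, null count, min, max and pattern frequencies."""
    non_null = [v for v in values if v is not None and v != ""]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "patterns": Counter(pattern_of(str(v)) for v in non_null),
    }

# Made-up staff identifiers; "12345" breaks the dominant A999 pattern.
profile = profile_column(["S001", "S002", None, "12345", "S003", ""])
print(profile)
```

A rare pattern such as `99999` among mostly `A999` values is the kind of anomaly worth raising with the system owner before the column is mapped.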

  • Security
    - Security options implemented are:
    - Database Layer
        - Oracle roles to grant or deny access to database objects based on Business rules
        - Oracle views for granular data security where appropriate
    - User Layer
        - Access to end user Cognos catalogues/cubes controlled via Cognos security mechanisms and filesystem access

  • Development Lifecycle
    - Business Requirements
    - Design Process
    - Logical Design
    - Physical Design
    - Data Mapping
    - Data Profiling

  • Development Lifecycle
    - Design & Build ETL Objects & Processes
        - Extraction routines
        - Diff routines: tag records as Inserts, Updates or Deletes
        - Build Staging tables
        - Build Target warehouse tables
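A diff routine of the kind listed above compares the latest snapshot of a source table with the previous one and tags each changed record. The sketch below assumes each snapshot is a dict keyed by primary key. `diff_snapshots` and the sample rows are hypothetical; in the presentation the real routines are built inside the Oracle environment.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots keyed by primary key and return tagged deltas.

    Each delta is (tag, key, row) where tag is "I" (insert), "U" (update)
    or "D" (delete). Unchanged rows produce no delta.
    """
    deltas = []
    for key, row in current.items():
        if key not in previous:
            deltas.append(("I", key, row))
        elif previous[key] != row:
            deltas.append(("U", key, row))
    for key, row in previous.items():
        if key not in current:
            deltas.append(("D", key, row))  # carry the last known values
    return deltas

# Made-up snapshots of a publications table.
previous = {1: {"title": "A"}, 2: {"title": "B"}, 3: {"title": "C"}}
current = {1: {"title": "A"}, 2: {"title": "B2"}, 4: {"title": "D"}}
deltas = diff_snapshots(previous, current)
print(deltas)
```

Only the deltas move on to staging, so the cost of the downstream load depends on how much changed rather than on the size of the source table.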

  • Standard ETL Process
    - Scheduled Extract/Diff process runs to populate a Diff table in the Staging Area
    - ETL process then performs a standard set of steps:
        - Load Staging from Diff table
        - Stamp Staging record according to Diff type (U, D or I)
            - Updated Record: tag staging record as new version of core record
            - Deleted Record: tag staging record as retired record in warehouse
            - Inserted Record: tag staging record to be new record (version 1)
        - Update Core: end-date existing current record
        - Load new Core: new current record from Staging
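The standard steps above can be traced in a few lines of code. This is an illustrative sketch only: the core table is modelled as a list of dicts, and `apply_deltas` and the field names `valid_from`, `valid_to` and `retired` are assumptions, not the mappings generated by Oracle Warehouse Builder. Treating a U or D with no current record as an error is also a choice made for the example.

```python
def apply_deltas(core, deltas, load_date):
    """Apply tagged staging rows (tag, key, attrs) to a versioned core table.

    Rows are never removed. U and D end-date the current version, and
    I and U append a new current version.
    """
    for tag, key, attrs in deltas:
        current = next(
            (r for r in core if r["key"] == key and r["valid_to"] is None),
            None,
        )
        if tag == "I":
            version = 1
        elif tag in ("U", "D"):
            if current is None:
                raise ValueError(f"no current record for key {key!r}")
            current["valid_to"] = load_date       # Update Core: end-date
            current["retired"] = (tag == "D")     # deleted at source
            version = current["version"] + 1
        else:
            raise ValueError(f"unknown diff type {tag!r}")
        if tag != "D":                            # Load new Core
            core.append({
                "key": key, "version": version, "attrs": attrs,
                "valid_from": load_date, "valid_to": None, "retired": False,
            })
    return core

# Made-up starting state and staging rows.
core = [
    {"key": 2, "version": 1, "attrs": {"title": "B"},
     "valid_from": "2006-05-01", "valid_to": None, "retired": False},
    {"key": 3, "version": 1, "attrs": {"title": "C"},
     "valid_from": "2006-05-01", "valid_to": None, "retired": False},
]
staged = [("U", 2, {"title": "B2"}), ("I", 4, {"title": "D"}), ("D", 3, {"title": "C"})]
apply_deltas(core, staged, "2006-08-01")
current_rows = [(r["key"], r["version"]) for r in core if r["valid_to"] is None]
print(current_rows)
```

After the load, key 2 has an end-dated version 1 and a current version 2, key 3 is end-dated and marked retired, and key 4 is new at version 1.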

  • Development Lifecycle
    - Post ETL
        - Measures
        - Summary data
        - Process Flows to execute ETL
        - Security views
        - End User Layer, e.g. Catalogues

  • ETL Auditing
    - When did a process last run
    - How long did it run for
    - Did it Succeed, Fail or produce Warnings
    - How many records did it alter or insert
    - What were the data exceptions
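One way to capture the answers to these audit questions is to wrap each ETL step so that start time, duration, outcome, row counts and data exceptions are recorded together. The sketch below is an assumption about how such a record could be shaped, not the audit tables that Oracle Warehouse Builder maintains. `AuditRecord`, `run_audited` and the sample step are hypothetical.

```python
import time
from dataclasses import dataclass, field

@dataclass
class AuditRecord:
    process: str
    started_at: float = 0.0       # when did the process last run
    duration_s: float = 0.0       # how long did it run for
    status: str = "RUNNING"       # SUCCESS, WARNING or FAILURE
    rows_inserted: int = 0
    rows_updated: int = 0
    exceptions: list = field(default_factory=list)  # data exceptions

def run_audited(process, step):
    """Run step(audit) and return an AuditRecord describing the run."""
    audit = AuditRecord(process=process, started_at=time.time())
    t0 = time.perf_counter()
    try:
        step(audit)
        # Recorded data exceptions downgrade a clean run to a warning.
        audit.status = "WARNING" if audit.exceptions else "SUCCESS"
    except Exception as exc:
        audit.exceptions.append(repr(exc))
        audit.status = "FAILURE"
    finally:
        audit.duration_s = time.perf_counter() - t0
    return audit

def load_publications(audit):
    """Made-up ETL step that reports its own row counts and one bad row."""
    audit.rows_inserted = 120
    audit.rows_updated = 5
    audit.exceptions.append("row 17: unknown staff_id")

result = run_audited("load_publications", load_publications)
print(result.status, result.rows_inserted, result.rows_updated)
```

Persisting one such record per run gives operators a single place to check the last run time, outcome and exception list for every process.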

  • UniSA EDW Toolset

    - Oracle Database
    - Oracle Warehouse Builder
    - Oracle Workflow
    - Oracle Enterprise Manager
    - Datiris Data profiler
    - Cognos Impromptu/Powerplay
    - Whiteboard and lots of A3 Paper!!!

  • Oracle Database Options assisting Warehouse implementation
    - External tables
    - Materialised Views
    - Query Rewrite
    - Bitmap indexes
    - Partitioning
    - Star Query optimizer options

  • Oracle Warehouse Builder
    - Provides the design and development environment and framework for the build and deployment of Warehouse objects and transformation processes
    - Consists of Design Repository and Runtime components

  • Oracle Workflow
    - Optionally used for job execution with dependency management
    - Exists as an optional install with RDBMS
    - Run as Client/Server or HTTP browser based application
    - Workflow engine is a service on the warehouse database server administered by a workflow schema
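Job execution with dependency management comes down to running each job only after the jobs it depends on have finished. The sketch below shows that idea with Python's standard `graphlib` module. It does not use or imitate the Oracle Workflow API, and the job names in `JOBS` are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical ETL jobs: each maps to the set of jobs that must finish first.
JOBS = {
    "extract_diff": set(),
    "load_staging": {"extract_diff"},
    "load_dim_staff": {"load_staging"},
    "load_dim_school": {"load_staging"},
    "load_fact_publications": {"load_dim_staff", "load_dim_school"},
    "refresh_summaries": {"load_fact_publications"},
}

def execution_order(jobs):
    """Return one valid run order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(jobs).static_order())

order = execution_order(JOBS)
for job in order:
    print(job)
```

The two dimension loads have no dependency on each other, so a scheduler is free to run them in either order or in parallel once staging is loaded.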

  • Oracle Enterprise Manager
    - Optionally used as the scheduling option for submitting and monitoring Warehouse Builder processes or workflows
    - Base OEM comes with RDBMS
    - Optionally run as standalone install or Management Server mode using a web console

  • Cognos 7.3 Reporting Suite
    - Catalogues: Report Developer access layer
    - Impromptu: Reporting capability
    - Powerplay: Multi-dimensional analysis
    - Upfront: Web interface

  • Oracle Warehouse Builder Demonstration

  • OWB 10g Release 2 - Paris
    - New Features:
        - Design Tool
        - Graphic Interface Improvements
        - Built in Slowly Changing Dimension property
        - Data Profiling/Quality utilities
        - Better Integrated Workflow Engine
        - Job Scheduling within OWB via OEM

  • Project Review

    - Sanity Check on whole process, architecture, methodology
    - Business & Technical
    - Evaluate ROI
    - Quantify metrics on time to deliver
    - Proposed Future phases
    - Usage Statistics
    - Hardware adequacy & capacity

  • Useful Technical References
    - Links
        - Oracle Business Intelligence & Technical Sites
            http://www.oracle.com/solutions/business_intelligence/index.html
            http://www.oracle.com/technology/tech/bi/index.html
        - Rittman Blog: http://www.rittman.net/
        - Kimball Tips: http://www.kimballgroup.com/html/designtips.html
    - Texts
        - Oracle 9iRel2 Data Warehousing - Hobbs
        - Kimball Texts
            - The Data Warehouse Lifecycle Toolkit
            - The Data Warehouse ETL Toolkit

  • Questions ?
