6
Overview A Data Warehouse is a storage area, which holds the complete data specific to the organization so that efficient queries can be built on it for Analysis and Reporting purpose.Data Warehouse is a subject-oriented database where data is associated to a single organizational process, often called entity. A Data Warehouse is integrated with various source databases, which provides enormous strategic implication for Business Intelligence.The Data Warehouse on the other hand does not cater to real time operational requirements of the enterprise. It is a storehouse of current and historical data extracted from either internal or external data source(s). Pre- requisites Knowledge of Oracle is essential. Applications Course Brochure Peers Technologies Pvt. Ltd. Page 1 DATA WAREHOUSING

Data Warehousing CANTENTS

Embed Size (px)

DESCRIPTION

dw

Citation preview

  • Overview

    A Data Warehouse is a storage area, which holds the complete data specific to theorganization so that efficient queries can be built on it for Analysis and Reportingpurpose.Data Warehouse is a subject-oriented database where data is associated to asingle organizational process, often called entity. A Data Warehouse is integrated withvarious source databases, which provides enormous strategic implication for BusinessIntelligence.The Data Warehouse on the other hand does not cater to real timeoperational requirements of the enterprise. It is a storehouse of current and historical dataextracted from either internal or external data source(s).

    Pre-requisites

    Knowledge of Oracle is essential.

    Applications

    Course Brochure

    Peers Technologies Pvt. Ltd.

    Page 1

    DATA WAREHOUSING

  • COURSE CONTENTS

    Introduction to Data warehousing

    What is RDBMS Features of RDBMS What is Data warehousing Architecture of Data warehousing Difference between RDBMS & DATA

    WAREHOUSE DATABASES.

    Definitions ETL Process Reporting Types of Tables in D/W Types of FACTS tables Types of DIMENSION tables Types of Schemas in D/W What is Data Mart Warehouse Approaches

    INFORMATICA

    Definitions Informatica Tools Informatica Architecture Types of ports Predefined functions

    Types of Transformations:

    Source Qualifier T/r Expression T/r Filter T/r Router T/r Joiner T/r Union T/r Rank T/r Sorter T/r

    Aggregator T/r Connected Lookup T/r Unconnected Lookup T/r Connected Stored Procedure T/r Unconnected Stored Procedure T/r Normalizer T/r Sql T/r Transaction Control T/r Xml T/r

    Using Source Tables Using Source Flat Files SCD1 Mapping SCD2 Mapping SCD3 Mapping Mapplets Mapping Parameters Mapping Variables File List Dynamic Connections Session Variables Parameter files

    Workflow Tasks:

    Start Task Session Task Event Wait Task Event Raise Task Email Task Decision Task Assignment Task Control Task Command Task Timer Task

    Page 2 DATA WAREHOUSING

  • COURSE CONTENTS

    Session Properties Worklet Types of Caches Types of Partitions Sequential Batch Processing Parallel Batch Processing ETL Testing Performance Tuning Optimization Debugging Backup Recovery Installation steps

    COGNOS

    What is program Intelligence What is Business Intelligence Types of olap engines Cognos Architecture

    Types of tools:

    Framework Manager Creation of project Creation of data source Relations among tables Resolving infinity loop Applying filters

    Data scrubbing Data Cleansing Handaling null values Creation of Dimension objects Creation of Measure objects Creation of Package

    Query Studio:

    Introduction Simple reports Filters Sort Summarize Format Data Calculate Define Custom Groups Drill down Drill Up Goto Charts Define Conditional Styles Apply Styles Set web page size Set page breaks Group Pivot Sections Swap rows and columns Collapse group Expand group

    Page 3 DATA WAREHOUSING

  • COURSE CONTENTS

    Report Studio:

    Introduction Simple reports Query Calculations Layout Calculations Parameterized reports Dynamic Headings in a report Cascading reports Reusability of a query Master-Detail reports 2 Layouts with 2 Queries 2 Layouts with 1 Query Grouped reports Group filter reports Page break reports Applying sections Removing sections Calling one report from another report Data format reports Group Section reports Creating headers Merging data cells Cross tab reports Date Range prompt Value prompt Inline prompts Repeater Table reports Charts Dash board reports

    Analysis Studio:

    Introduction Summary reports Ranks

    Event Studio:

    Introduction Scheduling reports

    DATA STAGE

    Server Jobs and Parallel Jobs difference Parallel Pipeline Parallelism Partition Parallelism Partitioning and Collecting Configuration file Fastname, Pools, Resource Disk, Resource

    Scratch Disk

    Running Job with different nodes Symmetric Multi Processing Massively Parallel Processing Partition techniques Round Robin Random Hash Entire Same Modulus Range DB2 Auto Datastage components Server components Clients components Datastage Server Datastage Repository Naming Standards of jobs Document preparation ETL specs preparation Unit testcases preparation

    Page 4 DATA WAREHOUSING

  • COURSE CONTENTS

    Datastage Administrator:

    Server properties Datastage project Administration Editing projects and Adding projects Deleting projects Cleansing up project files Upgrade licences Auto purging Permissions to users Runtime Column Propagation Enable Remote Execution of Parallel jobs Add checkpoints for sequencer Project protect .APT Config file

    Datastage Director:

    Introduction to Datastage director Datastage Director window Jobs status view Datastage director options Running Datastage jobs Validating a job Running a job Batch Running Stopping a job and resetting job Monitoring a job Job scheduling Unscheduling a job Rescheduling a job Deleting a job Unlocking jobs View Logfile Clear log Fatal error description Warning description

    Info description Difference between Compile and Validate Difference between Validate and Run

    Datastage Designer:

    Introduction to Datastage Designer Importing table definitions Importing flat file definitions Managing the meta data environment Dataset management Deletion of Dataset Routines Importing jobs Exporting jobs(Back up) Configuration file view Explanation of Menu Bar Palette Passive stages Active stages Database stages Debug stages File stages Processing stages Mutiple Instances Runtime Column Propagation(RCP) Job design overview Designer work area Annotations Creating jobs,deleting jobs Compiling jobs Batch compiling Aggregator stage ,Copy stage Change Capture stage,Compress stage Decode stage,Encode stage,Difference

    stage

    Filter stage,Funnel stage Modify stage

    Page 5 DATA WAREHOUSING

  • DATA WAREHOUSING

    COURSE CONTENTS

    Join stage,Lookup stages Difference between join and Lookup

    stages

    Merge stage Difference between Lookup and Merge

    stages

    Remove duplicate stage Sort stage,Pivot stage Surrogate key stage, switch stage Types of Lookups Types of Transformer stages Basic transformer stage Transformer stage Null handling in Transformer stage If Then Else in Transformer Stage variables Constraints Derivations Peek stage, Head stage, Tail stage Job properties Local variables Functions in Transformers String,Date,Null handling functions All properties in all stages Slowly changing Dimensions (SCD) SCD Type-1 SCD Type-2 SCD Type-3 Implementation of SCD T ype-1 in

    Datastage

    Implementation of SCD T ype-2 in Datastage

    JOB SEQUENCER:

    Arrange job activities in Sequencer Triggers in Sequencer Reset method Recoverability Notification Activity Terminator Activity Wait for file Activity Start Look Activity Execute Command Activity Sequencer

    CONTAINERS:

    Reusability Minimizing complexity Local container Shared container Some jobs in container

    Page 6 DATA WAREHOUSING