11
Presentation on Building a data warehouse Department of Computer Science and Applications Panjab University SSG Regional Centre,Hoshiarpur Submitted By: Neha Kapoor Submitted To: Mrs. Neeru Mago Semester: 5 th (Assistant Professor) Roll No.: 1329

BUILDING A DATA WAREHOUSE

Embed Size (px)

Citation preview

Page 1: BUILDING A DATA WAREHOUSE

Presentation on Building a data warehouse

Department of Computer Science and Applications Panjab University SSG Regional Centre,Hoshiarpur

Submitted By: Neha Kapoor Submitted To: Mrs. Neeru Mago

Semester: 5th (Assistant Professor)

Roll No.: 1329

Page 2: BUILDING A DATA WAREHOUSE

Steps For Building a Data Warehouse:1.Extracting the transactional data from the data

sources into a staging area2.Transforming the transactional data3.Loading the transformed data into a

dimensional database4.Building pre-calculated summary values to

speed up report generation 5.Building a front-end reporting tool

Page 3: BUILDING A DATA WAREHOUSE

1. Extracting Transactional Data

->A large part of building a DW is pulling data from various data sources and placing it in a central storage area.

->Microsoft has come up with an excellent tool for data extraction. Data Transformation Services (DTS), which is part of Microsoft SQL Server 7.0 and 2000, allows you to import and export data from any OLE DB or ODBC-compliant database as long as you have an appropriate provider

Page 4: BUILDING A DATA WAREHOUSE

2. Transforming Transactional Data

• After extracting is transforming and relating the data extracted from multiple sources.

• Most companies have their data spread out in a number of various database management systems: MS Access, MS SQL Server, Oracle, Sybase, and so on.

• When building a data warehouse, you need to relate data from all of these sources and build some type of a staging area that can handle data extracted from any of these source systems.

• After all the data is in the staging area, you have to give it a common shape. you need to figure out a way to relate tables and columns of one system to the tables and columns coming from the other systems.

Page 5: BUILDING A DATA WAREHOUSE

3. Creating a Dimensional Model• The relational database is highly normalized;perform well in

On-Line Transaction Processing (OLTP) environment. But the relational format is not very efficient when it comes to building reports with summary and aggregate values

• The dimensional approach, provides a way to improve query performance without affecting data integrity

• The dimensional model consists of the fact and dimension tables.

• The fact tables consist of foreign keys to each dimension table, as well as measures. The measures are a factual representation of how well your business is doing. Dimensions, on the other hand, are what your business users expect in the reports—the details about the measures.

Page 6: BUILDING A DATA WAREHOUSE

4. Loading the Data

• After you've built a dimensional model, it's time to populate it with the data.

• Loading data involves transfer of large amount of data from source to data warehouse database.

• Identify the inconsistencies before the final loading operation.

Page 7: BUILDING A DATA WAREHOUSE

4. Generating Precalculated Summary Values

• The generating the precalculated summary values are commonly referred to as aggregations

• This step has been tremendously simplified by SQL Server Analysis Services. After you have populated your dimensional database, SQL Server Analysis Services does all the aggregate generation work for you.

• Depending on the number of dimensions you have in your DW, building aggregations can take a long time. As a rule of thumb, the more dimensions you have, the more time it'll take to build aggregations.

Page 8: BUILDING A DATA WAREHOUSE

Prior to generating aggregations, you need to make an important choice about which dimensional model to use:

• ROLAP (Relational OLAP),• MOLAP (Multidimensional OLAP)• HOLAP (Hybrid OLAP).

Page 9: BUILDING A DATA WAREHOUSE

• In ROLAP model the data stored in a relational database, builds additional tables for storing the aggregates but this takes much more storage space than a dimensional database.

• The MOLAP model stores the aggregations as well as the data in multidimensional format, which is far more efficient than ROLAP.

• The HOLAP approach keeps the data in the relational format, but builds aggregations in multidimensional format, so it's a combination of ROLAP and MOLAP.

Page 10: BUILDING A DATA WAREHOUSE

5. Building a Front-End Reporting Tool• The last step comprises of building the reporting,

query and tools for the end users so that the required information can be fetched easily.

• Microsoft Office 2000 on their desktops, the Pivot Table Service of Microsoft Excel will do the job. If the reporting needs are more than what Excel can offer,

• In addition to the third-party tools, Microsoft has just released its own tool, Data Analyzer, which can be a cost-effective alternative.

Page 11: BUILDING A DATA WAREHOUSE

for your attention!