Upload
herbert-james
View
222
Download
0
Embed Size (px)
DESCRIPTION
Stories from the trenches - Partitioning as a design pattern About the FLIS project Data warehouse heroes and architecture Partitioning in many disguises
Citation preview
SQL
Kennie Nybo Pontoppidan
Stories from the trenches - Partitioning as a design pattern
Head of R&DRehfeld
Agenda
Stories from the trenches - Partitioning as a design pattern
• About the FLIS project• Data warehouse heroes and architecture• Partitioning in many disguises
The FLIS Project
You are here
The FLIS project - Team • Netcompa
ny75%
• Rehfeld25%
• TDC hosting
The FLIS project - Mission• Jointly defined KPI’s
• View your own KPI’s• Benchmark with ”twins”
• Access to your own data • Common datamodel• Raw data
Access to your own data?• ~ 72 mill. kr.
to get data for 4 years
• (10 mill. euro)
The FLIS project – DW Challenges• Data from 30 different it-systems
• 7 subject areas• Citizens• Employees• Salaries• Absence
• ERP• Budgets• Postings
• Schools• …
• Conformity• Within area• Between areas
Data warehouse heros
Bill
Ralph
Dan
Bill Inmon• Inventor of the word
Data warehouse
• Oracle reference architecture
• Top down approach
• Hub and Spoke
Ralph Kimball• ”Invented” dimension
modelling
• Microsoft ”reference architecture”
• Bottom up
Dan Lindstedt• Hybrid model
• Hubs, satellites and Links
FLIS Data Warehouse Architecture
Technologies
Squirrel server ?
SSIS package XML
Informatica PowerCenter XML
Look up the wordpar·ti·tion (pär-tshn)n.
1.a. The act or process of dividing something into parts.b. The state of being so divided.2.a. Something that divides or separates, as a wall dividing one room or cubicle from another.b. A wall, septum, or other separating membrane in an organism.3. A part or section into which something has been divided.…
Source: http://www.thefreedictionary.com/partitioning
MCM videos
Sqlskills.com
Partitioning as a design patternDividing something into parts
• Files• Database• Table
State of being so divided• ETL developers vs. Customer• Developers vs (evil?) DBA’s• Backups (what is a full backup anyway)
Dividing something into parts - files• Xml, csv, xls, fixed
format• 1 file• Data from • 1 or more tables• Data from 1 or more municipality
• 30 different file naming schemes
Dividing something into parts - files• Csv• 1 file• 1 table• Data from just 1 municipality
• 1 file naming scheme
Files (comple
x)
File-dsa mappin
gs(comple
x)
Dsa tables
Dividing something into parts - files
Files (complex)
Preprocessor
(complex)
File-dsa mappings (simple)
Dsa tables
Dividing something into parts - Database1 data warehouse layer = 1 database
• Scaling• IO-pattern for a data flow
Dividing something into parts - Database1 database
• 4 LUNS, RAID 10• 4 datafiles
Dividing something into parts - DatabaseFilegroups
• PRIMARY • 160MB
• BIG ONE• A lot of data
Dividing something into parts - TablesTable partitioning (2005 EE feature)
• Pruning• Divide the data warehouse into 100 parts 1/100 the size
• Switching• Separation of readers and writers• Fast
Dividing something into parts - Tables
• DSA• Partition by (municipality_id,
month_year)
• EDW and data marts• Partition by municipality_id
Ask Kristian!
CREATE TABLE [dbo].[partition_test]( [Kommune_id] [varchar](11) NULL, [Kommunenummer] [nvarchar](4000) NULL, [Distrikt_kode] [nvarchar](4000) NULL, [Distrikt_type] [nvarchar](4000) NULL, [Distrikt_tekst] [nvarchar](4000) NULL, [Cpr_Dist_Tekst_TS] [nvarchar](4000) NULL) ON [partition_test_pt_sc]([Kommune_id])GO
Dear Mark! Please, please, please, pl
Dividing something into parts – CPU’s
• Affinity masking
• Not really possible in Oracle – even on Windows
State of being so divided
State of being so divided• Informatica PowerCenter (ETL vs.
Customer)
• Meta data • File definitions• Table definitions in all layers• Simple transformations
• Autogenerate mapping code from meta data• Go on – please change requirements
Mapping metadata model
MappingID (int)
DestinationColumnFK (int)SourceColumnFK (int)
Mapping
ColumnID (int)
ColumnName (varchar)
Column
DWLayerID (int)DWLayerName (varchar)
DWLayer
TransformationID (int)FlowFK (int)
Transformation
ColumnTypeFK (int) ColumnTypeID (int)ColumnTypeName (varchar)
ColumnType (source data)
TableFK (int)
TableTypeID (int)TableTypeName (varchar)
TableType
TransformationSchemaID (int)TransformationTypeID (int)
TransformationSchema
StandardColumnID (int)
StandardColumnName (varchar)
StandardColumn
DWLayerFK (int)
Logic (varchar)
TransformationTypeID (int)TransformationTypeName (varchar)
TransformationTypeTransformationSchemaName (varchar)
FlowID (int)Flow
FlowName (varchar)FlowTypeFK (int)
FlowTypeID (int)FlowType
FlowTypeName (varchar)
TransformationFK (int)
TransformationSchemaFK (int)Logic (varchar)
DataTypeFK (int)TableID (int)
TableName (varchar)
Table
DWLayerFK (int)TableTypeFK (int)
Order???
DataTypeID (int)DataTypeName (varchar)
DataType
State of being so divided• Developers vs. • (evil) DBA’s
• Meta data on table definitions• Script ddl from meta data• Hide partitioning in ddl• Partition Scheme• Change File group design
State of being so divided• Backups: DBA’s vs. Backup administrator
• Partitioning helps us divide data into • Hot• (C)old (and therefore read only)• Only backup hot partitions
State of being so divided• Restoring: DBA vs. SQL server • PRIMARY filegroup• 150 MB
• Default filegroup• Big
• Restore database • only primary filegroup => online
State of being so divided• DW vs OLTP and OLAP
• DW server• ETL• processing OLAP databases/cubes• Reporting server• Sharepoint databases (OLTP)• restoring OLAP databases/cubes
We also use these cool features…• Page compression• Backup compression
Page compression results
tableFactor (size) Num rows num reads
cpu time (ms)
elapsed time (ms)
dg1 1 89132 372 31 739dg1 2 178264 743 31 1517dg1 10 891320 3781 281 7551
dg1_page_compr 1 89132 143 15 748dg1_page_compr 2 178264 285 62 1512dg1_page_compr 10 891320 1427 421 7596
code vartype varvalDA000 DGALT 00K00DA000 DGCAT 06M38ADA000 DGPROP 26X01DA001 DGALT 00K00DA001 DGCAT 06M38ADA001 DGPROP 26X01DA009 DGALT 00K00DA009 DGCAT 06M38ADA009 DGPROP 26X01
or
EvaluationCreate a Text message on your phone and send it to 1919 with the content:
DB301 5 5 5 I liked it a lotSession Code
KenniePerformance (1 to 5)
Match of technical
Level(1 to 5)
Relevance(1 to 5) Comments
(optional)
Evaluation Scale: 1 = Very bad 2 = Bad 3 = Relevant 4 = Good 5 = Very Good!
Questions:• Speaker Performance• Relevance according to
your work • Match of technical level
according to published level• Comments
© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Thank you