Big Data, NoSQL & Data Modeling
10 Tips for Data Modeling Success on Modern Data Projects
Karen Lopez, InfoAdvisors
www.datamodel.com
©InfoAdvisors - infoadvisors.com
Data Models – Traditional Process
Conceptual (Data) Model
Logical Data Model
Physical Data Model(s): multiple OLTP and MART databases
Aug 2014
Relational
Data Models started with relational
modeling, so they look like relational
database structures.
But….
That doesn’t mean they can’t be used to model data that goes into a non-relational format.
All that formatting happens at build OR consumption time, not requirements time.
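As a rough illustration of that point, the sketch below keeps one logical definition of an entity and only decides on a relational or document format at build time. The `Entity` and `Attribute` classes, the `Customer` example, and the type mapping are all hypothetical and are not taken from the slides.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    name: str
    logical_type: str  # a logical type such as "text" or "integer"


@dataclass
class Entity:
    name: str
    attributes: List[Attribute] = field(default_factory=list)


# Hypothetical mapping from logical types to one SQL dialect's types.
SQL_TYPES = {"text": "VARCHAR(100)", "integer": "INT"}


def to_sql_ddl(entity: Entity) -> str:
    """Render the entity as a relational CREATE TABLE statement."""
    columns = ",\n".join(
        f"  {a.name} {SQL_TYPES[a.logical_type]}" for a in entity.attributes
    )
    return f"CREATE TABLE {entity.name} (\n{columns}\n);"


def to_json_document(entity: Entity, values: dict) -> str:
    """Render one instance of the same entity as a JSON document."""
    doc = {a.name: values.get(a.name) for a in entity.attributes}
    return json.dumps(doc, sort_keys=True)


# The model is written once, at requirements time.
customer = Entity(
    "Customer",
    [Attribute("customer_id", "integer"), Attribute("full_name", "text")],
)

# The target format is chosen later, at build or consumption time.
ddl = to_sql_ddl(customer)
document = to_json_document(customer, {"customer_id": 1, "full_name": "Ada"})
```

The same `customer` definition feeds both outputs, so the requirements captured in the model do not depend on the storage technology chosen later.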
The Big Data Story
Lots of data
Coming at us fast
Lots of variety in format & quality
We want all the data
Highly available
“It’s web scale”
What do we really mean by scale?
Bringing computing to the data
Massively parallel processing
Cheap, commodity hardware, but lots of it
Optimized for Query/Reads/Questions/Telling stories
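A minimal sketch of what "bringing computing to the data" can look like: each partition is summarized where it lives, and only the small summaries are combined. The partitions and event names are made up, and a thread pool stands in for a cluster of commodity machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data that is already split across three "nodes".
partitions = [
    ["click", "view", "click"],
    ["view", "view", "purchase"],
    ["click", "purchase"],
]


def count_locally(events):
    """Map step: each node summarizes only its own partition."""
    return Counter(events)


def merge(partials):
    """Reduce step: combine the small per-node summaries."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# One worker per partition, standing in for one machine per data block.
with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
    partial_counts = list(pool.map(count_locally, partitions))

totals = merge(partial_counts)
```

Only the per-partition counters move between workers; the raw events never leave the partition that holds them.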
We’ve been down this road before…
Traditional transactional applications
Reporting-optimized tables/structures
Data Warehouse / Dimensional Modeling
(Spectrum: Highly normalized to Highly denormalized)
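The two ends of that spectrum can be sketched with plain Python structures. The tables, keys, and values below are hypothetical; the point is only that the normalized shape stores each fact once, while the denormalized shape repeats it so that reads need no joins.

```python
# Normalized (transactional) shape: each fact is stored once, linked by keys.
customers = {1: {"name": "Ada", "region_id": 10}}
regions = {10: {"region_name": "West"}}
orders = [
    {"order_id": 100, "customer_id": 1, "amount": 25.0},
    {"order_id": 101, "customer_id": 1, "amount": 40.0},
]


def denormalize(orders, customers, regions):
    """Copy the looked-up values into each order row, as an ETL step might."""
    rows = []
    for order in orders:
        customer = customers[order["customer_id"]]
        region = regions[customer["region_id"]]
        rows.append({
            "order_id": order["order_id"],
            "customer_name": customer["name"],
            "region_name": region["region_name"],
            "amount": order["amount"],
        })
    return rows


# Denormalized (reporting) shape: wide rows with repeated values.
report_rows = denormalize(orders, customers, regions)

# A reporting question can now be answered without any joins.
sales_by_region = {}
for row in report_rows:
    region_name = row["region_name"]
    sales_by_region[region_name] = sales_by_region.get(region_name, 0.0) + row["amount"]
```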
Classic DW Architecture
[Diagram components: OLTP DBs, External Data, ETL, Staging/ETL DB, EDW, Data Marts; On Premises]
Modern DW Architecture
[Diagram components: OLTP DBs, External Data, Hadoop, ETL, Distributed Processing (MapReduce), Distributed Storage (HDFS), Distributed Storage (Blob Storage), EDW, staging, Analytics Mart, Data Mart; Cloud And/Or On Prem]
NoSQL, Not Only SQL
Relational
Graph
Columnar/Column Family
Key Value
Document Databases
Others
Sample Hive Statement
-- External table: Hive reads the delimited text files in place.
CREATE EXTERNAL TABLE TaxRebateUsage (
  state string,
  zipcode string,
  agi_class int,
  n1 int,
  mars2 int,
  prep int,
  n2 int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
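To make the effect of that statement concrete, here is a small sketch of what reading comma-delimited text through a declared column list amounts to. The column names and types are copied from the statement above; the sample rows are invented, and the plain `split(",")` parsing is a simplification, not Hive's actual reader.

```python
def to_int(value):
    """Cast to int, returning None when the text is not a valid integer."""
    try:
        return int(value)
    except ValueError:
        return None


# Column names and types copied from the CREATE EXTERNAL TABLE statement.
COLUMNS = [
    ("state", str), ("zipcode", str), ("agi_class", to_int),
    ("n1", to_int), ("mars2", to_int), ("prep", to_int), ("n2", to_int),
]

# Hypothetical file contents; the values are made up for illustration.
raw_text = "AL,35004,1,1530,220,680,2140\nAL,35004,2,1190,330,560,2310\n"


def read_rows(text):
    """Apply the declared columns to each comma-delimited line at read time."""
    for line in text.splitlines():
        fields = line.split(",")
        yield {name: cast(value) for (name, cast), value in zip(COLUMNS, fields)}


rows = list(read_rows(raw_text))
```

The file itself carries no types. `zipcode` stays a string, leading zeros included, only because the column list says so at read time.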
Sample JSON/MongoDB Notation
Sample FoundationDB Statement
Sample Cassandra Statement
Sample Vertica Statement
Sample Neo4j Statement
Those weren’t SCHEMALESS….
They had data facts, which had meanings. And sometimes expected formats, precisions, and types.
In the NoSQL world, we don’t necessarily apply those at write time, but at read time.
SCHEMALESS really is MULTIPLE SCHEMAs (Polyschematic) or VARYING SCHEMAs.
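A small sketch of the "polyschematic" idea: records are written in whatever shape the producer used, and one schema is applied when they are read. The records, field names, and `READ_SCHEMA` mapping are hypothetical.

```python
import json

# Hypothetical raw records, written without any enforced schema.
raw_records = [
    '{"name": "Ada", "zip": "98101"}',
    '{"full_name": "Grace", "postal_code": 10001}',
]

# One read-time schema: target field -> (accepted source names, type).
READ_SCHEMA = {
    "name": (("name", "full_name"), str),
    "postal_code": (("postal_code", "zip"), str),
}


def apply_schema(raw):
    """Interpret one stored record through the read-time schema."""
    doc = json.loads(raw)
    result = {}
    for target, (sources, cast) in READ_SCHEMA.items():
        # Take the first source name that this record actually uses.
        value = next((doc[s] for s in sources if s in doc), None)
        result[target] = cast(value) if value is not None else None
    return result


customers = [apply_schema(record) for record in raw_records]
```

Both stored shapes carry the same data facts. The meaning, names, and types still exist; they are simply enforced by the reader rather than the writer.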
The Big Data Big Lies
Schemaless
• Schema on Read, not Schema on Write
• Polyschematic
Big
• New data stories
• New technologies
• Not just volume
10 Tips For Modeling in a Hybrid World
1. Models require a modeler
2. Data modeling tools are essential
3. There are many types of data models: know which ones you need
4. Modeling does not have to happen at the same time in every project. It should happen at the right time
5. Modeling is not just schema design. Think outside the boxes and lines
10 Tips for Modeling in a Hybrid World
6. A data model is much more than a diagram
7. You will need training.
8. Team members may not understand modeling. They will need training
9. NoSQL is not one thing. Learn many patterns
10. Modern data architectures are likely hybrid solutions. You can’t just support one part.
What does this mean for data modelers?
There will be jobs for traditional, ERD, relational modelers…
…just like there are still jobs for RPG and COBOL programmers.
All data has a data story. Many data stories.
A good modeler is an architect at heart, finding the right solution for the data story.
Business Intelligence Journal
Look for the September 2014 issue article on Modern Data Architectures
Thank You!
www.infoadvisors.com
www.datamodel.com
www.dataversity.net
community.embarcadero.com
#TEAMDATA
Follow me!