Using concrete, real-world examples, the presenter will show the following: how abandoning modeling altogether is a recipe for disaster, even in (or especially in) NoSQL environments; how experienced relational modelers can leverage their skills for NoSQL projects; how the NoSQL context both simplifies and complicates the modeling endeavor; and how lessons learned modeling for NoSQL projects can make you a more effective modeler for any kind of project.
Data Modelers Save Their Careers: Surviving and Thriving with NoSQL
Joe Maguire
Data Quality Strategies, LLC http://www.DataQualityStrategies.com/
© 2013 Data Quality Strategies, LLC
Thesis
• Relational DBMSs have dominated,
• ...so relational modeling subsumed other forms, including conceptual modeling.
• As R-DBMS wanes, so does relational modeling, and sadly, whatever it subsumed.
• Conceptual modeling must be saved.
• Relational modelers can step in to save it...
• ...with some significant effort.
#Cassandra13 © 2013 Data Quality Strategies, LLC 2
My Perspective
• Over three decades in industry
• Career is a three-legged stool
  – Product development for software vendors
  – Solution design for enterprises
  – Author, Industry Analyst, Thought Leader
• Specialize in
  – Modeling
  – Requirements analysis
  – Data architecture
  – Data quality
Agenda
• History
• Current Events
• Your Future as a Data Modeler
• Q&A
A Big-Picture Framework
Meta-model
Data Perspective
Conceptual
• Entities
• Attributes
• Relationships
• Identifiers
Logical
• Tables
• Columns
• Primary and foreign keys
Physical
• Indexes
• Table spaces
• Vertical and horizontal partitioning
• Denormalizations
Good Ideas in the Framework
• Information Hiding
  – e.g., conceptual excludes implementation details
• The Type/Instance distinction
  – Models describe categories, data describes members
• Application/Data Independence
  – Data modeling is separate from process modeling
• User Requirements ≠ System Requirements
  – Users should not participate in logical and physical modeling
• Model-Driven Development
  – Forward and reverse engineering across model levels
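The three levels and the idea of forward engineering can be sketched in code. The following is a minimal illustration, not the presenter's method: the class names (`Entity`, `Relationship`, `Table`), the function `to_relational`, and the Customer/Order example are all hypothetical, and the mapping rule (copy the parent's identifier into the child as a foreign key) is only the most common textbook choice.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    """Conceptual level: named in user vocabulary, no implementation detail."""
    name: str
    attributes: List[str]
    identifier: List[str]  # attributes users rely on to tell instances apart


@dataclass
class Relationship:
    """Conceptual one-to-many relationship (hypothetical, simplified)."""
    name: str
    parent: str  # the "one" side
    child: str   # the "many" side


@dataclass
class Table:
    """Logical (relational) level: tables, columns, keys."""
    name: str
    columns: List[str]
    primary_key: List[str]
    foreign_keys: Dict[str, str] = field(default_factory=dict)  # column -> table


def to_relational(entities: List[Entity], relationships: List[Relationship]) -> List[Table]:
    """Forward-engineer conceptual entities into relational tables."""
    by_name = {e.name: e for e in entities}
    tables = {
        e.name: Table(e.name.lower(), list(e.attributes), list(e.identifier))
        for e in entities
    }
    for rel in relationships:
        parent = by_name[rel.parent]
        child_table = tables[rel.child]
        # A relationship becomes a foreign key only at the logical level.
        for id_attr in parent.identifier:
            fk_column = f"{parent.name.lower()}_{id_attr}"
            child_table.columns.append(fk_column)
            child_table.foreign_keys[fk_column] = parent.name.lower()
    return list(tables.values())


customer = Entity("Customer", ["name", "email"], ["email"])
order = Entity("Order", ["order_number", "placed_on"], ["order_number"])
places = Relationship("places", parent="Customer", child="Order")

logical = to_relational([customer, order], [places])
for table in logical:
    print(table)
```

Note that the conceptual classes carry no foreign keys, indexes, or data types; those appear only after the transformation, which is the information-hiding point made above.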
A Big-Picture Framework, distorted
Meta-model
Data Perspective
Relational
• Entities / Tables
• Attributes / Columns
• Relationships / FKs
• Identifiers / PKs
Physical
• Indexes
• Table spaces
• Vertical and horizontal partitioning
• Denormalizations
How the Distortion Happens
• Tool Vendors Dismiss Conceptual Modeling
  – Because their tools cannot support it anyway
• Information Management Specialists Confuse Models with Reality
  – E.g., believing the relational model suffices to describe the universe
• Institutionalized Expediency
  – We know about conceptual modeling, but to save time, we combine it with relational modeling...
  – ...then we formalize that into our dev processes...
  – ...and eventually, that becomes the “best practices.”
Distortions, Revisited
• Summary of Distortions:
  – Distortion: Conceptual means vague
  – Distortion: Logical implies relational
    • Rather than XML, OO, KV Store, Array Database, Graph Database
• Results of Distortions:
  – Two levels only: relational and physical
  – Relational modeling used for user requirements
Agenda
• History
• Current Events
• Your Future as a Data Modeler
• Q&A
Current Events: NoSQL
• The “Just Say No” Interpretation
Meta-model
Data Perspective
Logical Relational
• Entities / Tables
• Attributes / Columns
• Relationships / FKs
• Identifiers / PKs
Physical
NO LONGER RELATIONAL:
• Schemas Based on Big Table Implementations
• Alien DDL language
• Limited Support from Modeling Tools
Current Events: NoSQL
• The “Not Only SQL” Interpretation
  – Okay, so there might be some work for you
  – But you’re at risk of being marginalized
Agenda
• History
• Current Events
• Your Future as a Data Modeler
• Summary
• Q&A
Your Future as a Modeler
• Remaining Relevant
  – Selfishly: Saving your career
  – Nobly: Serving your client / company / customer
• What you can do:
  – Wait for relational projects
  – Become a NoSQL database designer
  – Help your client choose data platforms
• That starts with understanding the problems
  – which starts with CONCEPTUAL MODELING.
A New (?) Modeling Framework
• Conceptual Modeling
• Choosing a Logical Meta-model
• Logical Modeling
• Physical Modeling
• Tool Support?
Conceptual Modeling
• How behaviors and constructs compare to Relational Modeling:
  – Keep some
  – Discard some
  – Stress some
  – Change some
Conceptual Data Model Example
Keep Some
• Keep Entities
• Keep Attributes
• Keep Relationships
• Keep Identifiers
• Keep Maximum Cardinality of Relationships
Keep Entities
• Minimum Expressiveness
• Entities, Not Tables
  – Don’t express Horizontal or Vertical Partitioning for performance
    • But do express it if motivated by privacy/security/risk
• Entity names, not table names
  – Honor user vocabulary, not IT naming standards
Keep Attributes
• Honor User Phenomenon
  – Attributes are part of user discourse
• Attributes, not columns
  – Worry about scale (nominal, numeric, ordinal, Boolean, cyclic), not data type
  – Attribute names, not column names
• Support in-progress models
  – During which attributes can become entities
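One way to picture "scale, not data type" is to record each attribute's scale and ask which comparisons make sense to users, leaving the storage type undecided. This is a sketch under stated assumptions: the five scale names come from the slide, but the `MEANINGFUL` operations table, the `Attribute` class, and the example attributes are illustrative inventions.

```python
from dataclasses import dataclass
from enum import Enum


class Scale(Enum):
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    NUMERIC = "numeric"
    BOOLEAN = "boolean"
    CYCLIC = "cyclic"


# Assumption for illustration: which operations are meaningful on each scale.
MEANINGFUL = {
    Scale.NOMINAL: {"equals"},
    Scale.BOOLEAN: {"equals"},
    Scale.ORDINAL: {"equals", "orders"},
    Scale.CYCLIC: {"equals", "successor"},
    Scale.NUMERIC: {"equals", "orders", "arithmetic"},
}


@dataclass(frozen=True)
class Attribute:
    """Conceptual attribute: a user-facing name and a scale, no data type."""
    name: str
    scale: Scale


def supports(attribute: Attribute, operation: str) -> bool:
    """Return True if the operation is meaningful for the attribute's scale."""
    return operation in MEANINGFUL[attribute.scale]


shirt_size = Attribute("shirt size", Scale.ORDINAL)
day_of_week = Attribute("day of week", Scale.CYCLIC)

print(supports(shirt_size, "orders"))      # sizes can be ranked
print(supports(shirt_size, "arithmetic"))  # but not added together
print(supports(day_of_week, "orders"))     # a cycle has no first or last member
```

Both example attributes might end up stored as strings or small integers, which is exactly why the data type says less about them than the scale does.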
Keep Relationships
• Minimum Expressiveness
  – Attributes are part of user discourse
• Allow many-many and collection entities
  – If the latter seem strange, you’ve been in IT too long
• Relationships, not FKs
Keep Identifiers
• Identifiers, not PKs
  – IDs are not motivated by computerization, but by typography
  – IDs predate the information revolution
    • and the automotive revolution, for that matter
• Support in-process modeling
  – IDs help the modeler ferret out the homonym problem
Discard Some
• Discard Foreign Keys
  – They’re relational
• Discard Minimum Cardinality
  – A function of process or policy, not data
  – Over-reported by users
• Discard Most Constraints
  – A function of process or policy, not data
  – Are over-reported by users
Keep/Discard Rule of Thumb
• Keep
  – Anything that helps you and the users together discover and name the user categories
• Discard
  – Anything else
Conceptual Data Model Examples
Stress Some
• Stress Consistency Requirements
  – Relational modelers (of non-distributed databases) have not been asking about these.
• Stress Data Volume / Velocity Requirements
  – Can lead or force you to relax application-data independence
Change Some
• Change your process
  – From math-y normalization to English-y conversation with users
  – Very difficult to achieve rigor conversationally
• More help:
  – Mastering Data Modeling: A User-Driven Approach by Carlis & Maguire
  – DataStax Webinar: 25 June
A New Modeling Framework
• Conceptual Modeling
• Choosing a Logical Meta-Model
• Logical Modeling
• Physical Modeling
• Tool Support?
Choosing a Logical Meta-Model
• Don’t Assume Relational (Duh...)
• Don’t Assume Big Table
• Lots of Choices
  – Relational
  – Big Table
  – XML/Document Database
  – Graph database
  – Array database
  – ...
A New Modeling Framework
• Conceptual Modeling
• Choosing a Logical Meta-Model
• Logical Modeling
• Physical Modeling
• Tool Support?
Logical, Physical, and Tool Support
• Community needs to develop a roster of shapes
  – And the attendant transformations from conceptual shapes to Big-Table shapes
• During Logical Big-Table modeling, process requirements will infiltrate
  – including things like minimum cardinality
• Minimal support from modeling tools
  – Because few tools support conceptual modeling
  – Because vendors have not caught up to NoSQL yet
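As a hedged illustration of what one entry in such a roster might look like, the sketch below turns a conceptual one-to-many shape into a wide-row, Big-Table-style shape. Everything here is an assumption made for the example: the class names `OneToMany` and `WideRowShape`, the naming rule, and the choice to partition by the parent's identifier and cluster by the child's identifier. It is one plausible transformation, not an established catalogue entry.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OneToMany:
    """Conceptual shape: one parent instance relates to many child instances."""
    parent: str
    parent_identifier: List[str]
    child: str
    child_identifier: List[str]
    child_attributes: List[str]


@dataclass
class WideRowShape:
    """Logical Big-Table shape: one partition per parent, one row per child."""
    name: str
    partition_key: List[str]
    clustering_columns: List[str]
    value_columns: List[str]


def one_to_many_to_wide_row(rel: OneToMany) -> WideRowShape:
    """Transform a one-to-many shape, assuming children are read by parent."""
    parent = rel.parent.lower()
    return WideRowShape(
        name=f"{rel.child.lower()}s_by_{parent}",
        # Partitioning by the parent's identifier bakes in an access path.
        partition_key=[f"{parent}_{a}" for a in rel.parent_identifier],
        clustering_columns=list(rel.child_identifier),
        value_columns=[
            a for a in rel.child_attributes if a not in rel.child_identifier
        ],
    )


customer_orders = OneToMany(
    parent="Customer",
    parent_identifier=["email"],
    child="Order",
    child_identifier=["order_number"],
    child_attributes=["order_number", "placed_on", "total"],
)

shape = one_to_many_to_wide_row(customer_orders)
print(shape)
```

The transformation only works because it assumes a query ("orders for a given customer"), which shows how process requirements infiltrate logical Big-Table modeling in a way they do not at the conceptual level.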
Agenda
• History
• Current Events
• Your Future as a Data Modeler
• Summary
• Q&A
Summary
• Re-commit to conceptual modeling for requirements analysis
  – Some but not all relational-modeling skills will apply
  – Must learn to focus on user communication, not nerdy stuff like intermediate normal forms
Summary
• Remember the fundamentals, so that you can make informed decisions about relaxing them
  – Application-data independence
  – Consistency level as a user requirement
  – Declarative data retrieval (from information hiding)
• Additional benefits
  – Users will like you better
  – Agile developers will like you better
  – This framework works in traditional, all-SQL environments
Q&A
• [email protected] • www.DataQualityStrategies.com