Data Modelers Save Their Careers: Surviving and Thriving with NoSQL. Joe Maguire, Data Quality Strategies, LLC, http://www.DataQualityStrategies.com/ © 2013 Data Quality Strategies, LLC

C* Summit 2013: Data Modelers Still Have Jobs - Adjusting For the NoSQL Environment by Joe Maguire


DESCRIPTION

Using concrete, real-world examples, the presenter will show the following: how abandoning modeling altogether is a recipe for disaster, even in (or especially in) NoSQL environments; how experienced relational modelers can leverage their skills for NoSQL projects; how the NoSQL context both simplifies and complicates the modeling endeavor; and how lessons learned modeling for NoSQL projects can make you a more effective modeler for any kind of project.


Page 1: C* Summit 2013: Data Modelers Still Have Jobs - Adjusting For the NoSQL Environment by Joe Maguire

Data Modelers Save Their Careers: Surviving and Thriving with NoSQL

Joe Maguire

Data Quality Strategies, LLC
http://www.DataQualityStrategies.com/

© 2013 Data Quality Strategies, LLC


Thesis

• Relational DBMSs have dominated,
• ...so relational modeling subsumed other forms, including conceptual modeling.
• As R-DBMS wanes, so does relational modeling, and sadly, whatever it subsumed.
• Conceptual modeling must be saved.
• Relational modelers can step in to save it...
• ...with some significant effort.

#Cassandra13   ©  2013  Data  Quality  Strategies,  LLC   2  


My Perspective

• Over three decades in industry
• Career is a three-legged stool
  – Product development for software vendors
  – Solution design for enterprises
  – Author, Industry Analyst, Thought Leader
• Specialize in
  – Modeling
  – Requirements analysis
  – Data architecture
  – Data quality

•  [email protected]    


Agenda

• History
• Current Events
• Your Future as a Data Modeler
• Q&A


A Big-Picture Framework

Meta-model

Data Perspective

Conceptual
• Entities
• Attributes
• Relationships
• Identifiers

Logical
• Tables
• Columns
• Primary and foreign keys

Physical
• Indexes
• Table spaces
• Vertical and horizontal partitioning
• Denormalizations
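
A minimal sketch of the difference between the conceptual and logical levels of this framework, assuming Python dataclasses. The names `Entity`, `Relationship`, and `to_relational`, and the Customer/Order example, are hypothetical illustrations and do not come from the talk. The point it tries to show: entities, attributes, relationships, and identifiers are stated without tables or keys, and foreign keys only appear once a relational logical model is derived.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entity:
    """Conceptual level: a user category, named in user vocabulary."""
    name: str
    attributes: List[str]
    identifier: List[str]  # attributes users rely on to tell instances apart


@dataclass
class Relationship:
    """Conceptual level: a one-to-many relationship between two entities."""
    name: str
    one: str   # entity name on the "one" side
    many: str  # entity name on the "many" side


def to_relational(entities: List[Entity],
                  relationships: List[Relationship]) -> Dict[str, dict]:
    """Derive a logical, relational model: tables, columns, PKs, and FKs."""
    by_name = {e.name: e for e in entities}
    tables = {
        e.name: {"columns": list(e.attributes), "pk": list(e.identifier), "fks": {}}
        for e in entities
    }
    for rel in relationships:
        parent = by_name[rel.one]
        child = tables[rel.many]
        # Foreign keys are introduced only here, at the logical level.
        for id_attr in parent.identifier:
            fk_column = f"{parent.name.lower()}_{id_attr}"
            child["columns"].append(fk_column)
            child["fks"][fk_column] = (parent.name, id_attr)
    return tables


# Hypothetical example model (an assumption, not from the slides).
customer = Entity("Customer", ["number", "name"], ["number"])
order = Entity("Order", ["order_number", "placed_on"], ["order_number"])
places = Relationship("places", one="Customer", many="Order")

logical = to_relational([customer, order], [places])
print(logical["Order"]["columns"])
```

The conceptual objects carry no tables, column types, or foreign keys, so the same `Entity` and `Relationship` values could in principle feed a different derivation for a non-relational logical meta-model.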


Good Ideas in the Framework

• Information Hiding
  – e.g., conceptual excludes implementation details
• The Type/Instance distinction
  – Models describe categories, data describes members
• Application/Data Independence
  – Data modeling is separate from process modeling
• User Requirements ≠ System Requirements
  – Users should not participate in logical and physical modeling
• Model-Driven Development
  – Forward and reverse engineering across model levels


A Big-Picture Framework, distorted

Meta-model

Data Perspective

Relational
• Entities / Tables
• Attributes / Columns
• Relationships / FKs
• Identifiers / PKs

Physical
• Indexes
• Table spaces
• Vertical and horizontal partitioning
• Denormalizations


How the Distortion Happens

• Tool Vendors Dismiss Conceptual Modeling
  – Because their tools cannot support it anyway
• Info Mgmt Specialists Confuse Models with Reality
  – E.g., believing the relational model suffices to describe the universe
• Institutionalized Expediency
  – We know about conceptual modeling, but to save time, we combine it with relational modeling...
  – ...then we formalize that into our dev processes...
  – ...and eventually, that becomes the “best practices.”


Distortions, Revisited

• Summary of Distortions:
  – Distortion: Conceptual means vague
  – Distortion: Logical implies relational
    • Rather than XML, OO, KV Store, Array Database, Graph Database
• Results of Distortions:
  – Two levels only: relational and physical
  – Relational modeling used for user requirements


Agenda

• History
• Current Events
• Your Future as a Data Modeler
• Q&A


Current Events: NoSQL

• The “Just Say No” Interpretation

Meta-model

Data Perspective

Logical Relational
• Entities / Tables
• Attributes / Columns
• Relationships / FKs
• Identifiers / PKs

Physical
NO LONGER RELATIONAL:
• Schemas Based on Big Table Implementations
• Alien DDL language
• Limited Support from Modeling Tools


Current Events: NoSQL

• The “Not Only SQL” Interpretation
  – Okay, so there might be some work for you
  – But you’re at risk of being marginalized

 


Agenda

• History
• Current Events
• Your Future as a Data Modeler
• Summary
• Q&A


Your Future as a Modeler

• Remaining Relevant
  – Selfishly: Saving your career
  – Nobly: Serving your client / company / customer
• What you can do:
  – Wait for relational projects
  – Become a NoSQL database designer
  – Help your client choose data platforms
• That starts with understanding the problems
  – which starts with CONCEPTUAL MODELING.

 


A New (?) Modeling Framework

• Conceptual Modeling
• Choosing a Logical Meta-model
• Logical Modeling
• Physical Modeling
• Tool Support?


Conceptual Modeling

• How behaviors and constructs compare to Relational Modeling:
  – Keep some
  – Discard some
  – Stress some
  – Change some


Conceptual Data Model Example


Keep Some

• Keep Entities
• Keep Attributes
• Keep Relationships
• Keep Identifiers
• Keep Maximum Cardinality of Relationships


Keep Entities

• Minimum Expressiveness
• Entities, Not Tables
  – Don’t express Horizontal or Vertical Partitioning for performance
    • But yes if motivated by privacy/security/risk
• Entity names, not table names
  – Honor user vocabulary, not IT naming standards


Keep Attributes

• Honor User Phenomenon
  – Attributes are part of user discourse
• Attributes, not columns
  – Worry about scale (nominal, numeric, ordinal, Boolean, cyclic), not data type
  – Attribute names, not column names
• Support in-progress models
  – During which attributes can become entities
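
The scale-versus-data-type point can be sketched in Python. The mapping from each scale to the questions that make sense for it is an illustrative assumption (loosely following common levels-of-measurement reasoning), not something stated in the talk. `Scale`, `MEANINGFUL_OPERATIONS`, and `question_makes_sense` are hypothetical names.

```python
from enum import Enum


class Scale(Enum):
    """The attribute scales listed on the slide."""
    NOMINAL = "nominal"
    NUMERIC = "numeric"
    ORDINAL = "ordinal"
    BOOLEAN = "boolean"
    CYCLIC = "cyclic"


# Assumed mapping: which kinds of user questions are meaningful per scale.
MEANINGFUL_OPERATIONS = {
    Scale.NOMINAL: {"equals"},
    Scale.BOOLEAN: {"equals"},
    Scale.ORDINAL: {"equals", "orders"},
    Scale.CYCLIC: {"equals", "next"},
    Scale.NUMERIC: {"equals", "orders", "arithmetic"},
}


def question_makes_sense(scale: Scale, operation: str) -> bool:
    """Return True if the operation is meaningful for an attribute of this scale."""
    return operation in MEANINGFUL_OPERATIONS[scale]


# "Day of week" is cyclic: "what comes next?" is meaningful,
# but "is Monday less than Sunday?" is not.
print(question_makes_sense(Scale.CYCLIC, "next"))
print(question_makes_sense(Scale.CYCLIC, "orders"))
```

Nothing here says whether day of week is stored as an integer, a string, or an enumerated type; that choice belongs to the logical and physical levels.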


Keep Relationships

• Minimum Expressiveness
  – Attributes are part of user discourse
• Allow many-many and collection entities
  – If the latter seem strange, you’ve been in IT too long
• Relationships, not FKs


Keep Identifiers

• Identifiers, not PKs
  – IDs are not motivated by computerization, but by typography
  – IDs predate the information revolution
    • and the automotive revolution, for that matter
• Support in-process modeling
  – IDs help the modeler ferret out the homonym problem


Discard Some

• Discard Foreign Keys
  – They’re relational
• Discard Minimum Cardinality
  – A function of process or policy, not data
  – Over-reported by users
• Discard Most Constraints
  – A function of process or policy, not data
  – Are over-reported by users


Keep/Discard Rule of Thumb

• Keep
  – Anything that helps you and the users together discover and name the user categories
• Discard
  – Anything else


Conceptual Data Model Examples


Stress Some

• Stress Consistency Requirements
  – Relational modelers (of non-distributed databases) have not been asking about these.
• Stress Data Volume / Velocity Requirements
  – Can lead or force you to relax application-data independence

 


Change Some

• Change your process
  – From math-y normalization to English-y conversation with users
  – Very difficult to achieve rigor conversationally
• More help:
  – Mastering Data Modeling: A User-Driven Approach by Carlis & Maguire
  – DataStax Webinar: 25 June


A New Modeling Framework

• Conceptual Modeling
• Choosing a Logical Meta-Model
• Logical Modeling
• Physical Modeling
• Tool Support?


Choosing a Logical Meta-Model

• Don’t Assume Relational (Duh...)
• Don’t Assume Big Table
• Lots of Choices
  – Relational
  – Big Table
  – XML/Document Database
  – Graph database
  – Array database
  – ...


A New Modeling Framework

• Conceptual Modeling
• Choosing a Logical Meta-Model
• Logical Modeling
• Physical Modeling
• Tool Support?


Logical, Physical, and Tool Support

• Community needs to develop a roster of shapes
  – And the attendant transformations from conceptual shapes to Big-Table shapes
• During Logical Big-Table modeling, process requirements will infiltrate
  – including things like minimum cardinality
• Minimal support from modeling tools
  – Because few tools support conceptual modeling
  – Because vendors have not caught up to NoSQL yet
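
One possible entry in such a roster of shapes, sketched in Python under stated assumptions: a conceptual one-to-many relationship becomes a single Big-Table-style table in which the parent's identifier is the partition key and the child's identifier is a clustering column. The function `one_to_many_as_wide_row`, the Customer/Order names, and the blanket `text` column type are hypothetical placeholders, not part of the talk. The emitted statement follows CQL's `PRIMARY KEY ((partition), clustering)` form.

```python
from typing import List


def one_to_many_as_wide_row(
    table: str,
    parent_identifier: List[str],
    child_identifier: List[str],
    child_attributes: List[str],
) -> str:
    """Emit CQL for a table that stores each parent's children in one partition."""
    columns = parent_identifier + child_identifier + child_attributes
    # Every column is typed `text` as a placeholder; real types are a later decision.
    column_defs = ", ".join(f"{col} text" for col in columns)
    partition_key = ", ".join(parent_identifier)
    clustering = ", ".join(child_identifier)
    return (
        f"CREATE TABLE {table} ({column_defs}, "
        f"PRIMARY KEY (({partition_key}), {clustering}));"
    )


# Hypothetical conceptual shape: one Customer places many Orders.
cql = one_to_many_as_wide_row(
    table="orders_by_customer",
    parent_identifier=["customer_number"],
    child_identifier=["order_number"],
    child_attributes=["placed_on"],
)
print(cql)
```

The table name `orders_by_customer` already encodes a query ("find the orders for a customer"), which is one way process requirements infiltrate the logical model.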


Agenda

• History
• Current Events
• Your Future as a Data Modeler
• Summary
• Q&A


Summary

• Re-commit to conceptual modeling for requirements analysis
  – Some but not all relational-modeling skills will apply
  – Must learn to focus on user communication, not nerdy stuff like intermediate normal forms

 


Summary

• Remember the fundamentals, so that you can make informed decisions about relaxing them
  – Application-data independence
  – Consistency level as a user requirement
  – Declarative data retrieval (from information hiding)
• Additional benefits
  – Users will like you better
  – Agile developers will like you better
  – This framework works in traditional, all-SQL environments


Q&A

• [email protected]
• www.DataQualityStrategies.com
