Upload
theodore-carroll
View
222
Download
2
Embed Size (px)
Citation preview
Data Modeling
Man is a knot, a web, a mesh into which relationships are
tied. Only those relationships matter
Saint-Exupéry
Data modeling
A technique for modeling dataA graphical representation of a databaseThe goal is to identify the facts to be stored in the databaseData modeling is a partnership between the client and analyst
Modeling Scope Model Technology
Motivation
why Goals Business plan Groupware
People who Business units
Organization chart Systems interface
Time when Key events PERT chart Scheduling
Data what Key entities Data model Relational database
Function how Key processes
Process model Application software
Network where Locations Logistics network System architecture
The building blocks
EntityAttributeRelationshipIdentifier
Data model quality
A well-formed data modelA high fidelity image
A well-formed data model
Construction rules obeyedNo ambiguity
All entities, attributes, relationships, and identifiers are definedNames are meaningful to the client
A high fidelity image
Faithfully describes the world it is supposed to representRelationships are of the correct degreeData model is complete, understandable, and accurateThe data model makes sense to the client
Quality improvement
Is the level of detail correct?Are all exceptions handled?Is the model accurate?
Pure geography
NATION
*natname
natpop
natarea
REGION
*regname
regtype
regpop
regarea
CITY
*cityname
citypop
cityarea
Geography revised
NATION
*natnamenatpopnatarea
ADMIN UNIT
*unitnameunittypeunitpopunitarea
CITY
*citynamecitypopcityarea
natcaptype
ADMIN UNIT-CITY
unitcaptype
National capital
Family matters - take 1
husband/wife
MAN
*manidmandob
manfnamemanonamemanlname
WOMAN
*womanidwomandob
womanfnamewomanonamewomanlname
Family matters - take 2
spousePERSON
*iddob
fnameonamelnamegender
Family matters - take 3
PERSON
*iddob
fnameonamelnamegender
MARRIAGE
*begindateenddate
Family matters - take 4
PERSON
*iddob
fnameonamelnamegender
MARRIAGE
*marriagenomarriagestatus
begindateenddate
Family matters - take 5
MARRIAGE
*marriagenomarriagestatus
begindateenddate
PERSON
*iddob
fnameonamelnamegender
child
Bookish matters
BORROWER
*borrowerid…
LIBRARY
*libname…
BOOK
*callnoISBN
booktitleduedate
…
History - take 1
DIVISION
*divname…
DEPARTMENT
*deptname…
EMPLOYEE
*empid…
History - take 2
DIVISION
*divname…
DEPARTMENT
*deptname…
EMPLOYEE
*empid…
POSITION
*begdateenddateposttitle
History - take 3
DIVISION
*divname…
DEPARTMENT
*deptname…
EMPLOYEE
*empid…
POSITION
*begdateenddateposttitle
PAYSLIP
*paydate
PAYSLIPLINE
*pslnopslamount
psltype
History - take 4
DIVISION
*divname…
DEPARTMENT
*deptname…
EMPLOYEE
*empid…
POSITION
*begdateenddateposttitle
PAYSLIP
*paydate
PAYSLIPLINE
*pslnopslamount
PSLTEXT
*psltypepslmessage
A ménage à trois for entities - take 1
AIRCRAFT-AGENT
AGENT
*agentid…
AIRLINE-AIRCRAFT
AIRLINE
*airlinename…
AGENT-AIRLINE
AIRCRAFT
*aircraftcode…
A ménage à trois for entities - take 2
AIRCRAFT
*aircraftcode…
AIRLINE
*airlinename…
AGENT
*agentid…
LEASE
*startdate…
Planning and doing - take 1
EMPLOYEE
*empno…
ACTIVITY
planned hours
DAILY WORK
*workdateactual hours
PROJECT
*projectid…
Planning and doing - take 2
EMPLOYEE
*empno…
ACTIVITY
DAILY WORK
*workdateplanned hoursactual hours
PROJECT
*projectid…
Entity types
IndependentDependentAssociativeAggregateSubordinate
Independent
Often a starting pointProminent in the client's mindOften related to other independent entities
STOCK
*stock codefirm namestock price
stock quantitystock dividend
stock PE
NATION
*nation codenation name
exchange rate
Dependent
Relies on another entity for its existence and identificationCan become independent if given an arbitrary identifier
REGION
*regnameregtyperegpopregarea
…
CITY
*citynamecitypopcityarea
…
Associative
A by-product of an m:m relationshipTypically between independent entitiesCan store current or historical dataCan become independent if given an arbitrary identifierDIVISION
*divname…
DEPARTMENT
*deptname…
EMPLOYEE
*empid…
POSITION
*begdateenddateposttitle
Aggregate
Created from several different entities that have a common prefix or suffixCommonly used with addresses or names
Subordinate
An entity with data that can vary among instances
ANIMAL
*animalid…
SHEEP
fleeceweight…
HORSE
furlong speed…
Generalization
A relationship between a more general element and a more specific element
ANIMAL
*animalidanimaldob
…
SHEEP
fleeceweight…
HORSE
furlongspeed…
GeneralizationMap with one table for each entityFor each of the subtype entities the primary key is that of the supertype entity You must also make this column a foreign key so that a subtype cannot be inserted without the presence of the matching supertype
Aggregation
Aggregation is a part-whole relationship between two entities
Herd Cow
consists of
*Herd Cow
Shared aggregation
One entity owns another entity, but other entities can own that entity as well
Class Student
members
** Class Enrollment Student
Composite aggregation
One entity exclusively owns the other entity
Student Aptitude test
taken
*Student Aptitude test
Data model contraction
AUDIOLIBRARY
…
LP
…
TAPE
…
CD
…
AUDIOLIBRARY
…
AUDIORECORDING
…audio type
Hints on data modeling
The model will expand and contractInvent identifiers where necessaryIdentifiers should have only one purpose – identificationA data model does not imply orderingCreate an attribute if ordering of instances is requiredAn attribute’s meaning must be consistent
Names and addresses
The query testIf an attribute has parts, are any of the parts ever likely to appear in a query?
Have an understanding on representing names and addresses in a data model
STUDENT
*studentid...
ADDRESS
*address typecitystate
post codenation
ADDRESS LINE
*address rankaddress line text
Hints on data modeling
Single instance entities are OKSelect names carefullySynonyms—different words have the same meaning
Get clients to settle on a common word or use views
Homonyms—same word has different meaningsClarify to avoid confusion
Naming associative entitiesConcatenate entity names if there is no obvious real world name
Hints on data modeling
Uncover all exceptionsLabel relationships to avoid ambiguityKeep the data model well-formed and accurate
Meaningful identifiers
An identifier is meaningful when some attributes of the entity can be inferred from the identifier’s value
Advantages Disadvantages
Recognizable and rememberable
Identifier exhaustion
Administrative simplicity Reality changes
Loss of meaningfulness
RecommendationNothing, however, is lost and much is gained by using non-meaningful identifiersNon-meaningful identifiers serve their sole purpose well
To uniquely identify an entity
Attributes are used to describe the characteristics of the entityA clear distinction between the role of identifiers and attributes creates fewer data management problems
The seven habits of highly effective data modelers
ImmerseChallengeGeneralizeTestLimitIntegrateComplete