Upload
newmedio
View
5.656
Download
1
Tags:
Embed Size (px)
DESCRIPTION
A quick introduction to the theory and practice of database design in the real world.
Citation preview
Theory Meets Reality
Crack-SmokingData Models
Rules of Normalization(Simplified)
Make sure every table has a primary key (best to have a single value)
1NF - Put any repeating groups into their own table
Don’t have a column with a comma-separated list of values or multiple columns to represent the same value multiple times
2NF/3NF - Put any information that is not dependent on the primary key in its own table
If you have a table of employees (id, position, salary, name) and their salary is fully determined by their position, move that into a separate table of position/salary.
4NF/5NF - Make sure all three-way and above joins are indeed valid for 3-way, and don’t need further separation.
Rules of Normalization(Simplified)
ORNF - The data model contains only elemental facts
DKNF - The data model fully defines all constraints and is free from “update anomalies” (the data model prevents any logical inconsistencies)
Key - uniquely defines a tuple
Constraint - rule governing values of attributes
In DKNF keys fully define tuples, and constraints fully define logical relationships / allowed values, including multi-table constraints
Rules of Normalization(Simplified)
JBNF - Don’t Smoke Crack While Doing Data Models!
JBNF2 - Don’t Do Data Models For Customers Who Smoke Crack!
Okay, I break this one a lot
But seriously, most business rules/decisions are not based on whether or not it contributes to a logical data model
And why should it?
On the other end, when making a data model, we need to realize the flexibility that businesses require, so they don’t have to re-make the data model after every business decision
Nor should they have to think too much about the data model when making business decisions
The Goal
Normalized Data Models
Everything can be managed via standard operations
No Redundant Data / No Calculated Columns
Strong locality -- don’t have to worry if you forgot to set something
Strict Constraints Enforced
Pushes the business logic back to the data model so it can be easily managed outside of the application logic - no “update anomalies”
Data is modeled as data - fewer text fields
All data can be managed and verified
The Reality
No system can contain all data points
This by itself leads to conflicts with theory
Not all information is available
Summary columns have to be managed which summarizes data inside and outside the database
Some business rules are too complicated/flexible to be modeled, and must be abbreviated with flags and/or text fields
The Reality
System performance demands redundant data
Many summaries are too complicated to be recalculated each time
Some of this can be mitigated with functional indices
Some summaries may need to be altered based on data exterior to the database
The Reality
Some data is best stored de-normalized
Management Issues- Do we really need a table for that?
Performance Issues- Do we really want to query for that?
Time Issues - Do we really want to build the interface to manage that?
In some cases, maybe we can de-normalize our database to save some sanity.
The Reality
Some data models look whacked, because the data they are modeling is whacked.
In an ideal world, we would encourage the customer to come up with a more consistent way to manage themselves.
But usually we just model what they have because it’s easier than changing 20 years of tradition and infrastructure
Case Study - Homebuilder
Builder has several divisions, each division is responsible for an area (kind of - some are on top of each other)
Builder has several brands
Builder categorizes houses and communities by lifestyle
Builder also needs to track home plans and inventory
JB’s Rules of Practicalization
1PF - Design a well-normalized database (the level is up to you) which describes their data as you understand it.
2PF - If they way that the customer talks about their data is inconsistent, develop a vocabulary to use when talking to them about their project which matches the data model. Be sure to clarify any unclear statements they make using the new vocabulary.
JB’s Rules of Practicalization
3PF - Determine which business rules are too fungible to be implemented well by the database, and instead make manual processes for dealing with exceptions using flags and text fields
4PF - Determine if some values may have exceptions based on incomplete information in the database, and create user-maintainable columns or tables for them
JB’s Rules for Practicalization
5PF - Rejigger your data model so that it matches your development platform nicely.
6PF - Create calculated columns based on real or anticipated performance problems. Be sure there are application-level measures taken to keep these mostly consistent.
7PF - ?