Theory Meets Reality: Crack-Smoking Data Models

DESCRIPTION

A quick introduction to the theory and practice of database design in the real world.

Page 1: Crack Smoking Data Models

Theory Meets Reality

Crack-Smoking Data Models

Page 2: Crack Smoking Data Models

Rules of Normalization (Simplified)

Make sure every table has a primary key (best to have a single value)

1NF - Put any repeating groups into their own table

Don’t have a column with a comma-separated list of values or multiple columns to represent the same value multiple times
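
As a minimal sketch of the 1NF rule above, assume a hypothetical source where each person carries a comma-separated phone list. The table and column names (person, person_phone) are invented for illustration; the point is only that each repeated value becomes its own row in a child table.

```python
import sqlite3

# Hypothetical un-normalized input: one comma-separated phone list per person.
raw_people = [
    (1, "Ann", "555-0100,555-0101"),
    (2, "Bob", "555-0199"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE person_phone ("
    " person_id INTEGER NOT NULL REFERENCES person(id),"
    " phone TEXT NOT NULL,"
    " PRIMARY KEY (person_id, phone))"
)

for person_id, name, phones in raw_people:
    conn.execute("INSERT INTO person (id, name) VALUES (?, ?)", (person_id, name))
    # Each value in the repeating group becomes one row in the child table.
    for phone in phones.split(","):
        conn.execute(
            "INSERT INTO person_phone (person_id, phone) VALUES (?, ?)",
            (person_id, phone.strip()),
        )

phone_rows = conn.execute(
    "SELECT person_id, phone FROM person_phone ORDER BY person_id, phone"
).fetchall()
```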

2NF/3NF - Put any information that is not dependent on the primary key in its own table

If you have a table of employees (id, position, salary, name) and their salary is fully determined by their position, move that into a separate table of position/salary.
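
The employee example above might look like the following sketch. The table names (position_salary, employee) and the sample values are assumptions made for illustration. Once salary lives with the position, a raise is a single-row update and no employee row can disagree with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE position_salary (
    title  TEXT PRIMARY KEY,
    salary INTEGER NOT NULL
);
CREATE TABLE employee (
    id             INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    position_title TEXT NOT NULL REFERENCES position_salary(title)
);
INSERT INTO position_salary VALUES ('Engineer', 90000), ('Manager', 110000);
INSERT INTO employee VALUES (1, 'Ann', 'Engineer'), (2, 'Bob', 'Engineer'), (3, 'Cy', 'Manager');
""")


def salaries(conn):
    """Return (name, salary) pairs by joining each employee to their position."""
    return conn.execute(
        "SELECT e.name, p.salary FROM employee e"
        " JOIN position_salary p ON p.title = e.position_title"
        " ORDER BY e.id"
    ).fetchall()


# A raise for a position touches exactly one row.
conn.execute("UPDATE position_salary SET salary = 95000 WHERE title = 'Engineer'")
after_raise = salaries(conn)
```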

4NF/5NF - Make sure every table that relates three or more things really describes a single three-way (or larger) relationship, and cannot be split further into smaller independent tables.

Page 3: Crack Smoking Data Models

Rules of Normalization (Simplified)

ORNF - The data model contains only elemental facts

DKNF - The data model fully defines all constraints and is free from “update anomalies” (the data model prevents any logical inconsistencies)

Key - uniquely defines a tuple

Constraint - rule governing values of attributes

In DKNF keys fully define tuples, and constraints fully define logical relationships / allowed values, including multi-table constraints
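
A partial illustration of the idea, not a full DKNF design: the sketch below uses keys, CHECK constraints, and a foreign key so that the database itself rejects inconsistent rows. The community and home tables and their allowed ranges are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE community (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE home (
    id           INTEGER PRIMARY KEY,
    community_id INTEGER NOT NULL REFERENCES community(id),
    bedrooms     INTEGER NOT NULL CHECK (bedrooms BETWEEN 1 AND 10),
    price        INTEGER NOT NULL CHECK (price > 0)
);
INSERT INTO community VALUES (1, 'Oak Hollow');
""")


def try_insert(row):
    """Return True if the database accepts the home row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO home VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert((1, 1, 3, 250000)),   # valid row
    try_insert((2, 99, 3, 250000)),  # community 99 does not exist (multi-table constraint)
    try_insert((3, 1, 0, 250000)),   # bedrooms outside the allowed range
    try_insert((4, 1, 3, -5)),       # price must be positive
]
```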

Page 4: Crack Smoking Data Models

Rules of Normalization (Simplified)

JBNF - Don’t Smoke Crack While Doing Data Models!

JBNF2 - Don’t Do Data Models For Customers Who Smoke Crack!

Okay, I break this one a lot

But seriously, most business rules/decisions are not based on whether or not they contribute to a logical data model

And why should they be?

On the other hand, when making a data model, we need to realize the flexibility that businesses require, so they don’t have to re-make the data model after every business decision

Nor should they have to think too much about the data model when making business decisions

Page 5: Crack Smoking Data Models

The Goal

Normalized Data Models

Everything can be managed via standard operations

No Redundant Data / No Calculated Columns

Strong locality -- don’t have to worry if you forgot to set something

Strict Constraints Enforced

Pushes the business logic back to the data model so it can be easily managed outside of the application logic - no “update anomalies”

Data is modeled as data - fewer text fields

All data can be managed and verified

Page 6: Crack Smoking Data Models

The Reality

No system can contain all data points

This by itself leads to conflicts with theory

Not all information is available

Summary columns that summarize data both inside and outside the database have to be managed

Some business rules are too complicated/flexible to be modeled, and must be abbreviated with flags and/or text fields

Page 7: Crack Smoking Data Models

The Reality

System performance demands redundant data

Many summaries are too complicated to be recalculated each time

Some of this can be mitigated with functional indices

Some summaries may need to be altered based on data exterior to the database
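
For the functional-index point above, here is a small sketch using SQLite's indexes on expressions (available in SQLite 3.9 and later). The order_line table is hypothetical. The database maintains the indexed expression itself, so no redundant column has to be kept in sync by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_line (
    id         INTEGER PRIMARY KEY,
    quantity   INTEGER NOT NULL,
    unit_price INTEGER NOT NULL
);
-- Index on a computed expression instead of a stored line_total column.
CREATE INDEX order_line_total ON order_line (quantity * unit_price);
INSERT INTO order_line VALUES (1, 2, 500), (2, 10, 300), (3, 1, 50);
""")

# The WHERE clause repeats the indexed expression so the planner is able to use the index.
big_lines = conn.execute(
    "SELECT id, quantity * unit_price FROM order_line"
    " WHERE quantity * unit_price >= 1000 ORDER BY id"
).fetchall()

index_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
```

This only helps for expressions over columns of a single row; summaries across many rows or across data outside the database still need another approach.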

Page 8: Crack Smoking Data Models

The Reality

Some data is best stored de-normalized

Management Issues - Do we really need a table for that?

Performance Issues - Do we really want to query for that?

Time Issues - Do we really want to build the interface to manage that?

In some cases, maybe we can de-normalize our database to save some sanity.

Page 9: Crack Smoking Data Models

The Reality

Some data models look whacked, because the data they are modeling is whacked.

In an ideal world, we would encourage the customer to come up with a more consistent way to manage themselves.

But usually we just model what they have because it’s easier than changing 20 years of tradition and infrastructure

Page 10: Crack Smoking Data Models

Case Study - Homebuilder

Builder has several divisions; each division is responsible for an area (kind of - some are on top of each other)

Builder has several brands

Builder categorizes houses and communities by lifestyle

Builder also needs to track home plans and inventory
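
One way the overlapping divisions might be modeled is sketched below. It assumes a many-to-many link table between divisions and areas; all table names and sample rows are invented, and brands, lifestyles, plans, and inventory are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE division (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE area     (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- A link table lets more than one division cover the same area.
CREATE TABLE division_area (
    division_id INTEGER NOT NULL REFERENCES division(id),
    area_id     INTEGER NOT NULL REFERENCES area(id),
    PRIMARY KEY (division_id, area_id)
);
INSERT INTO division VALUES (1, 'North'), (2, 'Central');
INSERT INTO area VALUES (10, 'Lakeside'), (11, 'Midtown');
INSERT INTO division_area VALUES (1, 10), (1, 11), (2, 11);
""")

# Areas claimed by more than one division: the "on top of each other" case.
shared_areas = conn.execute(
    "SELECT a.name, COUNT(*) FROM division_area da"
    " JOIN area a ON a.id = da.area_id"
    " GROUP BY a.id, a.name HAVING COUNT(*) > 1"
).fetchall()
```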

Page 11: Crack Smoking Data Models

JB’s Rules of Practicalization

1PF - Design a well-normalized database (the level is up to you) which describes their data as you understand it.

2PF - If the way that the customer talks about their data is inconsistent, develop a vocabulary to use when talking to them about their project which matches the data model. Be sure to clarify any unclear statements they make using the new vocabulary.

Page 12: Crack Smoking Data Models

JB’s Rules of Practicalization

3PF - Determine which business rules are too fungible to be implemented well by the database, and instead make manual processes for dealing with exceptions using flags and text fields

4PF - Determine if some values may have exceptions based on incomplete information in the database, and create user-maintainable columns or tables for them
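
A sketch of what 3PF and 4PF can look like in practice, under assumed names: the computed price is the default, and a user-maintainable override column plus a free-text note handle the exceptions the database cannot know about. The home table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE home (
    id                  INTEGER PRIMARY KEY,
    base_price          INTEGER NOT NULL,
    options_price       INTEGER NOT NULL DEFAULT 0,
    price_override      INTEGER,  -- NULL means "use the computed price"
    price_override_note TEXT      -- free text explaining the manual exception
);
INSERT INTO home VALUES (1, 300000, 15000, NULL, NULL);
INSERT INTO home VALUES (2, 300000, 15000, 289000, 'Closeout special approved by sales');
""")

# COALESCE prefers the manual override and falls back to the computed value.
listed = conn.execute(
    "SELECT id, COALESCE(price_override, base_price + options_price)"
    " FROM home ORDER BY id"
).fetchall()
```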

Page 13: Crack Smoking Data Models

JB’s Rules of Practicalization

5PF - Rejigger your data model so that it matches your development platform nicely.

6PF - Create calculated columns based on real or anticipated performance problems. Be sure there are application-level measures taken to keep these mostly consistent.
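
As a sketch of the application-level measures 6PF asks for, assume a hypothetical home_count column cached on each community. Writes go through one function that updates both tables in a single transaction, and a periodic check reports any drift. The names add_home and find_drift are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE community (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    home_count INTEGER NOT NULL DEFAULT 0  -- calculated column kept for performance
);
CREATE TABLE home (
    id           INTEGER PRIMARY KEY,
    community_id INTEGER NOT NULL REFERENCES community(id)
);
INSERT INTO community (id, name) VALUES (1, 'Oak Hollow');
""")


def add_home(conn, community_id):
    """Insert a home and bump the cached count in the same transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute("INSERT INTO home (community_id) VALUES (?)", (community_id,))
        conn.execute(
            "UPDATE community SET home_count = home_count + 1 WHERE id = ?",
            (community_id,),
        )


def find_drift(conn):
    """Return (community id, cached count, real count) wherever the two disagree."""
    return conn.execute(
        "SELECT c.id, c.home_count, COUNT(h.id) FROM community c"
        " LEFT JOIN home h ON h.community_id = c.id"
        " GROUP BY c.id, c.home_count"
        " HAVING c.home_count != COUNT(h.id)"
    ).fetchall()


add_home(conn, 1)
add_home(conn, 1)
drift_before = find_drift(conn)

# Simulate a write that bypassed add_home, leaving the cached count stale.
conn.execute("INSERT INTO home (community_id) VALUES (1)")
drift_after = find_drift(conn)
```

The check is what makes "mostly consistent" tolerable: drift is expected to happen occasionally, so it should be cheap to detect and repair.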

7PF - ?