Data Modelling Contentious Issues The Best Debates from the Data Modelling List January 2005 Karen Lopez InfoAdvisors.com

Data Modelling Contentious Issues

The Best Debates from the Data Modelling List

January 2005

Karen Lopez
InfoAdvisors.com

7 Mar 01 / Jan 2005

© 2001 InfoAdvisors

Karen López, I.S.P.

• Karen is the moderator of the Data Modelling List. She has 18 years of data modelling experience on large, multi-project programs.

• She has a B.Sc. in Computer Technology / Information Systems from Purdue University.

• She is a former President of the Information Resource Management Association of Canada (IRMAC).

Upcoming Speaking Engagements

• DAMA / Meta Data Symposium, May 2005

• Toronto IRMAC, Spring 2005

About this Presentation

• We will be using an interactive format: you will be participating in informal polls about data modelling issues and best practices.

• This is not an introductory presentation: a good knowledge of data modelling issues will be assumed.

InfoAdvisors’ Discussion Groups

• E-mail, web, and newsgroup based discussion group

• Data Modeling, Frameworks, Tools Groups

• Over 8000 subscribers

• Moderated

• No Charge

• www.infoadvisors.com

E-mail

Web

Newsgroups

Agenda

• Contentious Issues

• Background

• Discussion Quotes

• Poll

• Results & Analysis

• Resources

Contentious Issues

Near-religious discussions and debates

– People rarely change their minds based on this discussion.

– The most successful discussions are ones where both sides learn something new about the other viewpoint.

Conceptual Data Models

• Do you do them?

• Are they used? How?

• Just what is one?

• How do they differ from other Data Models?

Conceptual Data Models

What is a CDM?

– Entity only, no cardinality, fewer than 15 entities

– Entity only, cardinality, more than 15 entities

– Entity only, one Entity for every entity in the LDM

– Entity & Attributes, no physical, no surrogate keys

– Entity & Attributes, no DBMS-specific issues

Conceptual Data Models – View 1

The conceptual model is concerned with the real world view and understanding of data; the logical model is a generalized formal structure in the rules of information science; the physical model specifies how this will be executed in a particular DBMS instance.

- Duncan Dwelle, AIS Intl.
http://www.aisintl.com/case/CDM-PDM.html

Conceptual Data Models – View 2

“A [CDM], typically called an Entity Relationship Diagram (ERD), contains business information and structure without regard to physical storage concerns. Business items become entities and you describe them with attributes. You formalize relationships…. Typically you have no index, foreign key or tablespace information. …A Logical Data Model (LDM) is sometimes skipped over. It is a hybrid between the CDM and PDM and contains some DBMS elements like denormalization and indexes, but isn't as detailed as the PDM.”

- Michael N.

Conceptual Data Models – View 3

“[CDM is] a tricky phrase to define in that our industry uses this same phrase for two very different (in my opinion) ways. If you come from the Information Engineering background, a CDM is an entity and relationship-only method (no attributes and often no cardinalities to the relationships) that has a purpose of defining the scope of smaller, more detailed logical data modelling projects. Those logical data models are not constrained by organizational, platform, or other technical issues.

If you come from an ORM background, CDM is much like Mike has described [previously]. As he mentions, in this case Logical Data Modelling is similar to a first cut physical data model.”

- Karen Lopez

Conceptual Data Models – Warm up Vote

• Do Conceptual Data Models have Attributes?

• Do they have cardinality?

• Are there fewer Entities in a CDM than in an LDM?

• If they have attributes, can they have surrogate keys?

Vote!

OK, here’s how we vote:

1. I post a question.

2. The question has a Range.

3. You put the sticky in the area for your vote (1-5), sometimes once for PDM and once for LDM.

4. We debate the answer and the results.

Conceptual Data Models - Vote

Do you create Conceptual Data Models …that are actively used?

1. Yes, it’s up-to-date, easily available, and used at least a couple of times a year

2.

3.

4.

5. What’s a Conceptual Data Model? We don’t need no stinkin’ CDMs!

Conceptual Data Models

TRAP! You may find yourself debating a topic because there is no common definition of a model or object, not because there is a real disagreement.

Do we need Classwords?

• A traditional naming convention

• Usually means there’s a standard classword list

• Some tools can check for standards compliance

• Examples: Date, Amount, Count, Quantity

Attribute Names

• Customer First Name

• Customer

• Backordered Specialty Order Item ID

• Item ID

• Item
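As a rough illustration of the kind of standards check a tool might run, the sketch below tests whether an attribute name ends in an approved classword. The list combines the classwords named in this deck (Date, Amount, Count, Quantity) with Name and ID, which are assumed here only because they appear in the example attribute names. The function name is hypothetical, and no particular modelling tool is claimed to work this way.

```python
# Hypothetical classword check. The list below is an assumption for
# illustration, not an official or complete classword standard.
CLASSWORDS = {"Date", "Amount", "Count", "Quantity", "Name", "ID"}


def ends_with_classword(attribute_name: str) -> bool:
    """Return True when the last word of the name is an approved classword."""
    words = attribute_name.split()
    return bool(words) and words[-1] in CLASSWORDS


if __name__ == "__main__":
    examples = [
        "Customer First Name",
        "Customer",
        "Backordered Specialty Order Item ID",
        "Item ID",
        "Item",
    ]
    for name in examples:
        status = "ok" if ends_with_classword(name) else "no classword"
        print(f"{name}: {status}")
```

A check like this only looks at the final word, so it says nothing about whether the rest of the name is meaningful or too long; that part of the debate stays a judgment call.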

Classwords - Vote

Does a good LDM / PDM data model use classwords?

1. Always

2.

3.

4.

5. Classwords are so obsolete….

Keys - Natural or Surrogate

• Just what you call them may show your true colours....

– Surrogate, Non-intelligent, Unnatural, Dataless or Meaningless

– Natural, Intelligent, Normal, Cluttered

Keys – Natural or Surrogate

Order Number
--------------------------
Order Date
Sales Person ID

Order Number
Order Line Number
--------------------------
Product Number
Quantity

Order Number
--------------------------
Order Date
Sales Person ID

Order Detail Sys Number
--------------------------
Order Line Number
Product Number
Quantity
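The two layouts on this slide can be sketched as tables to make the trade-off concrete. The sketch below uses Python's built-in sqlite3 module. The table names are hypothetical, and in the surrogate version the order_number column and the UNIQUE constraint are assumptions that do not appear on the slide; they are included to show that a surrogate key alone does not stop the same business line from being entered twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Natural-key layout: the line is identified by its order and line number.
CREATE TABLE order_line_natural (
    order_number      INTEGER NOT NULL,
    order_line_number INTEGER NOT NULL,
    product_number    TEXT    NOT NULL,
    quantity          INTEGER NOT NULL,
    PRIMARY KEY (order_number, order_line_number)
);
-- Surrogate-key layout: a system number identifies the row.
CREATE TABLE order_detail_surrogate (
    order_detail_sys_number INTEGER PRIMARY KEY,  -- assigned by SQLite
    order_number      INTEGER NOT NULL,           -- assumed, not on the slide
    order_line_number INTEGER NOT NULL,
    product_number    TEXT    NOT NULL,
    quantity          INTEGER NOT NULL,
    UNIQUE (order_number, order_line_number)      -- assumed alternate key
);
""")


def insert_twice(table: str) -> bool:
    """Insert the same business line twice.

    Returns True if the second insert is rejected by a key constraint.
    """
    sql = (
        f"INSERT INTO {table} "
        "(order_number, order_line_number, product_number, quantity) "
        "VALUES (1001, 1, 'P-10', 5)"
    )
    conn.execute(sql)
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


natural_rejects = insert_twice("order_line_natural")
surrogate_rejects = insert_twice("order_detail_surrogate")
sys_numbers = [
    row[0]
    for row in conn.execute(
        "SELECT order_detail_sys_number FROM order_detail_surrogate"
    )
]
print(natural_rejects, surrogate_rejects, sys_numbers)
```

Under these assumptions both layouts reject the duplicate, but only because the surrogate version still declares the natural identifier as an alternate key. Dropping that UNIQUE constraint would let the second insert succeed with a new system number.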

Keys - Natural or Surrogate – View 1

I have inherited many [tables] that have used business elements for keys. The complexity of the keys for some tables is horrendous…Another thing I hate about this method is putting business information into the PK. This leads to the business changing either the contents of, or the definition of, a key column. …You should never change PK values….

With single column PK, you have single column FKs and easier maintenance of the system over time. You may have to perform more joins to satisfy complex queries, but the joins are simpler. … I have implemented surrogate keys as a "strongly recommended" guideline for all future database development.

- Michael N.

Keys - Natural or Surrogate – View 2

I agree wholeheartedly with Mike, but with one exception. I think reference tables would be better retaining the business elements in the primary key since they're usually standalone tables not used in joins. On a rather large application we set the ref tables up with sequence numbers as the primary key. Quickly some developers began hard coding them to refer back to the row. This caused us much difficulty.

- Thomas Z.

Keys - Natural or Surrogate - Vote

What best describes your approach in an LDM / PDM?

1. Surrogate Keys? We don’t need surrogate keys - we have natural identifiers.

2.

3.

4.

5. Every entity deserves its own surrogate key.

Derived and Redundant Data Dilemma

• Not all derived data is calculated

• Snapshot versus History

– History: Just maintain a foreign key (or relationship) back to the timestamped data

– Snapshot: Copy the data to another location to preserve the data at that point in time.

Derived Data Dilemma

• Historical (no derived data)

Product Number
Effective Date
-----------------
Product Price

Order ID
-----------------
Order Date

Order ID
Order Line ID
-----------------
Product Number
Quantity

Product Number
-----------------
Description

• Derived and Snapshot

Product Number
Effective Date
-----------------
Product Price

Order ID
-----------------
Order Date
Order Total Amount

Product Number
-----------------
Description
Price

Order ID
Order Line ID
-----------------
Product Number
Quantity
Product Price
Order Line Total
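To show what the historical layout asks of a query, the sketch below rebuilds an order total from the price history instead of reading a stored Order Total Amount. It uses Python's built-in sqlite3 module. The table name orders (ORDER is a reserved word in SQL), the sample rows, and the rule that the price in force is the one with the latest Effective Date on or before the Order Date are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_price (
    product_number TEXT NOT NULL,
    effective_date TEXT NOT NULL,   -- ISO dates, so text comparison orders them
    product_price  REAL NOT NULL,
    PRIMARY KEY (product_number, effective_date)
);
CREATE TABLE orders (
    order_id   INTEGER PRIMARY KEY,
    order_date TEXT NOT NULL
);
CREATE TABLE order_line (
    order_id       INTEGER NOT NULL,
    order_line_id  INTEGER NOT NULL,
    product_number TEXT    NOT NULL,
    quantity       INTEGER NOT NULL,
    PRIMARY KEY (order_id, order_line_id)
);
-- Hypothetical sample data: one price change between two orders.
INSERT INTO product_price VALUES ('P-10', '2004-01-01', 10.0);
INSERT INTO product_price VALUES ('P-10', '2004-07-01', 12.0);
INSERT INTO orders VALUES (1, '2004-03-15');
INSERT INTO orders VALUES (2, '2004-08-01');
INSERT INTO order_line VALUES (1, 1, 'P-10', 3);
INSERT INTO order_line VALUES (2, 1, 'P-10', 2);
""")

# Derive the total by joining each line to the price in force on the order date.
HISTORICAL_TOTAL = """
SELECT SUM(ol.quantity * pp.product_price)
FROM order_line AS ol
JOIN orders AS o ON o.order_id = ol.order_id
JOIN product_price AS pp ON pp.product_number = ol.product_number
WHERE ol.order_id = ?
  AND pp.effective_date = (
        SELECT MAX(p2.effective_date)
        FROM product_price AS p2
        WHERE p2.product_number = ol.product_number
          AND p2.effective_date <= o.order_date
      )
"""


def order_total(order_id: int) -> float:
    """Compute an order total from price history, with nothing stored."""
    return conn.execute(HISTORICAL_TOTAL, (order_id,)).fetchone()[0]


print(order_total(1), order_total(2))
```

The snapshot layout trades this query for extra columns (Product Price and Order Line Total on the line, Order Total Amount on the order) that must be kept consistent when lines change, which is the cost, benefit, and risk question the deck raises.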

Derived Data Dilemma - Vote

Do Derived Attributes belong in an LDM/PDM?

1. Never

2.

3.

4.

5. All the time - pure logical data modelling doesn’t work in the real world.

Derived Data Dilemma

TIP! Every decision is ultimately made based on cost, benefit, and risk. Be prepared to analyze all three.

Abstraction / Generalization

• A modelling design decision

• Can be very flexible

• Can be very difficult to understand

Abstraction

MEASUREMENT

VOLUME MEASUREMENT

TEMPERATURE MEASUREMENT

WEIGHT MEASUREMENT

MEASUREMENT TYPE
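One way to read the entities on this slide is as two competing styles: a specific structure per kind of measurement, or a single MEASUREMENT classified by a MEASUREMENT TYPE. The sketch below contrasts them with Python dataclasses. The attribute names, the units, and the PRESSURE example are all hypothetical and are not taken from the slide.

```python
from dataclasses import dataclass


# Specific style: one structure per kind of measurement.
# The unit is fixed by the structure itself.
@dataclass
class WeightMeasurement:
    kilograms: float


@dataclass
class TemperatureMeasurement:
    degrees_celsius: float


# Abstract style: the kind of measurement is data, not structure.
@dataclass(frozen=True)
class MeasurementType:
    name: str
    unit: str


@dataclass
class Measurement:
    measurement_type: MeasurementType
    value: float


def describe(measurement: Measurement) -> str:
    """Render an abstract measurement using the unit held on its type."""
    kind = measurement.measurement_type
    return f"{kind.name}: {measurement.value} {kind.unit}"


# Adding a new kind of measurement needs only a new row of type data.
PRESSURE = MeasurementType("PRESSURE", "kPa")
reading = Measurement(PRESSURE, 101.3)
print(describe(reading))
```

The abstract style accepts a new kind of measurement without any structural change, but the structure no longer states which kinds the business actually records; the specific style documents that, at the price of a change for every new kind.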

Abstract or Specific - View

“Although your model represents a higher level of abstraction than most people would create it does not violate the rules of normalization. So, it is not technically speaking incorrect.

Now come the issues. What is the purpose of the model? Are you documenting the business requirements and rules? If so your approach does not provide as much information about the business as a less abstract approach would…If you are trying to model a flexible environment where the required data (represented by the repeating groups) changes frequently then your model would be more appropriate.”

- Rick B.

Abstract or Specific

Which style are you most likely to use in an LDM / PDM?

1. The more flexible (Abstract) the better

2.

3.

4.

5. The more precise (Specific) the better

To Party or Not to Party

• Party, Party Role, Party Type, Party Category

• Migration of Party ID

• Use of Subtype “Owned Keys”

• Universality of Party

To Party or Not to Party

• Party

• Non-Party

PARTY ROLE

PARTY

VENDOR

CUSTOMER

DISTRIBUTOR

EMPLOYEE

VENDOR

CUSTOMER

ORGANIZATION

PERSON

To Party or Not to Party

It is so unfortunate that there is still a mentality that we need to model the simple case only. I personally think that once you understand the party, party role model it is simpler. That simplicity reinforces itself as you are able to apply the same pattern to lots of different situations and environments.

- George P.

To Party or Not to Party

The problem I find with PARTY ROLE is that it is derivable information. This is purely a classification of an individual body based on other data that must be available. Otherwise if I identify you as my “customer” what business meaning does it have? There must be one or more “orders”, or “contracts”, or “correspondences”, or “contacts”, or whatever I deem necessary for it to be sensible for you to be known as my customer. It’s the ORDER that makes you my customer, not being my customer that makes me able to take your ORDER.

- Mike V.
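The argument that a role is derivable can be sketched in a few lines: instead of storing a "customer" role, compute it from the parties that have at least one order. The class names, the function name, and the sample data below are hypothetical, and the rule "a customer is any party with an order" is only one possible reading of the quote.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Party:
    party_id: int
    name: str


@dataclass(frozen=True)
class Order:
    order_id: int
    party_id: int


def derive_customers(parties: Iterable[Party], orders: Iterable[Order]) -> List[Party]:
    """Return the parties that have at least one order.

    The customer role is derived from order data rather than stored.
    """
    ordering_ids = {order.party_id for order in orders}
    return [party for party in parties if party.party_id in ordering_ids]


# Hypothetical sample data.
parties = [Party(1, "Acme"), Party(2, "Bolt")]
orders = [Order(10, 1)]
customers = derive_customers(parties, orders)
print([party.name for party in customers])
```

A stored PARTY ROLE would instead let a party be recorded as a customer before any order exists, which is the other side of this debate.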

To Party or Not to Party - Vote

Which is the best approach to PARTY in an LDM / PDM?

1. It’s the only way to be fully logical

2.

3.

4.

5. It’s too academic, too theoretical and it just won’t work

DA – Lawyer or Architect?

• Several new legislative actions focus on data quality, integrity, reliability, availability…

• Sounds like good DA work, right?

Examples

• Sarbanes-Oxley (Sarbox): Financial Accountability & Reliability

• Anti-Spam Legislation: Data integrity, Reliability

• Privacy: Quality, Reliability, Accountability…

DA Role in Compliance

1. Finally, a reason for the Execs to support us…

2.

3.

4.

5. Hey! I don’t even have time to write good definitions…

Did UML kill the Data Modelling Star?

• Combines data and process into one diagramming technique

• Some tools support ERDs as well as UML models

• Many DAs are being pressured to choose between UML and ERDs.

UML Class Diagram

UML - Vote?

Is UML an acceptable replacement for LDM?

1. Never – 2 different things
2.
3.
4.
5. UML is the future, we might as well switch now….


Generic Data Models

• Generic Data Models are models or subsets of models that an organization can purchase.

• They are sometimes prepared for a specific industry

• They may be very generic, or both industry-specific and generic.


Generic Data Models - Vote


What do you think about Generic Data Models or Patterns?

1. I love them and won’t ever start a project from a blank page again.

2.
3.
4.
5. They are so high level that I don’t see the value in purchasing them.


Who Gets to Update the Model?


• Many marketing pitches portray a team of developers, modelers, DBAs, etc., happily working on the model

• Is it “More hands make for less work” or “Too many cooks spoil the soup”?


Who Gets to Update the Model


1. Only seasoned Data Modeling Professionals

2.
3.
4.
5. Everybody – we’re all professionals here


Who calls the shots in a Logical Data Model?


• Many stakeholders in an LDM

• Not everyone shares the same understanding about the purpose of the LDM

• DM tools can greatly influence the decision

• DAs can report through a variety of departments


Who calls the shots in a Logical Data Model?


The FINAL say on LDM / PDM decisions belongs to:
1. The Data Architect/Modeller
2. The Project Manager
3. The User…Customer
4. The DBA
5. The CIO/CEO


XML – the next Data Specification


• XML is a type of markup language for data

• Similar to HTML, but carries meaning as well as format metadata

• Many who use EDI are looking toward XML

• Many vendors are jumping onto this ‘standard’

• Still requires establishing meaning.


XML

<?xml version="1.0" encoding="iso-8859-1" ?>
<clientapp>
  <global>
    <string phone_number="*8369464" />
    <string user_id="TestUser" />
    <string dir_html="html" />
    <string dir_language="english" />
    <string filename_html_connected="connected.htm" />
    <string filename_html_targetfile="target.htm" />
  </global>
  <!-- remaining elements are not shown on the slide -->
</clientapp>
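As a rough illustration of what a program can and cannot recover from a fragment like this, the sketch below reads a cleaned-up copy with Python's standard `xml.etree.ElementTree` module. The closing `</clientapp>` tag, the `user_id` attribute name and the `read_settings` helper are assumptions made for the example, not part of the original slide.

```python
import xml.etree.ElementTree as ET

# Cleaned-up copy of the slide's fragment. The closing </clientapp> tag and
# the user_id attribute name are assumptions; the slide shows neither intact.
SAMPLE = b"""<?xml version="1.0" encoding="iso-8859-1"?>
<clientapp>
  <global>
    <string phone_number="*8369464"/>
    <string user_id="TestUser"/>
    <string dir_html="html"/>
    <string dir_language="english"/>
  </global>
</clientapp>"""


def read_settings(xml_bytes):
    """Collect every attribute of every <string> element under <global>."""
    root = ET.fromstring(xml_bytes)
    settings = {}
    for element in root.findall("./global/string"):
        settings.update(element.attrib)
    return settings


settings = read_settings(SAMPLE)
for name, value in settings.items():
    print(name, "=", value)
```

The parser hands back names and values, but nothing in the document says what a `phone_number` is for or which formats are valid; that definition still has to come from somewhere else.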


XML – the next Data Specification


I have heard from a wide variety of sources that XML means that data modelling is dead -- who needs a data model when you can create a DTD and call the piece of data anything you want? All you have to do is convince everyone else to call it the same thing...

- Karen Lopez


XML – the next Data Specification


XML is very simple. It is a facility for defining your own tags in transmitting documents. (With HTML you have to use theirs.) The implication of this is that the browser/person receiving the document has to know how to interpret your tags. This is the fun part. Some organizations have begun to define a set of tags to serve their purposes -- mathematicians, chemists, metadata gurus, etc.

Alternatively, if you send the "Document Type Definition (DTD)" with your data, it can be used to decode it, at least syntactically.

- David H.
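The "at least syntactically" caveat can be sketched in a few lines. Python's standard library does not validate against DTDs, so the example below substitutes a small hand-written rule table for the `<!ELEMENT ...>` declarations a real DTD would carry; the element names and the `structure_errors` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical structure rules: element name -> allowed child element names.
# A real DTD would state these as <!ELEMENT ...> declarations.
ALLOWED_CHILDREN = {
    "order": {"customer", "amount"},
    "customer": set(),
    "amount": set(),
}


def structure_errors(element, rules=ALLOWED_CHILDREN):
    """Return a list of structural problems found at or below `element`."""
    if element.tag not in rules:
        return [f"unknown element <{element.tag}>"]
    errors = []
    for child in element:
        if child.tag not in rules[element.tag]:
            errors.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        else:
            errors.extend(structure_errors(child, rules))
    return errors


good = ET.fromstring("<order><customer>Acme</customer><amount>100</amount></order>")
bad = ET.fromstring("<order><client>Acme</client></order>")

print(structure_errors(good))
print(structure_errors(bad))
```

The first document passes the structural check, yet the check says nothing about whether `amount` is in dollars or units, or whether `customer` means the payer or the recipient. The second fails only because the sender chose a different tag name for what may be the same thing.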


XML – the next Data Specification


I see XML as a new medium for presenting data. To effectively present the data, it needs to be understood, and this is the role of the data model. Do we have the fundamental meaning of our data understood and represented in a quality data model? If yes, then the transformation of the requirements into the presentation by XML will be much more effective. … Enough said for now on how I feel that the data model and XML need to correlate data requirements.

- Paul E.
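One way to picture the data model driving the XML presentation is to derive the tags from a modelled entity instead of inventing them per document. The sketch below is illustrative only: the `Customer` entity, its attribute names and the `to_xml` helper are hypothetical, and a Python dataclass stands in for an entity in a real logical data model.

```python
from dataclasses import dataclass, fields
import xml.etree.ElementTree as ET


@dataclass
class Customer:
    """Hypothetical entity from a data model; attribute names are illustrative."""
    customer_id: int
    name: str
    country_code: str


def to_xml(entity):
    """Build an element named after the entity, with one child per attribute."""
    element = ET.Element(type(entity).__name__.lower())
    for field in fields(entity):
        child = ET.SubElement(element, field.name)
        child.text = str(getattr(entity, field.name))
    return element


xml_text = ET.tostring(to_xml(Customer(42, "Acme Ltd", "CA")), encoding="unicode")
print(xml_text)
```

Because every tag name comes from the model, the definitions agreed for the entity and its attributes carry over to the XML unchanged.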


XML - Vote?

Will XML replace data models?
1. Never – 2 different things
2.
3.
4.
5. XML is the future, we might as well switch now….


Is that with one “L” or Two?


How is it spelled?
1. Modeling/Modeler
2. Modelling/Modeller
3. It depends on my mood…


Online Data Modelling Resources


• InfoAdvisors (http://www.infoadvisors.com)

– The Data Model List

– Product User Discussion Groups (ERwin, Advantage Repository, Visible Analyst/Advantage, ER/Studio, DBArtisan, CaseWise)

• The Data Administration Newsletter (http://www.tdan.com)


Recommended Books

• Data Modeling Essentials, Graeme Simsion, 2004

• Building Quality Databases with IDEF1X, Thomas Bruce, 1992, Dorset House, ISBN 0-932633-18-8

• The Data Modeling Handbook, Michael Reingruber & William Gregory, 1994, Wiley QED, ISBN 0-471-05290-6

• A Practical Guide to Logical Data Modeling, George Tillmann, 1993, McGraw-Hill, $45, ISBN 0-07-064615-5


Recommended Books

• Data Model Patterns: Conventions of Thought, David C. Hay, 1996, Dorset House, ISBN 0-932633-29-3

• The Data Model Resource Book, Silverston, Volumes 1 and 2


© 2005 InfoAdvisors

InfoAdvisors
11066 Sheppard Ave East
Toronto, ON CANADA

[email protected]
http://www.infoadvisors.com

8000+ IRM participants subscribed to several discussion groups
