IMS 6217: Data Modeling—Super-Type/Sub-Type Entities

Dr. Lawrence West, Management Dept., University of Central Florida, lwest@bus.ucf.edu

Super-Type & Sub-Type Entities—Topics

• Problems needing subtype entities

• Nature of the solution

• Variations—Specialization and Completeness

• Subtype Identifiers

• Implementing

• Special Topics

• Using Super- and Sub-Types

• SQL w/ ST-ST

• Subtypes of an Unimplemented Supertype

• Performance Considerations

Supertype & Subtype Entities

• Some entities have records that come in various ‘flavors’.

– Students: Doctoral, Masters, Undergraduate

– Products: Serial-numbered, perishable, animals, etc.

– Employees: Salaried, hourly, managerial, part time

– Pet Store Products: Food, animals, accessories

• These entity sets have two types of attributes

– Attributes common to every occurrence

– Attributes required by one or more subtypes but not used by all occurrences of the entity

Exercise #1

Create all possible attributes and all immediate relationships for a Product entity in a ______________________________

Why is This a Problem?

• Variations on an entity create a space problem

– If we put all possible attributes for all possible variations (subtypes) in one entity, we will waste space on unused fields in most records

– E.g., a Sport attribute for students who are not athletes

Supertype & Subtype Entities (cont.)

• Subtypes also create relationship problems

– Some relationships will only be with a subtype of the entity, not with all types

– Important when the subtype is the Child in the relationship (has the foreign key)

Student: PID, LastName, FirstName, StreetAddress, City, …, DissertationAdvisorID

Faculty: EmployeeID, LastName, FirstName, …, FacultyType, Rank, DoctorallyQualified

Relationship between Student and Faculty: Has Doctoral Advisor

Nature of the Problem

• Many records will have empty fields

– GraduationDate

• We care about fields that will always be empty for certain categories of records…

• …and we can easily determine which records those are

• We will remove those often-empty fields to separate storage structures (entities/tables)

Supertype & Subtype Entities

• We can split up entities with variations into a supertype and one or more subtypes

– Supertype contains attributes common to all occurrences

– Subtypes contain attributes needed by the subtype

[Figure: Student supertype with subtypes MastersStudent, DoctoralStudent, and UndergraduateStudent joined through a “d” circle, shown in ERD notation and as the Visio equivalent]

An Example

• Cash is a PaymentType but needs no special attributes

– Partial Specialization (coming up)

• Payment ID is PK of all entities

• Payment ID is also FK in subtype entities

– In SQL Server, be sure to set the parent this way when implementing relationships

PAYMENT: PaymentID, AccountID, PaymentDate, PaymentAmount, PaymentType

CHECK_PAYMENT: PaymentID, CheckNum

CC_PAYMENT: PaymentID, CC_Type, CC_Number, SecurityCode, ApprovalCode
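The payment example can be tried out with a minimal sketch like the one below. It uses Python's built-in sqlite3 module rather than SQL Server, so the column types and the `PRAGMA` line are assumptions of this sketch, not part of the slides. The point it illustrates is that PaymentID is the primary key of every table and, in each subtype table, also a foreign key back to Payment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

conn.executescript("""
CREATE TABLE Payment (
    PaymentID     INTEGER PRIMARY KEY,
    AccountID     INTEGER NOT NULL,
    PaymentDate   TEXT    NOT NULL,
    PaymentAmount REAL    NOT NULL,
    PaymentType   TEXT    NOT NULL      -- 'Cash', 'Check', or 'CC'
);
-- In each subtype table the PK is also the FK to the supertype.
CREATE TABLE Check_Payment (
    PaymentID INTEGER PRIMARY KEY REFERENCES Payment(PaymentID),
    CheckNum  TEXT NOT NULL
);
CREATE TABLE CC_Payment (
    PaymentID    INTEGER PRIMARY KEY REFERENCES Payment(PaymentID),
    CC_Type      TEXT,
    CC_Number    TEXT,
    SecurityCode TEXT,
    ApprovalCode TEXT
);
""")

# A cash payment needs only the supertype row (partial specialization).
conn.execute("INSERT INTO Payment VALUES (1, 100, '2007-06-03', 25.00, 'Cash')")
# A check payment needs a supertype row plus a matching subtype row.
conn.execute("INSERT INTO Payment VALUES (2, 100, '2007-06-10', 80.00, 'Check')")
conn.execute("INSERT INTO Check_Payment VALUES (2, '1045')")

# A subtype row with no matching supertype row violates the foreign key.
try:
    conn.execute("INSERT INTO Check_Payment VALUES (99, '2000')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

Because the subtype key is both PK and FK, each Payment row can have at most one Check_Payment row, which gives the 1:1 relationship.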

Implementing in SQL Server—Table Design

Implementing in SQL Server—Relationships

PK is also FK

Implementing in SQL Server—Diagrams

• Arrange in org-chart hierarchy

– Gives visual cue that this is a ST/ST relationship

– You will need to wrestle with the relationship lines a little

• Note Key symbols at both ends of the lines

– Indicates 1:1 Cardinality

Need for Subtypes

• Subtypes are used when an identifiable subset of occurrences needs fields not needed by all occurrences

– Many occurrences will have empty attribute values

– An occurrence’s membership in the identifiable subset must be observable

• It is known whether a student is registered as an athlete

• But there is no obvious way to distinguish ‘local’ students from ‘transient’ students

Exercise #2 & #3

#2: Write the SQL to retrieve all of the information for credit card payments in the month of June 2007

#3: Write the SQL to recreate the table on Slide #4

Exercise #4

Split the Product entity into Super-/Subtypes by placing attributes appropriately

First Variation on Super-/Subtypes

• Completeness Constraint

– Must every supertype occurrence have at least one occurrence in one of the subtypes?

• Total specialization means that a subtype occurrence must exist

– Indicated with a double line to the connecting circle

• Partial specialization means that a subtype need not exist

– Indicated with a single line to the connecting circle

[Figure: Student supertype drawn twice, once with a double line and once with a single line to the connecting circle]

Total Specialization Completeness Constraint

• Total specialization means that every record in the supertype must have a matching record in one or more subtypes

• Relatively rare (in my experience) but possible

• Model in Visio using a thicker descending line (use Format Line)

– (Visio doesn’t do double lines)

– Increase thickness by two levels

[Figure: Student supertype in ERD notation and the STUDENT entity in Visio with a thicker descending line]
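A foreign key alone does not enforce total specialization, so one way to audit it is to query for supertype records that have no subtype record. The sketch below is an illustration under assumptions: it uses Python's sqlite3 module, the Student subtypes named earlier, and made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (PID INTEGER PRIMARY KEY, LastName TEXT);
CREATE TABLE MastersStudent   (PID INTEGER PRIMARY KEY REFERENCES Student(PID));
CREATE TABLE DoctoralStudent  (PID INTEGER PRIMARY KEY REFERENCES Student(PID));
CREATE TABLE UndergradStudent (PID INTEGER PRIMARY KEY REFERENCES Student(PID));

-- Hypothetical sample data: student 3 has no subtype record at all.
INSERT INTO Student VALUES (1, 'Adams'), (2, 'Brown'), (3, 'Chen');
INSERT INTO MastersStudent  VALUES (1);
INSERT INTO DoctoralStudent VALUES (2);
""")

# Supertype rows with no match in any subtype violate total specialization.
missing_subtype = [row[0] for row in conn.execute("""
    SELECT s.PID
    FROM Student AS s
    WHERE NOT EXISTS (SELECT 1 FROM MastersStudent   m WHERE m.PID = s.PID)
      AND NOT EXISTS (SELECT 1 FROM DoctoralStudent  d WHERE d.PID = s.PID)
      AND NOT EXISTS (SELECT 1 FROM UndergradStudent u WHERE u.PID = s.PID)
    ORDER BY s.PID
""")]

print(missing_subtype)
```

Under partial specialization the same query is still useful: it lists the records that belong to no subtype group.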

Partial Specialization Completeness Constraint

• Some records in supertypes may have no matching subtype records

• Their subtype groups do not need special attributes

– But membership in a group may still be important and tracked

• It is possible for a supertype to have only one subtype group

[Figure: STUDENT supertype with a single ATHLETE subtype]

Second Variation on Super-/Subtypes

• You must also determine whether a supertype occurrence can be found in more than one subtype

– A disjoint relationship means that a supertype occurrence can only be found in one subtype

– An overlap relationship means that a supertype occurrence can be found in multiple subtypes (e.g., some universities have a joint J.D./MBA program)

[Figure: Student supertype drawn twice, once with a “d” (disjoint) connecting circle and once with an “o” (overlap) connecting circle]

Disjoint Relationships

• A registered vehicle can only be of one type

VEHICLE: VIN, Manufacturer, Year, Weight, Type

(d) disjoint subtypes:

CAR: VIN, Doors, Seats

TRUCK: VIN, BedLength, TowingCapacity, TailgateType
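The slides do not say how to make the database itself enforce the disjoint rule. One common technique, shown here only as a sketch, is to repeat the Type code in each subtype table and point a composite foreign key at (VIN, Type) in the supertype. The extra Type column in CAR and TRUCK, the `UNIQUE` constraint, and the sample values are all assumptions of this sketch, written for Python's sqlite3 module rather than SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE Vehicle (
    VIN          TEXT PRIMARY KEY,
    Manufacturer TEXT,
    Year         INTEGER,
    Weight       INTEGER,
    Type         TEXT NOT NULL CHECK (Type IN ('Car', 'Truck')),
    UNIQUE (VIN, Type)                 -- target for the composite FKs below
);
CREATE TABLE Car (
    VIN   TEXT PRIMARY KEY,
    Type  TEXT NOT NULL DEFAULT 'Car' CHECK (Type = 'Car'),  -- assumed column
    Doors INTEGER,
    Seats INTEGER,
    FOREIGN KEY (VIN, Type) REFERENCES Vehicle (VIN, Type)
);
CREATE TABLE Truck (
    VIN            TEXT PRIMARY KEY,
    Type           TEXT NOT NULL DEFAULT 'Truck' CHECK (Type = 'Truck'),
    BedLength      REAL,
    TowingCapacity INTEGER,
    TailgateType   TEXT,
    FOREIGN KEY (VIN, Type) REFERENCES Vehicle (VIN, Type)
);
""")

# Hypothetical vehicle registered as a car.
conn.execute("INSERT INTO Vehicle VALUES ('VIN001', 'Acme', 2007, 3200, 'Car')")
conn.execute("INSERT INTO Car (VIN, Doors, Seats) VALUES ('VIN001', 4, 5)")

# The same VIN cannot also appear in Truck: ('VIN001', 'Truck') is not in Vehicle.
try:
    conn.execute("INSERT INTO Truck (VIN, BedLength) VALUES ('VIN001', 6.5)")
    second_subtype_rejected = False
except sqlite3.IntegrityError:
    second_subtype_rejected = True

print(second_subtype_rejected)
```

The trade-off is one redundant column per subtype table in exchange for a rule the database checks on every insert.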

Overlap Relationships

STUDENT

(o) overlapping subtypes: ATHLETE, PATIENT, INTERN, EMPLOYEE, VETERAN

Subtype Identifiers

• The supertype entity must indicate which (if any) subtypes are used

• Disjoint subtypes can use one attribute with a code to indicate the type of subtype

– Value of the PaymentType attribute (‘Cash’, ‘Check’, ‘CC’) identifies the subtype

– Remember that some subtype identifiers (‘Cash’ here) may have no subtype entities

– Sometimes this value may be blank (not part of any group)

PAYMENT: PaymentID, AccountID, PaymentDate, PaymentAmount, PaymentType

Subtype Identifiers (cont.)

• Overlapping subtypes must use a collection of yes/no attributes, one for each possible subtype

– Setting attribute to true/yes in a record indicates that a matching subtype record exists

– Leaving all to false/no indicates no matching subtype (partial specialization)

STUDENT: PID, LastName, FirstName, …, Intern, Patient, Athlete, Employee, Veteran
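As a rough illustration of the yes/no identifiers, the sketch below models two of the five flags (Athlete and Veteran) in Python's sqlite3 module. The Branch column and the sample rows are hypothetical. It shows two uses of the flags: finding students who belong to more than one subtype, and auditing for flags that disagree with the subtype tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (
    PID      INTEGER PRIMARY KEY,
    LastName TEXT,
    Athlete  INTEGER NOT NULL DEFAULT 0 CHECK (Athlete IN (0, 1)),
    Veteran  INTEGER NOT NULL DEFAULT 0 CHECK (Veteran IN (0, 1))
);
CREATE TABLE Athlete (PID INTEGER PRIMARY KEY REFERENCES Student(PID), Sport TEXT);
CREATE TABLE Veteran (PID INTEGER PRIMARY KEY REFERENCES Student(PID), Branch TEXT);

-- Student 1 is in both subtypes; student 2 is flagged as an athlete but has
-- no Athlete row (a data error); student 3 belongs to no subtype.
INSERT INTO Student VALUES (1, 'Adams', 1, 1), (2, 'Brown', 1, 0), (3, 'Chen', 0, 0);
INSERT INTO Athlete VALUES (1, 'Soccer');
INSERT INTO Veteran VALUES (1, 'Navy');
""")

# Overlap: the flags show which students appear in more than one subtype.
in_both = [r[0] for r in conn.execute(
    "SELECT PID FROM Student WHERE Athlete = 1 AND Veteran = 1 ORDER BY PID")]

# Audit: the Athlete flag should agree with the presence of an Athlete row.
flag_mismatches = [r[0] for r in conn.execute("""
    SELECT s.PID
    FROM Student AS s
    WHERE (s.Athlete = 1 AND NOT EXISTS (SELECT 1 FROM Athlete a WHERE a.PID = s.PID))
       OR (s.Athlete = 0 AND     EXISTS (SELECT 1 FROM Athlete a WHERE a.PID = s.PID))
    ORDER BY s.PID
""")]

print(in_both, flag_mismatches)
```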

Subtype Identifiers (cont.)

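• A subset of subtypes may be disjoint while others overlap

– Here the yes/no attributes identify the overlapping subtypes, and DegreeSought identifies the disjoint degree subtypes

[Figure: STUDENT supertype with an “o” connecting circle; subtypes shown include INTERN, MD, MASTERS, PhD, …]

STUDENT: PID, LastName, FirstName, …, Intern, Patient, Athlete, Employee, Veteran, DegreeSought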

Subtypes of Subtypes

• It is possible to have subtypes of subtypes

• Model products in a pet store where some are inanimate, some are food, some are live and of the live animals some are tracked individually…

– Cute puppies with wet noses

– Cats

• … and others are not

– Goldfish

– Mice

• … and some are sold as food

– Cute little mice as food for slithering scaly snakes

Some Caveats

• ST/ST determined at the group level

• Membership in the group must be determinable for every record

– Every record in the group must have the same pattern of value/no value for the attributes

– Subtype attributes may be null if they could receive values later

• Individual records may not have values for all fields

– Do not consider for ST/ST unless group membership can be determined

Some Caveats

• More than one subtype may have the same field in it

– Field goes in subtype entities if not every subtype group needs it

– E.g.—UndergraduateDegree for Doctoral/Masters students

• Consider eliminating subtypes if they have only one or two attributes

– Roll their attributes back into the supertype and accept wasted space

– Consider this if the subtype covers a large proportion of the population

– Consider this if the subtype's attributes are frequently accessed

Implementing Super-/Subtypes

• There is a Mandatory-1:Optional-1 relationship between entities in a super and subtype relationship

– Mandatory at supertype end

– Optional at each subtype end

• Each subtype occurrence (record) has identifier attribute values that exactly match a record in the supertype (but not vice-versa)

• All entities have the same primary key/ identifier attributes

• PK in the subtype is also the FK from supertype

– Special case of a weak entity

Subtypes of an Unimplemented Supertype

• Many, many data models will have records that could be subtypes of a supertype that is not implemented

• For UCF a “Person” entity could have subtypes

– Student

– Faculty

– Donor

– Contractor

• Tend to not implement this Person supertype unless the entities are regularly queried together

• Occasional queries can be supported with a UNION query
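A UNION query of this kind might look like the sketch below, run through Python's sqlite3 module. Only the Student and Faculty tables are included, and the sample names and the PersonType label column are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (PID        INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
CREATE TABLE Faculty (EmployeeID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);

-- Hypothetical sample rows.
INSERT INTO Student VALUES (1,  'Ng',    'Amy');
INSERT INTO Faculty VALUES (10, 'Baker', 'Tom');
""")

# There is no Person table; UNION stitches the "subtypes" together on demand.
people = conn.execute("""
    SELECT 'Student' AS PersonType, LastName, FirstName FROM Student
    UNION
    SELECT 'Faculty' AS PersonType, LastName, FirstName FROM Faculty
    ORDER BY LastName, FirstName
""").fetchall()

print(people)
```

Each SELECT must return the same number of columns in the same order, so only the attributes the entities share can appear in the combined result.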

Subtypes and Object Oriented Design

• Super- and Sub-type design exactly corresponds to the philosophy of inheritance in object oriented design

• If programming using an OO approach you will almost always implement objects with inheritance to match super- and sub-type design

• You can also implement inheritance for the unimplemented supertype discussed in the previous slide, even if not implemented in the DB design
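The correspondence with inheritance can be sketched in Python using the payment example. The class and field names below are hypothetical translations of the PAYMENT, CHECK_PAYMENT, and CC_PAYMENT attributes, not code from the course.

```python
from dataclasses import dataclass


@dataclass
class Payment:
    """Supertype: attributes common to every payment."""
    payment_id: int
    account_id: int
    payment_date: str
    payment_amount: float
    payment_type: str = "Cash"   # subtype identifier


@dataclass
class CheckPayment(Payment):
    """Subtype: inherits the common attributes and adds its own."""
    payment_type: str = "Check"
    check_num: str = ""


@dataclass
class CCPayment(Payment):
    payment_type: str = "CC"
    cc_type: str = ""
    cc_number: str = ""
    security_code: str = ""
    approval_code: str = ""


def describe(payment: Payment) -> str:
    """Works for any payment because it touches only supertype attributes."""
    return f"{payment.payment_type} payment {payment.payment_id}: {payment.payment_amount:.2f}"


check = CheckPayment(2, 100, "2007-06-10", 80.0, check_num="1045")
print(describe(check))
```

A cash payment is simply a plain `Payment` object, mirroring the supertype row that has no subtype record.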

Using Super-/Sub-type Tables

• Application logic and SQL for super- and sub-type tables become more complex

• Inserts must test the subtype identifier to determine where to add records

– Always to the supertype

– Decide which (if any) subtype(s)

• Similar for Updates
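The insert logic described above might be sketched as follows, using Python's sqlite3 module and the payment tables. The function name `add_payment` and its parameters are hypothetical; the sketch assumes both inserts should succeed or fail together, so it wraps them in one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE Payment (
    PaymentID INTEGER PRIMARY KEY, AccountID INTEGER, PaymentDate TEXT,
    PaymentAmount REAL, PaymentType TEXT NOT NULL);
CREATE TABLE Check_Payment (
    PaymentID INTEGER PRIMARY KEY REFERENCES Payment(PaymentID), CheckNum TEXT);
CREATE TABLE CC_Payment (
    PaymentID INTEGER PRIMARY KEY REFERENCES Payment(PaymentID),
    CC_Type TEXT, CC_Number TEXT, SecurityCode TEXT, ApprovalCode TEXT);
""")


def add_payment(conn, account_id, date, amount, payment_type, **subtype_fields):
    """Insert the supertype row, then the subtype row chosen by payment_type."""
    with conn:  # one transaction: both inserts commit, or neither does
        cur = conn.execute(
            "INSERT INTO Payment (AccountID, PaymentDate, PaymentAmount, PaymentType) "
            "VALUES (?, ?, ?, ?)",
            (account_id, date, amount, payment_type))
        payment_id = cur.lastrowid          # key shared with the subtype row
        if payment_type == "Check":
            conn.execute("INSERT INTO Check_Payment VALUES (?, ?)",
                         (payment_id, subtype_fields["CheckNum"]))
        elif payment_type == "CC":
            conn.execute(
                "INSERT INTO CC_Payment VALUES (?, ?, ?, ?, ?)",
                (payment_id, subtype_fields.get("CC_Type"),
                 subtype_fields.get("CC_Number"),
                 subtype_fields.get("SecurityCode"),
                 subtype_fields.get("ApprovalCode")))
        # 'Cash' has no subtype table, so only the supertype row is written.
    return payment_id


cash_id = add_payment(conn, 100, "2007-06-03", 25.00, "Cash")
check_id = add_payment(conn, 100, "2007-06-10", 80.00, "Check", CheckNum="1045")
```

An update would follow the same pattern: change the supertype row, then test PaymentType to decide which subtype table, if any, also needs changing.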

Using Super-/Sub-type Tables (cont.)

• Retrieval also complex

• You cannot simply join the supertype with all subtypes since no records will be returned if a subtype has no match

– Why won’t the following work?

SELECT Payment.*, Check_Payment.*, CC_Payment.*
FROM Payment, Check_Payment, CC_Payment
WHERE Payment.PaymentID = Check_Payment.PaymentID
  AND Payment.PaymentID = CC_Payment.PaymentID
  AND Payment.PaymentID = 1472

Using Super-/Sub-type Tables (cont.)

• Two query approaches

• Use conditional logic

• Use Left/Right Outer Joins

SELECT Customers.CompanyName, Orders.OrderDate
FROM Customers LEFT OUTER JOIN Orders ON Customers.CustomerID = Orders.CustomerID
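Applied to the payment tables, the outer-join approach could look like the sketch below. It runs in Python's sqlite3 module and assumes, purely for illustration, that payment 1472 was made by check. The first query joins the supertype to both subtypes with inner joins and finds nothing; the second uses LEFT OUTER JOIN and returns the payment with the unmatched credit card column left empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Payment (
    PaymentID INTEGER PRIMARY KEY, AccountID INTEGER, PaymentDate TEXT,
    PaymentAmount REAL, PaymentType TEXT);
CREATE TABLE Check_Payment (
    PaymentID INTEGER PRIMARY KEY REFERENCES Payment(PaymentID), CheckNum TEXT);
CREATE TABLE CC_Payment (
    PaymentID INTEGER PRIMARY KEY REFERENCES Payment(PaymentID), CC_Number TEXT);

-- Hypothetical data: payment 1472 is a check, so it has no CC_Payment row.
INSERT INTO Payment VALUES (1472, 100, '2007-06-10', 80.00, 'Check');
INSERT INTO Check_Payment VALUES (1472, '1045');
""")

# Inner joins require a match in every subtype table, so no row qualifies.
inner_rows = conn.execute("""
    SELECT p.PaymentID, ck.CheckNum, cc.CC_Number
    FROM Payment p
    JOIN Check_Payment ck ON p.PaymentID = ck.PaymentID
    JOIN CC_Payment    cc ON p.PaymentID = cc.PaymentID
    WHERE p.PaymentID = 1472
""").fetchall()

# Outer joins keep the supertype row and fill unmatched subtypes with NULL.
outer_rows = conn.execute("""
    SELECT p.PaymentID, p.PaymentType, ck.CheckNum, cc.CC_Number
    FROM Payment p
    LEFT OUTER JOIN Check_Payment ck ON p.PaymentID = ck.PaymentID
    LEFT OUTER JOIN CC_Payment    cc ON p.PaymentID = cc.PaymentID
    WHERE p.PaymentID = 1472
""").fetchall()

print(inner_rows)
print(outer_rows)
```

The conditional-logic approach would instead read PaymentType first and then query only the one subtype table that applies.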

Performance Considerations

• Because of the performance considerations and complexity of Super- and Sub-types you will regularly consider eliminating subtypes

• Roll up their attributes into the super-type and accept the wasted columns

• Arguments for retaining subtypes

– Several unique attributes, especially large (text) ones

– Relatively few records in the subtype (compared to overall number of records)

– Relatively few transactions use the subtype

• Look at vertical partitioning later in the course