DataModels05fr1.ppt 1 - RJL Rev. 050908

The ERA Data Model: Entities, Relations and Attributes

• Specifies each Attribute (property or data field) by its Data Type or permitted set of values.

• Specifies each Entity, Class or Record type in terms of its component attributes, data members or fields, respectively.

• Specifies each Relation type in terms of its component Entity Types or Classes, plus other attributes that depend on the associated entities.

• Three important questions:

– Can Relation components be nested relation instances?

– Can Relation instances have attributes or properties?

– Can an ERA Data Model be specified by a diagram?

DataModels05fr1.ppt 2 - RJL Rev. 050908

Relation Examples (1)

1. A personnel Employee record:

• This represents one employee by a unique primary key for him/her as a Person (or its Employee subclass);

• It also associates that individual employee with values of other specific attributes or properties like age, position, start date, etc.

2. A marriage license:

• This is an object or instance which associates two persons who play the roles of bride and groom with other attributes that depend on both of them, such as marriage time and place.

• The persons are identified by their primary key values.

• These identifiers are ‘foreign keys’ because they identify other ‘Person’ objects, not this relationship object itself.

Question: What is the ‘arity’ of each relation above?
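The two examples above can be sketched as record types. This is a minimal illustration in Python, not the deck's own code; the class names, field names and sample values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Person:
    pkey: int          # primary key identifying this person
    name: str
    age: int

@dataclass(frozen=True)
class MarriageLicense:
    pkey: int          # primary key of the relation instance itself
    bride_fkey: int    # foreign key -> Person.pkey (role: bride)
    groom_fkey: int    # foreign key -> Person.pkey (role: groom)
    place: str         # attribute that depends on both participants
    date: str

# Hypothetical sample data.
persons = {p.pkey: p for p in (Person(1, "Ann", 31), Person(2, "Bob", 33))}
license_ = MarriageLicense(100, bride_fkey=1, groom_fkey=2,
                           place="Ottawa", date="2005-09-08")

def resolve(lic, table):
    """Follow both foreign keys to the Person records they reference."""
    return table[lic.bride_fkey], table[lic.groom_fkey]

bride, groom = resolve(license_, persons)
```

The license stores only the key values of its participants; the Person records live in their own table and are reached by lookup.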

DataModels05fr1.ppt 3 - RJL Rev. 050908

Relation Examples (2)

1. A Parents-to-Child relation:

• Each instance associates two persons (in parent roles) with one of their children (one person in the child role).

2. Parents-to-FamilyUnit (P2FU) and FamilyUnit-to-Child (FU2C) relations:

• Each instance of P2FU associates two Persons with 0 or more children.

• The parents are identified by foreign keys (with distinct names).

• The children are identified by a (possibly empty) set of foreign keys.

• Child fkeys may refer to the child subclass of Person or to Persons in their ‘child’ role.

Questions:

1. What is the ‘arity’ of each relation above?

2. What is the role of each associated individual?

3. How can ‘roles’ distinguish the two parent fkeys?

4. Why not use one set of (two) fkeys to identify both parents?

DataModels05fr1.ppt 4 - RJL Rev. 050908

Relational Algebra

• An algebraic model: The set-theoretic algebra of relations, where each relation is a set of ‘tuples’ over some union of attribute domains, requires closure under these operations: union, intersection, projection, (subset) selection, and ‘joins’ of various types.

Questions:

• Define ‘join’, ‘projection’, ‘closure’.

• How would you join P2FU with FU2C to get tuples with pkeys and ages of each involved person plus the address of their current FamilyUnit?

• What ‘C’ program would provide access to data for each parent and each child of one particular family unit?

- What data structures would simplify this task?
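One possible answer to the join question can be sketched with lists of dictionaries as tables. The slide asks about a ‘C’ program; Python is used here only for brevity. The column names (`fu`, `parent1`, `parent2`, `child`, `address`) and all sample rows are assumptions made for illustration.

```python
# Hypothetical sample tables; every name and value here is illustrative.
person = {1: {"age": 40}, 2: {"age": 38}, 3: {"age": 9}, 4: {"age": 6}}
# P2FU: one row per family unit, with two distinctly named parent fkeys.
p2fu = [{"fu": 10, "parent1": 1, "parent2": 2, "address": "12 Elm St"}]
# FU2C: one row per (family unit, child) pair.
fu2c = [{"fu": 10, "child": 3}, {"fu": 10, "child": 4}]

def select(rows, pred):
    """Selection: keep the rows that satisfy a predicate."""
    return [r for r in rows if pred(r)]

def project(rows, cols):
    """Projection: keep only the named columns of each row."""
    return [{c: r[c] for c in cols} for r in rows]

def natural_join(left, right):
    """Join: combine rows that agree on all shared column names."""
    out = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[c] == r[c] for c in shared):
                out.append({**l, **r})
    return out

def family_members(fu_pkey):
    """Return (pkey, age, address) for each parent and child of one family unit."""
    joined = select(natural_join(p2fu, fu2c), lambda r: r["fu"] == fu_pkey)
    result = set()
    for row in joined:
        for role in ("parent1", "parent2", "child"):
            pk = row[role]
            result.add((pk, person[pk]["age"], row["address"]))
    return sorted(result)
```

Each operation returns another list of rows, which is what closure means here. Note that this inner join drops a family unit with no children; an outer join would be needed to keep it.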

DataModels05fr1.ppt 5 - RJL Rev. 050908

Validating a Relation Instance

• It must be possible to recognize whether a candidate N-tuple is a legal value of some particular N-ary relation.

• This set of legal values may be enumerated as a set of points and/or intervals, computed by an algorithm, or tested for membership by a logical predicate based on a set of rules.

• Examples:

– The relation from an offered Course to its (set of) Prerequisite Courses is a tabulated multi-valued function.

– The Father to Son relation between two Persons must obey the constraint Sex=‘Male’ on any Person having the role of Father or Son.

– The relation isLengthOf(h, Vector(x, y)) must satisfy the Pythagorean Theorem ($h^2 = x^2 + y^2$).
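The three examples above correspond to three styles of membership test: a tabulated lookup, a rule on participant attributes, and a computed predicate. The sketch below is illustrative only; the course codes, the `sex` field name, the floating-point tolerance and the requirement that h be non-negative are assumptions.

```python
import math

# Tabulated multi-valued function (hypothetical course codes).
PREREQS = {"CS201": {"CS101"}, "CS301": {"CS201", "MATH110"}}

def is_prerequisite(course, prereq):
    """Membership by lookup in an enumerated table."""
    return prereq in PREREQS.get(course, set())

def is_father_son(father, son):
    """Membership by rule: both participants must have Sex == 'Male'."""
    return father["sex"] == "Male" and son["sex"] == "Male"

def is_length_of(h, x, y, tol=1e-9):
    """Membership by computation: h^2 == x^2 + y^2, with h assumed non-negative."""
    return h >= 0 and math.isclose(h * h, x * x + y * y, rel_tol=tol, abs_tol=tol)
```

For instance, `is_length_of(5, 3, 4)` accepts the candidate triple and `is_length_of(6, 3, 4)` rejects it.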

DataModels05fr1.ppt 6 - RJL Rev. 050908

Classes and Instances

• A Class extends an abstract data type by allowing multiple instances. A class extends a ‘C’ struct by defining a set of methods or operations that can be applied to that class or to an instance thereof.

• In C++, a struct is equivalent to a class with the restriction that the access mode of all data and function members defaults to public.

• An object is an instance of a class. Each object contains values for each data member in its class declaration.

• In C++, an Entity Type can be declared or type-def’ed as either a ‘struct’ or a class. (A class can also declare ‘class’ methods, which do not affect instance data.)

DataModels05fr1.ppt 7 - RJL Rev. 050908

Entity Types as Classes

• An Entity Type is a named, structured data type whose named components are called attributes, properties or fields.

– Instances of the same type can be stored in a container called a table.

– A set of Entity Instances of different types can be stored separately and referenced by a table of pointers or of foreign keys.

– Entity Type attributes are analogous to [class or instance] data members of the corresponding class.

DataModels05fr1.ppt 8 - RJL Rev. 050908

Class Instances = Objects

• Instances of a Class are called objects.

– Class methods may construct (c’tor) a new instance of that class or compute things that do not depend on a specific instance as an input argument (e.g. count them).

– Instance methods may get (read) or set (update) an attribute value, or do arbitrary computations based on a pre-selected object; in C++ an explicit or default ‘this’ pointer identifies this class instance.

– An Attribute Domain defines the set of legal values for an Attribute; it may be a pre-defined data type or an enumerated set of values or value ranges.

– An Entity Domain is [a subset of] the set-theoretic direct or Cartesian product of its attribute domains.

Question: How can the subset be further constrained, if desired?
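One way to constrain the subset is a predicate applied to the full Cartesian product. The sketch below assumes two small hypothetical attribute domains and an invented business rule; it shows the idea, not a prescribed answer.

```python
from itertools import product

# Hypothetical attribute domains.
AGE = range(0, 4)                     # a value range: 0, 1, 2, 3
POSITION = ("clerk", "manager")       # an enumerated set

def is_legal(age, position):
    """Hypothetical cross-attribute rule: a manager must have age >= 2 (toy units)."""
    return position != "manager" or age >= 2

# The full direct product of the attribute domains.
full_product = list(product(AGE, POSITION))
# The Entity Domain: the subset of the product that passes the rule.
entity_domain = [t for t in full_product if is_legal(*t)]
```

Here the product holds 8 tuples and the constrained Entity Domain holds 6 of them.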

DataModels05fr1.ppt 9 - RJL Rev. 050908

Entity Types and Instances

• Instances of a named Entity Type are called (Entity) Instances.

– The value of an entity instance is specified by assigning a value to each of its component attributes or data members.

– Each attribute has a data type which defines (and is defined by) its ‘Attribute Domain’ (set of legal values).

– An Entity Domain is [a subset of] the set-theoretic direct or Cartesian product of its attribute domains.

– Legitimate attribute values may be from a primitive scalar data type or refer to other entity instances.

– Structured attribute values are also possible within one entity instance.

– A set of Entity Instances (or references to them) may be held in a Container (e.g. List, Vector, Array) type or class.

DataModels05fr1.ppt 10 - RJL Rev. 050908

Entity Sets as Tables

• We will use a vector notation to represent the value of an entity or object as an N-tuple (sequence or ordered list) of values of its attributes or data members.

– This avoids the redundancy of listing N <attributeName, value> pairs which a pure set-theoretic notation requires.

– Like relational databases, but unlike object-oriented databases, we will avoid nested sub-structures by constraining all attributes to be scalar data types.

• Then, a set of instances (of the same entity type) can be represented in a tabular format with one column per attribute and one row for each instance of the entity type.

– This is also the standard representation for a relational database table (for a single relation type).

DataModels05fr1.ppt 11 - RJL Rev. 050908

Event and Entity Types

• An Event Type is a special case of an Entity Type.

• An Event Type declares a message with typed parameters; it is a special case of an Entity Type or Class, for which instances of that Event Type may be created.

• Instances of an Event Type are typically created on the heap, and used to make a remote procedure call, by sending it as a message between distributed concurrent processes.

• We can represent a set of Event Instances (of the same Event Type) in a tabular format with one column per attribute, and one row for each event instance.

• Different EventTypes can be declared as subclasses of a generic EventType class. EventInstances for each subclass of EventType can inherit from their corresponding EventInstance superclass.

DataModels05fr1.ppt 12 - RJL Rev. 050908

Function Prototypes as Event Types

A function [proto]type declares a formal parameter type-list, or signature.

An Event Type also declares a typed list of scalar attributes, which is its ‘signature’.

Both EventTypes and function declarations have a name attribute that is a unique identifier (in its scope).

For an event, this is a ‘class’, not instance, attribute.

A function name can also be a name for its actual argument list as a structured data type.

An Event Instance and a function call must ‘contain’ argument values which agree with its declared ‘signature’.

Event Instance attributes must explicitly identify its sender and receiver; in contrast, function caller and callee identification is implicit due to the shared LIFO stack discipline.

DataModels05fr1.ppt 13 - RJL Rev. 050908

Function Calls as Event Instances

An Event Instance has an argument list like a function call, which is its ‘signature’.

Pointers to these can be collected into a FIFO Event Instance Pointer queue for distributed system communications.

A function call during program execution can be regarded as a special case of an event message, which happens to be materialized on the call stack.

A function call can also be handled (less efficiently) by stacking only one argument: the pointer to an actual argument list created as a struct or class instance on the heap.
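The correspondence between event instances and function calls can be sketched as follows. The `DepositEvent` type, the `deposit` function and the account data are hypothetical, and a single in-process queue stands in for real inter-process messaging.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class DepositEvent:
    """Event Type: explicit sender and receiver, then the typed argument list."""
    sender: str
    receiver: str
    account: int
    amount: float

balances = {7: 100.0}   # hypothetical receiver state

def deposit(account, amount):
    """The function whose parameter list matches the event's argument attributes."""
    balances[account] += amount
    return balances[account]

# Event instances are created on the heap and referenced from a FIFO queue.
queue = deque()
queue.append(DepositEvent("teller", "ledger", account=7, amount=25.0))
queue.append(DepositEvent("atm", "ledger", account=7, amount=5.0))

def dispatch(q):
    """Turn each queued event instance back into an ordinary function call."""
    results = []
    while q:
        ev = q.popleft()                      # first in, first out
        results.append(deposit(ev.account, ev.amount))
    return results

results = dispatch(queue)
```

The event carries `sender` and `receiver` explicitly, while the plain call to `deposit` needs neither because the call stack already identifies caller and callee.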

DataModels05fr1.ppt 14 - RJL Rev. 050908

Relational Algebra as a Set

• A set-theoretic abstract model:

– An N-ary Relation is a Set of relation instances.

– N is called the ‘arity’ or dimension of the relation.

– N indicates how many individual instances are associated by each instance of the relation.

• Each N-ary relation instance is an N-tuple which groups, associates, or relates one specific instance from each of the relation’s N component data types or classes.

– N = minimum number of attribute values in each relation instance;

– N attributes are required to identify the instances that participate (play their respective roles) in the relation;

– There may be other attributes which are properties of the association between related individuals.

DataModels05fr1.ppt 15 - RJL Rev. 050908

Relations with Attributes

• An Attributed N-ary Relation is a K-tuple with K-N other attributes besides its N foreign keys (for some integer K).

– There may be a primary key (pkey) to uniquely identify each instance of the relation. This is mandatory only if lower-level entity instances exist that are related to it by foreign keys.

– There must be N foreign key (fkey) attributes that identify the instances of other Entity Types or relations that participate (play their respective roles) in the relation.

• We will require every relation to have a pkey.

– The remaining K-N-1 ‘non-key’ attributes are properties of the association between related individuals; these should not be computable from (i.e., not ‘functionally dependent on’) fewer than N keys, to avoid duplicating data and the risks of multiple maintenance.

• We will model every relation as attributed, even if there are no actual dependent attributes (K = N+1).

DataModels05fr1.ppt 16 - RJL Rev. 050908

Relations are Sets of Points

• The K-tuples of a Relation R(x,y,z) have values in the Direct Product space XxYxZ of its participating attribute domains.

– In other words, the value of any Relation is a subset of points in the direct product space of its Attribute Domains:

$\mathrm{Range}(\mathrm{Relation}(x,y,z)) \subseteq \mathrm{Range}(x) \times \mathrm{Range}(y) \times \mathrm{Range}(z)$

• An N-ary relation is a set of N-tuples, each of which specifies or contains a reference to an instance of each of its participating entity types.

– Binary relations (N=2) are common.

– Ternary (N = 3) and higher relations are rare.

– Examples?

DataModels05fr1.ppt 17 - RJL Rev. 050908

Relational Algebra

• An algebraic model: The set-theoretic algebra of relations, where each relation is a set of ‘tuples’ over some union of attribute domains, requires closure under these operations: union, intersection, projection, (subset) selection, and ‘joins’ of various types.

Questions:

• Define ‘join’, ‘projection’, ‘closure’.

• How would you join P2FU with FU2C to get tuples with pkeys and ages of each involved person plus the address of their current FamilyUnit?

• What ‘C’ program would provide access to data for each parent and each child of one particular family unit?

- What data structures would simplify this task?

DataModels05fr1.ppt 18 - RJL Rev. 050908

Relation Interpretations:

Example: R(x,y,z) is a ternary relation among elements of N = 3 integer Domains X, Y, Z; if R has no other attributes, each of its instances is a 3-tuple <x,y,z> of integer values for its ‘roles’.

• One possible Interpretation: Each 3-tuple is a Point of 3-dimensional Euclidean space which defines a continuous function z= f(x,y). All values are floats approximating reals.

– There exists at most one triple <x,y,z> for each <x,y>.

– Limit(|f(x+a,y+b) – f(x,y)|) = 0 as a and b approach 0.

– There are uncountably many such points.

– x is East, y is North, and z is Altitude.

• One alternate Interpretation: Each 3-tuple is a point on the unit 3-cube where g(x,y,z) = TRUE.

– There are at most 8 such points {<0,0,0>, …, <1,1,1>}.

– Each coordinate value must be either 0 or 1.

DataModels05fr1.ppt 19 - RJL Rev. 050908

Dimension or Arity

A relation is characterized by its ‘arity’ N, which is the dimension of the relation as an N-tuple or vector of component entity types.

• Arity is independent of the types of entities in an N-ary relation.

• In practice, any or all participating instances may be represented as reference types or foreign keys, instead of value types.

DataModels05fr1.ppt 20 - RJL Rev. 050908

Role and Multiplicity

Two other properties must be specified for each of the N component entity types in the relation:

1. The entity’s multiplicity or cardinality (the min and max number of relation instances in which it may occur).

Example: At most two specific instances of Person, both in their role as ‘Parent’, can be associated with ‘zero to many’ other persons in the role of Child. (A child can be associated with exactly 2 parents at birth and 0, 1 or 2 parents later.)

This requires a set of two Person fkey attributes in the Child type with the same fkey name.

2. The entity’s role or semantic meaning (how it is used, its meaning or significance in the relation).

Example: A child might be associated to exactly one specific instance of Person in the role of Father, and one in the role of Mother.

This requires two distinctly named fkey attributes in the Child type.

DataModels05fr1.ppt 21 - RJL Rev. 050908

Multiplicity (1)

• Multiplicity is a [meta-]property of relations that defines a range for the number of K-tuples in which a specific fkey value (referencing the same entity type) can appear with distinct values for its other fkeys.

• Each N-ary relation has a defined set of N possible multiplicity [meta-]attributes (<minOccurs, maxOccurs> pairs).

– This defines lower and upper bounds on the number of instances of one entity type/role that can be related to instances of the N-1 other types/roles.

– I use the names minmult or mincard and maxmult or maxcard for these bounds.

– XML uses much more descriptive names: minOccurs and maxOccurs.

DataModels05fr1.ppt 22 - RJL Rev. 050908

Multiplicity Notations

• How should this range be specified?

• Unified Modeling Language (UML) is a proposed standard from the Object Management Group’s Analysis and Design Task Force (OMG/ADTF).

• UML uses four possible combinations: 0 or 1 for minmult, and 1 or * for maxmult. This yields four multiplicity ranges:

– 0..* (or just *) denotes minmult = 0 and maxmult = unlimited ;

– 1..* denotes minmult = 1 and maxmult = unlimited;

– 0..1 denotes zero or 1 (minmult = 0, maxmult = 1);

– 1..1 (or just 1) denotes exactly 1 (minmult = maxmult = 1).

• UML notation includes both minOccurs and maxOccurs at each end of a binary relation, e.g.,

[Parent]0..2--------------0..*[Child]

(0..2 because we assume children can be orphans.)
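A multiplicity range can be checked against actual relation instances by counting how often each entity occurs. The sketch below assumes the Parent-to-Child relation is stored as (parent_pkey, child_pkey) pairs; the helper names and sample data are hypothetical.

```python
from collections import Counter

# Hypothetical Parent-to-Child relation instances as (parent_pkey, child_pkey) pairs.
parent_child = [(1, 3), (2, 3), (1, 4)]
parents = {1, 2}
children = {3, 4, 5}   # child 5 has no recorded parents (an orphan)

def occurrences(pairs, position, population):
    """Count, for every entity in `population`, the tuples holding it at `position`."""
    counts = Counter(pair[position] for pair in pairs)
    return {entity: counts[entity] for entity in population}

def within(pairs, position, population, min_occurs, max_occurs=None):
    """True if every count lies in min_occurs..max_occurs (None stands for '*')."""
    return all(
        n >= min_occurs and (max_occurs is None or n <= max_occurs)
        for n in occurrences(pairs, position, population).values()
    )

# 0..2 at the Parent end: each child is linked to between 0 and 2 parents.
parents_per_child_ok = within(parent_child, 1, children, 0, 2)
# 0..* at the Child end: each parent is linked to any number of children.
children_per_parent_ok = within(parent_child, 0, parents, 0, None)
# Tightening the Parent end to 1..2 fails because of the orphan.
no_orphans_ok = within(parent_child, 1, children, 1, 2)
```

The range written at one end of the link is tested by counting tuples per entity at the opposite end, which is why the 0..2 check groups the pairs by child.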

DataModels05fr1.ppt 23 - RJL Rev. 050908

Multiplicity Notations (1)

What is the meaning of 0..M or 0..* ?

– The two symbols 0 and M or * are called the minimum and maximum multiplicity, respectively, of the corresponding role in the relation.

– I used to call this the MinCard:MaxCard notation. [TBD: rename it to minOccurs..maxOccurs, for consistency with XML names for these bounds.]

– UML uses * but older data model notations use M or N instead of *.

Example:

– In a genealogical data model, the many-to-many relation <Parent, Child> has maxOccurs(Parent) and maxOccurs(Child) values 2 and *, respectively. MinOccurs can be 0, 1 or 2 for Parent, depending on the assumptions about parents being dead or undefined. MinOccurs can be 0 or 1 for the Child role, depending on whether Parent includes all adults or only adults with children.

[Ref: $PH/DavidHay/DHay_ComparingDModTechniques.htm ]

DataModels05fr1.ppt 24 - RJL Rev. 050908

Multiplicity Notations (2)

MinCard is replaced by Optionality in many older notations:

– Traditional notations denote maxOccurs and Optionality in separate ways (and not at the same end of a diagrammed relationship link).

– E.g. in a genealogical {Family, Person} data model, for the many-to-many relation [Parent] ------ [Child] maxOccurs(Parent) and maxOccurs(Child) have the values 2 and *, respectively.

– If either role of the relationship can be empty (partial, not total), then instead of minOccurs = 0, a Conditional Optionality is drawn as an ‘o’ on the relation at the end which MAY but need not participate in the relation. If the participation is Mandatory, the ‘o’ is NOT shown (some notations require a ‘|’ instead).

[Ref: $PH/DavidHay/DHay_ComparingDModTechniques.htm ]

DataModels05fr1.ppt 25 - RJL Rev. 050908

Multiplicity Notations (3)

• Some notations use a letter (e.g. M or N) for maxmult instead of *. (M or N is not an actual maximum – it just represents the regular expression’s ‘*’ operator.)

• Other notations are more graphic: they use arrowheads or tails to represent maxOccurs = *;

• Oracle CaseDesigner (and David Hay) use Barker’s notation (dashing the half-line at the end which is optional instead of mandatory).

• Other model notations replace minOccurs = 0 by an ‘o’ and minOccurs = 1 by a bar, across the same end of the link where an arrowhead or tail is drawn iff maxOccurs = *.

[Ref: $PH/DavidHay/DHay_ComparingDModTechniques.htm ]

DataModels05fr1.ppt 26 - RJL Rev. 050908

Multiplicity vs. Cardinality

• Multiplicity used to be called Cardinality, before the Unified Modeling Language became a new de facto standard for the semantics of information meta-models.

– [Ref. 1] Booch/Rumbaugh/Jacobson/Rational Corp.

– [Ref. 2] Object Management Group Analysis & Design Task Force (OMG/ADTF)

• The rationale for the change of name was that the cardinality of each participating entity type/role is easily confused with the size of the relation itself:

DataModels05fr1.ppt 27 - RJL Rev. 050908

Entities have ‘Roles’ in Relations

• A ‘Role’ is a name for the way an entity participates in a Relation; a role name has an associated explanation or semantic meaning of the relation from this entity’s perspective.

• ‘Role’ is also a discriminator for multiple appearances of the same entity type at multiple positions within each N-tuple.

• An entity type may participate in a relation more than once, provided its ‘role’ differs each time:

– Manager is a specific role for some Employees;

– Husband, Father and Son are particular roles of (Male) Persons in Family relationships;

– Origin and Destination are particular roles of Node instances in the binary relation called Edge which, together with a set of Nodes, defines a digraph.

DataModels05fr1.ppt 28 - RJL Rev. 050908

Relations as Characteristic Functions

• Let R(x, y) be a binary relation between two finite (co)domains or data types X and Y.

• R is defined asymmetrically as a multivalued function or mapping (MVM) y = R(x) from Domain X to Range or CoDomain Y.

• R is defined symmetrically by its Characteristic Function, CFR: XxY --> {TRUE, FALSE}. CFR(x,y) is a Boolean expression which defines the exact subset of XxY for which R is TRUE.

• The rules defining the CFR are usually extensible at run-time: they supply a recognition rule or criterion for recognizing membership in R.

DataModels05fr1.ppt 29 - RJL Rev. 050908

Example of MVF and CFR:

• F = Domain Female; M = CoDomain Male.

• R = Relation ‘HusbandOf’ from F to M

• Assume the relation is fully defined

• 1. Relation as Multi-Valued Function (MVF)

– R(x,y) means “y = HusbandOf(x)” (y ∈ M).

– R(x) is generally multi-valued:

– Example: IsOrWasHusbandOf(y,x)?

– In general, R maps X into PowerSet(Y).

• 2. Relation as Characteristic Function:

– CFR(y, x) is “IsHusbandOf(y, x)?”.

– CFR maps MxF into {TRUE, FALSE}.

• y ∈ HusbandOf(x) iff IsHusbandOf(y, x) is True.

– Alternate way to write CFR: Y.IsHusbandOf.X.
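The two views of the same relation can be sketched over one stored set of pairs. The names and sample pairs are hypothetical, and the data models the ‘is or was husband of’ reading so that the function is genuinely multi-valued.

```python
# Hypothetical stored relation: (x, y) with x in F and y in M.
pairs = {("ann", "bob"), ("ann", "carl"), ("dee", "ed")}
F = {"ann", "dee", "zoe"}
M = {"bob", "carl", "ed"}

def husbands_of(x):
    """MVF view: maps x in F to a subset of M (an element of PowerSet(M))."""
    return {y for (a, y) in pairs if a == x}

def is_husband_of(y, x):
    """CFR view: maps a pair from M x F to TRUE or FALSE."""
    return (x, y) in pairs

# The equivalence stated on the slide, checked over every pair in M x F.
equivalent = all(
    (y in husbands_of(x)) == is_husband_of(y, x) for x in F for y in M
)
```

`husbands_of("zoe")` returns the empty set, which is how the MVF view represents an element with no related partner.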

DataModels05fr1.ppt 30 - RJL Rev. 050908

CFR as a Database Query

• CFR(x, y) is often specified as a query on a ‘temporal’ database of historical facts:

• E.g., CFR(x,y) = “( (There exists a valid Marriage License between x and y) and (There exists no subsequent record of Divorce between x and y) )”.

DataModels05fr1.ppt 31 - RJL Rev. 050908

Advantages of Characteristic Function Representation:

• CFR has the advantages of symmetry and bilateral navigability (relation inverse exists):

• We can iterate over all members of R by using an SQL-style statement:

– “Select x from X, y from Y where CFR(x, y) == TRUE”.

• We can iterate over all y related to a given xo or over all x related to a given yo:

– “Select y from Y, x from X where x==xo and CFR(x, y) == TRUE”.

– “Select x from X, y from Y where y == yo and CFR(x, y) == TRUE”.

– Either SQL Select command above returns a multi-valued set of related instances.
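The SQL-style selections above can be sketched as comprehensions over two finite domains. The domains and the divisibility rule used as the characteristic function are assumptions chosen only to make the example concrete.

```python
X = range(1, 5)     # hypothetical domain {1, 2, 3, 4}
Y = range(1, 9)     # hypothetical codomain {1, ..., 8}

def cfr(x, y):
    """Hypothetical characteristic function: TRUE when x divides y."""
    return y % x == 0

# Iterate over all members of R.
members = [(x, y) for x in X for y in Y if cfr(x, y)]

def related_y(x0):
    """All y related to a given x0 (forward navigation)."""
    return [y for y in Y if cfr(x0, y)]

def related_x(y0):
    """All x related to a given y0 (inverse navigation)."""
    return [x for x in X if cfr(x, y0)]
```

Both directions use the same predicate, which illustrates the symmetry claimed for the characteristic function; the cost is a scan of the whole domain unless an index is added.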

DataModels05fr1.ppt 32 - RJL Rev. 050908

Primary Keys as Indices

• Let integer coordinates i = pk(xi) and j = pk(yj) index the elements of X and Y, and redefine CFR(x,y) as CFK(i,j) in {TRUE, FALSE}.

• Then members of R(x, y) correspond 1 to 1 with integer pairs or 2-D points where CFK(i,j) = TRUE.

• If x and y are two entity or record types in a database, the integer variables i and j play the role of primary keys on the Domain and CoDomain.

– Ref: 96f523 project on binary Join extension of chgen.

– Question: What if the relation is ternary (N=3)?

• Relations may represent a participating entity instance by reference, as a foreign key or fkey, whose value matches the pkey of the referenced object.

DataModels05fr1.ppt 33 - RJL Rev. 050908

Partial vs. Total Relations

• If ni >= 1 for all xi (where ni is the number of elements of Y related to xi), then each xi has at least one related yij in R(xi), and the relation R(x,y) is, by definition, total (i.e., totally defined).

• If ni = 0 for at least one xi, then that xi has no related yij and the relation is, by definition, partial (i.e., its value is not defined for some x in its domain).

• The same reasoning applies symmetrically for the inverse relation Rinv(y,x).

DataModels05fr1.ppt 34 - RJL Rev. 050908

Relations Are Multi-valued

• For each xi in X, let Yi be the subset of Y whose members are related to xi by some relation R(x,y).

– Yi has ni members iff xi is related to ni members of Y.

• Likewise for each yj there is a subset Xj of X whose members are all related to yj.

– Xj has mj members iff mj elements of X are related to yj.

• If ni > 1 for at least one xi then this relation is a multivalued (generalization of a) function.

– Do not call it a function unless you explicitly identify it as multivalued.
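The counts ni and the total, partial and multivalued properties can be computed directly from a set of pairs. This is a minimal sketch; the function names and the sample relation are hypothetical.

```python
def fanout(pairs, domain):
    """Return ni for each xi: how many members of Y are related to it."""
    return {x: sum(1 for (a, _) in pairs if a == x) for x in domain}

def is_total(pairs, domain):
    """Total: every xi has at least one related y (ni >= 1 for all i)."""
    return all(n >= 1 for n in fanout(pairs, domain).values())

def is_multivalued(pairs, domain):
    """Multivalued: some xi has more than one related y (ni > 1 for some i)."""
    return any(n > 1 for n in fanout(pairs, domain).values())

def inverse(pairs):
    """Rinv(y, x): swap the two positions of every pair."""
    return {(y, x) for (x, y) in pairs}

# Hypothetical sample relation.
X = {"x1", "x2", "x3"}
Y = {"y1", "y2"}
R = {("x1", "y1"), ("x1", "y2"), ("x2", "y1")}
```

With this data R is partial (x3 has no related y) and multivalued (x1 has two), while its inverse is total over Y.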

DataModels05fr1.ppt 35 - RJL Rev. 050908

Sparse Matrix Representation

• If the size of relation R(X, Y) is ‘small’ compared to the size M*N of the direct product XxY, the relation is called sparse (a small number of matrix elements have a non-zero value).

• A sparse matrix can be efficiently represented as a list of ordered pairs of related elements <x, y>.

– Note that this list can be in row-major order or column-major order, but not both at the same time.

• If instances of X and/or Y are large in size (i.e., many scalar property values are needed to specify an instance of either type), then R is more efficiently represented as a set of pairs of references <x*, y*> to elements of X and Y.

– Example: the set of all road segments which directly join two towns or road intersections on a map.
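The trade-off between the dense matrix and the list of pairs can be sketched as follows. The town list and road pairs are illustrative assumptions, and integer indices stand in for the references <x*, y*>.

```python
X = ["Ayr", "Bath", "Cork", "Deal"]   # M = 4 towns (illustrative data)
Y = X                                  # roads relate towns to towns
M, N = len(X), len(Y)

# Sparse form: one (i, j) index pair per related <x, y>.
pairs = [(0, 1), (1, 2), (2, 3)]

def to_dense(pairs, m, n):
    """Expand the pair list into an m-by-n 0/1 matrix."""
    matrix = [[0] * n for _ in range(m)]
    for i, j in pairs:
        matrix[i][j] = 1
    return matrix

def density(pairs, m, n):
    """Fraction of the M*N direct-product cells that hold a related pair."""
    return len(pairs) / (m * n)

# The same list can be ordered one way or the other, not both at once.
row_major = sorted(pairs)                                  # by i, then j
column_major = sorted(pairs, key=lambda p: (p[1], p[0]))   # by j, then i
```

Here 3 of the 16 cells are occupied, so the pair list stores 3 entries where the dense matrix stores 16.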

Directed Graph Example

The Edge set E of a directed graph or digraph is a binary relation on its Node set N:

Questions:

• What are the Co-Domains?

• Can E be a ternary relation?

• How is the Edge matrix defined?

• When is E a sparse or dense relation?

• Can an Edge relate a Node instance to itself?

• Can an undirected graph be a special case?

• What predicate constrains a digraph to be a DAG (directed acyclic graph)?

• How to draw the relationship(s) between Node and Edge?

[Diagram: entity types Node and Edge]
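One possible answer to the DAG question is a predicate that tests the edge list for directed cycles. The sketch below uses Kahn's algorithm (repeatedly remove nodes with in-degree 0), which is a standard technique and not something the slide prescribes. The names Edge, is_dag and MAX_NODES are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 64  /* assumed bound for this sketch */

/* An edge is an ordered pair of node indices <from, to>. */
typedef struct { size_t from; size_t to; } Edge;

/* Returns 1 if the digraph (n nodes, edge list e) has no directed cycle.
   Kahn's algorithm: repeatedly remove nodes whose in-degree is 0. */
int is_dag(size_t n, const Edge *e, size_t nedges)
{
    size_t indeg[MAX_NODES] = {0};
    size_t queue[MAX_NODES];
    size_t head = 0, tail = 0, removed = 0;

    assert(n <= MAX_NODES);
    for (size_t k = 0; k < nedges; k++) {
        assert(e[k].from < n && e[k].to < n);
        indeg[e[k].to]++;
    }
    for (size_t i = 0; i < n; i++)
        if (indeg[i] == 0)
            queue[tail++] = i;

    while (head < tail) {
        size_t u = queue[head++];
        removed++;
        /* Removing u lowers the in-degree of every node it points to. */
        for (size_t k = 0; k < nedges; k++)
            if (e[k].from == u && --indeg[e[k].to] == 0)
                queue[tail++] = e[k].to;
    }
    return removed == n;  /* nodes left over lie on or behind a cycle */
}
```

An edge that relates a node to itself is a cycle of length one, so is_dag rejects it.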

Binary Relations as Subsets

• A binary relation is a set of ordered pairs, and a subset of a direct product (DP) of the two sets of instances which it pair-wise relates.

• Example: IsMarriedTo(man, woman) is a subset of the direct product Male x Female.

– This relation is sparse, not dense.

– It has an inverse relation IsMarriedTo(woman, man).

– The inverse (in this case) has the same name but a different argument type sequence or signature (i.e., it is polymorphic).

• The rules for membership in Male, Female and IsMarriedTo must be defined very precisely.

– Question: What alternative rules are possible?

Structured or Composite Data Types as Relations

• Every C-struct, every C++-class ‘with state’, and every record declaration declares a composite or structured data type to represent entity instances.

• This type is also a relation over the direct product set of its M >= 1 ‘member’ or component data types.

• Every variable of a composite type must have a value that is an M-tuple of values, one from each of its component types (including reference types).

• Each instance of an entity class or type must also be a tuple of values belonging to its respective member data types.

E.F. Codd’s RM/T Data Model

E.F. Codd formalized relational algebra using set theory.

In RM/T, Codd defined three types of entities:

• kernel or ‘real-world’ entities, which exist independently of others;

• characteristic entities, which ‘characterize’ another entity and cannot exist without it. A characteristic entity is used to relate an attribute domain element to an entity type that has this domain type as one of its attributes or properties;

• associations, which represent N-ary relations among other entities of all three types.

We use Codd’s RM/T model in COOL-FAQ.

Relation Implementation

• Computer-aided design programs (e.g., CASE tools) use hierarchical data models to describe complex objects with 1:M parent-child or container-component relationships.

• These tools often need to access multiple instances of some child or component type BB from one instance of a related parent or composite container or M-to-1 associated type AA.

• A fast way to do this is to traverse a linked-list structure from an instance of struct AA to its related instances of struct BB.

• Macros in chgen or templates in gencpp can be used to replicate generic declarations and functions.

• When the database is too large for memory-resident tables or lists, caches and/or indices must be maintained on disk and used efficiently.
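The linked-list traversal from an AA instance to its related BB instances might look like the following C sketch. The list layout that chgen or gencpp actually generates is not shown on this slide, so the member names (first_child, parent, next_sibling) and the NULL-terminated list are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct BB;

/* Parent (container) type: holds the head of its child list. */
struct AA {
    int        id;
    struct BB *first_child;   /* first related BB instance, or NULL */
};

/* Child (component) type: points to its parent and to its next sibling. */
struct BB {
    int        id;
    struct AA *parent;
    struct BB *next_sibling;  /* next BB with the same parent, or NULL */
};

/* Attach child b at the front of a's child list. */
void aa_add_child(struct AA *a, struct BB *b)
{
    assert(a != NULL && b != NULL);
    b->parent = a;
    b->next_sibling = a->first_child;
    a->first_child = b;
}

/* Visit every BB related to a; returns the number visited. */
size_t aa_count_children(const struct AA *a)
{
    size_t n = 0;

    assert(a != NULL);
    for (const struct BB *b = a->first_child; b != NULL; b = b->next_sibling)
        n++;
    return n;
}
```

Reaching all children of one parent costs one pointer hop per child, with no search through the whole BB table.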

Pkeys and fkeys

• Every row of a table contains a unique primary key, or ‘pkey’ as its first field. This key identifies the record or table type (2 or 4 upper-case letters), a version number (2 or 3 digits) and a row number (4 or 5 digits).

• Table types optionally include foreign-key (fkey) fields. The value of an fkey must match the pkey value of its associated container or superclass ‘parent’.

• Non-key attributes, fkeys to ‘is_a’ superclasses, and fkeys to ‘is-part-of’ containers are distinguished by different values for the is_key meta-attribute in meta-table TA:

fKey type:      Not a Key    isPartOf    is_a
is_key value:   0            +/- 1       s
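A pkey can be split into its three parts with a few lines of C. The slide allows several field widths; this sketch assumes only the shortest layout (2 letters, 2-digit version, 4-digit row number, 8 characters in all), which fits keys such as TT000001 used elsewhere in these slides. The names PKey and pkey_parse are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Decoded parts of a pkey (field names are illustrative). */
typedef struct {
    char type[3];   /* two upper-case letters + NUL */
    int  version;   /* two digits */
    int  row;       /* four digits */
} PKey;

/* Parse the assumed 8-character layout "TTvvrrrr".
   Returns 1 on success, 0 if the text does not match the layout. */
int pkey_parse(const char *text, PKey *out)
{
    assert(text != NULL && out != NULL);
    if (strlen(text) != 8)
        return 0;
    for (int i = 0; i < 2; i++)
        if (!isupper((unsigned char)text[i]))
            return 0;
    for (int i = 2; i < 8; i++)
        if (!isdigit((unsigned char)text[i]))
            return 0;

    out->type[0] = text[0];
    out->type[1] = text[1];
    out->type[2] = '\0';
    out->version = (text[2] - '0') * 10 + (text[3] - '0');
    out->row = 0;
    for (int i = 4; i < 8; i++)
        out->row = out->row * 10 + (text[i] - '0');
    return 1;
}
```

The longer layouts (4 letters, 3-digit version, 5-digit row) would need an explicit rule for where each field ends, since the digit runs are otherwise ambiguous.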

Organization of VMNetDB

• Maintenance of efficient linked lists is a guiding principle behind the classical network and new OO databases.

• GEN declares and maintains a circular list for every fkey attribute that it finds in the schema (i.e. for every 1:M relation link on the schema diagram).

• When pr_load reads any database into virtual memory (VM), the links become VM address pointers. The result is a network data structure called VMNetDB.
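The step from external key values to VM address pointers can be sketched as follows. This is not the pr_load source, which is not shown here; the types Parent and Child, the function resolve_links and the linear search are assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Parent {
    char pkey[9];             /* external key text, e.g. "SU000001" */
} Parent;

typedef struct Child {
    char    pkey[9];
    char    fkey[9];          /* external key of the parent row */
    Parent *parent;           /* VM pointer, filled in after loading */
} Child;

/* Set each child's parent pointer by matching its fkey text against the
   parents' pkeys. Returns the number of children whose fkey matched nothing. */
size_t resolve_links(Parent *parents, size_t np, Child *children, size_t nc)
{
    size_t unresolved = 0;

    assert(parents != NULL && children != NULL);
    for (size_t c = 0; c < nc; c++) {
        children[c].parent = NULL;
        for (size_t p = 0; p < np; p++) {
            if (strcmp(children[c].fkey, parents[p].pkey) == 0) {
                children[c].parent = &parents[p];
                break;
            }
        }
        if (children[c].parent == NULL)
            unresolved++;
    }
    return unresolved;
}
```

After this pass, following a link is a pointer dereference and no longer a key comparison. A loader for a large database would replace the inner loop with an index or hash lookup.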

Tables in VMNetDB

• A table definition in the schema represents a set of rows that are instances of its contained record type and a set of columns that contain attribute values.

• This table definition can be converted by chgen to a ‘C’ struct declaration and threaded lists for its relationships.

• Data types available for table columns include primary and foreign key, int, float, word, text.

• A Data Structure Diagram (DSD) is an ERA diagram after all M:N relationships have been converted to Associative Entity tables.

• A DSD can be drawn by any drawing tool and a corresponding chgen input schema.sch file can be created using any text editor.

Super-to-Subclass or Gen-Spec Relations

• Relational databases (RDB’s) and ERA data models support containment or aggregation (isPartOf) relations.

• Object-Oriented databases (OODB’s) also support inheritance by means of superclass - subclass (is-a) relations.

• Is-a relations are also called generalization - specialization (gen-spec) relations.

• UML and CDIF use different drawing conventions to distinguish gen-spec from isPartOf relations in EERA Diagrams. (BDE uses CDIF’s convention.)

Schemas from ERA Diagrams

• A schema ER-diagram drawn with BDE is saved as a persistent database file. The latter can be automatically converted into meta-tables TT, TA and/or schema.sch via 96s523/bde2tt1/{b2t,t2s}.

• A schema diagram drawn with another tool (e.g., PowerPoint or Idraw) must be manually edited into schema.sch. The latter cannot be automatically checked to see if it corresponds to the drawing.

• In either case, full attribute descriptions must be added at some point (by appending HA text to HN nodes in bde or by post-editing tables TT and TA or file schema.sch). Each attribute line in schema.sch must contain the 5-tuple <name, altname, format, iskey, description>.

Parent and Child Roles

• A table may participate in any number of ‘is-part-of’ relations or associations in either of two ways: as parent container or child component (but not both in the same relation).

• A table’s child roles are explicitly identified by the fact that it contains a foreign key or ‘fkey’ field for each child role (e.g. ‘XXid’ identifies the parent type as ‘XX’).

• Chgen finds out a table’s parent roles indirectly, by detecting the parent table type in the name of some fkey attribute in another (child) table.

• Chgen does not support cyclic schemas in which an fkey references the table type which contains it.
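Detecting the parent type from an fkey name can be sketched in C as below. This is not chgen's code: the function name fkey_parent_type is hypothetical, table abbreviations are assumed to be exactly two upper-case letters, and an optional digit suffix is accepted to cover names of the form AAid1 and AAid2.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* If attr_name looks like an fkey name "XXid" or "XXidN" (XX = two upper-case
   letters, N = optional digits), copy "XX" into parent[3] and return 1.
   Otherwise return 0 and leave parent untouched. */
int fkey_parent_type(const char *attr_name, char parent[3])
{
    size_t len;

    assert(attr_name != NULL && parent != NULL);
    len = strlen(attr_name);
    if (len < 4)
        return 0;
    if (!isupper((unsigned char)attr_name[0]) ||
        !isupper((unsigned char)attr_name[1]))
        return 0;
    if (attr_name[2] != 'i' || attr_name[3] != 'd')
        return 0;
    for (size_t i = 4; i < len; i++)
        if (!isdigit((unsigned char)attr_name[i]))
            return 0;

    parent[0] = attr_name[0];
    parent[1] = attr_name[1];
    parent[2] = '\0';
    return 1;
}
```

A table's own pkey has the same ‘XXid’ shape, so a caller scanning a table for child roles would skip the first field before applying this test.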

Drawing Relations:

• 1:M relations are drawn as directed links between nodes that represent parent and child entity types on a DSD. An arrow head OR arrow tail denotes the child end.

• Either half of the link may be dashed or solid. Dashed link ends indicate mincard = 0 (optional). Solid link ends indicate mincard = 1 (mandatory). Source code from GEN does not yet enforce this distinction.

• bde2sch creates one foreign (f)key field in the child table for each arrow head (or tail) that points into (or out of) the child table type, respectively.

• Chgen creates distinct fkey names for two links connecting the same table pair (AAid1 and AAid2 in table BB in the diagram).

[Diagram: parent table AA (pkey AAid) joined by two links to child table BB (fields BBid, AAid1, AAid2)]

Inheritance Relations

• A table may also play the role of a superclass ‘generalization’ or subclass ‘specialization’ in an ‘inheritance’ relation.

• This relation associates exactly one instance of the superclass generalization type AA with one of a set of subclass or specialization types BB, CC, etc.

[Diagram: superclass AA with subclasses BB and CC]

Gen-Spec Relation Notation

Although a gen-spec relation is actually 1:1 among instances, its schema diagram must describe a set of subclass types that specialize their superclass type.

Therefore, a different link drawing style is used:

[Diagram: superclass AA above a crossbar; subclasses BB and CC below it]

• The superclass is placed above a crossbar, and each of its subclasses is placed below the crossbar; vertical lines connect all classes to the crossbar between them.

• UML (but not CDIF) uses an open triangle on one link to denote the entity with the superclass role. The triangle is placed at the superclass end of the link, as shown at right.

Notation for Inheritance

• An inheritance relation is indicated on a data structure diagram by a T-shaped branch structure from the superclass type to each subclass type (with no arrow heads or tails).

• BDE supports this EERA extension of ERA diagrams. The cross-bar is a special node type, identified by its height-to-width ratio.

[Diagram: superclass AA joined by a T-shaped branch to subclasses BB, CC and DD]

(This implies a mutually exclusive relationship, not a 1:M parent-child relation, among instances.)

Reflective databases:

• GEN (v10) supports reflective or self-describing databases. The self-description is called a meta-schema*:

• Meta-table TT describes table or record types. Meta-table TA defines key and non-key attributes. The relation TT-->TA is 1:many.

• A reflective database can process its own description as data, at runtime.

[Diagram: TT (table descriptions) --> TA (attribute descriptions)]

Examples: the names of fields in a particular table can be used to print a header row of column labels over a tabular report containing record instances as rows; format rules can be checked (validated) before accepting input data.

_____

*Tables TT and TA must be read-only; changing TT or TA modifies database format and requires re-compilation.
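The header-row example can be sketched in C by scanning TA-rows for one table and joining their names. The reduced TARow type and the function header_row are hypothetical; real TA-rows carry more fields than the two used here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One TA-row, reduced to the fields this sketch needs. */
typedef struct {
    const char *ttid;   /* fkey to the TT-row of the owning table */
    const char *name;   /* attribute (column) name */
} TARow;

/* Write the column names of table `ttid` into buf, separated by tabs.
   Returns the number of columns written. */
size_t header_row(const TARow *ta, size_t nta, const char *ttid,
                  char *buf, size_t bufsize)
{
    size_t ncols = 0, used = 0;

    assert(buf != NULL && bufsize > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < nta; i++) {
        int w;
        if (strcmp(ta[i].ttid, ttid) != 0)
            continue;                      /* attribute of another table */
        w = snprintf(buf + used, bufsize - used, "%s%s",
                     ncols ? "\t" : "", ta[i].name);
        if (w < 0 || (size_t)w >= bufsize - used) {
            buf[used] = '\0';              /* drop the partial column and stop */
            break;
        }
        used += (size_t)w;
        ncols++;
    }
    return ncols;
}
```

The report code never names a column itself; it reads the names from the metadata at run time, which is what makes the database reflective.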

Metatables TT and TA

• Tables TT and TA together define a database.

• Each row of table TT describes one table type.

• Each TA-row describes one field of one table.

• Each TA-row contains these field meta-attributes: <name, altname, format, iskey, description>

– Tables TT and TA can reside in the application database, and can be processed by chgen macros, because they have the same format as other tables.

– Tables TT and TA are self-describing: one TT-row describes table TT; another TT-row describes table TA; every TT-row has a set of TA-child rows.

– Run ‘chgen -meta ...’ to generate tables TT and TA.

[Diagram: TT --> TA]

Schemas and Gencpp

• Gencpp is a replacement for chgen that generates C++ code skeletons, taking data member definitions from DSD models, and (TBD) method definitions from state models.

• Each table definition in the schema will be converted by gencpp into a declaration of three or more classes:

– a corresponding C++ class whose data members are the table’s attributes and whose instances are the table rows;

– an instantiation of a generic container class for the table itself;

– a linked list class for each parent-child relationship that appears in the schema.

– TBD: Add action (method) names and signatures as attributes in schema diagrams.

Intension vs. Extension

Application database ‘intension’:
• is a set of type definitions for the database
• describes the format of each database table
• is described in schema.sch (input to GEN)
• is described in meta-tables TT and TA
• is an output of chgen (v10 emits TT and TA)
• is also called metadata (data about data)

Application database ‘extension’:
• is the actual content or ‘population’ of a database
• contains actual values of table rows or instances
• has record types that are described in table TT
• has data fields that are described in table TA

Example (Tree Schema SU->WH->IT)

Supplier SU (non-key fields: vendor name, expiration date of price quotation)
    |
    v
Warehouse WH (non-key fields: phone number and address)
    |
    v
Item IT (non-key fields: ourItemNo, vendorPartNo, unitPrice, unitQuantity, and description)

Only one warehouse of one supplier can supply a particular part in this (over-simplified) model.

Each arrow implies a foreign key in each child table record.

Schema as Drawn by BDE

Supplier SU /* a supplier of some items we buy */
    name NA c24 0 /* name of company we buy from */
    expDate NA c10 0 /* last date price quotes are valid */

Warehouse WH /* where vendor stores items */
    phone NA c10 0 /* warehouse phone number */
    address NA t80 0 /* address of warehouse */

Item IT /* an item supplied by a vendor */
    ourItemNo NA i4 0 /* our integer part identifier */
    vendorCode NA c12 0 /* vendor’s item code */
    unitPrice NA i4 0 /* vendor price in $/100 */
    unitQuantity NA i4 0 /* quantity in package */
    itemDesc NA i4 0 /* item description */

Top line of each node is HN.name text; rest are HA.hlabel text; new schema supports multi-word hlabel text; border=HN; dashed line=TBD.

Schema produced by b2t + t2s:

Supplier SU /* a supplier of some items we buy */
{ . . . }
Warehouse WH /* where vendor stores items */
{
    WHid NA c8 1 /* primary key (always first) */
    SUid NA c8 1 /* foreign key from link */
    phone NA c10 0 /* warehouse phone number */
    address NA t80 0 /* address of warehouse */
}
Item IT /* an item stored in vendor’s warehouse */
{ . . . }

This format may not be reproduced exactly. Attributes are shown only for table WH. Full-line comments are ignored by chgen.

Struct declared by chgen:

struct WH /* where vendor stores items */
{
    hcg_key WHid;          /* 32bit primary key, from table type */
    hcg_key SUid;          /* 32bit foreign key, from link source */
    char phone[11];        /* warehouse phone number + \n */
    char address[81];      /* address of warehouse + \n */
    /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
    struct SU* SUid_pp;    /* direct pointer to WH parent */
    dummy_ptr SUid_fpp;    /* forward pointer via siblings to parent */
    dummy_ptr ITid_fcp;    /* pointer to first IT-child of this WH */
    struct WH* next;       /* pointer to next row of table WH */
};
/* The dashed line separates external fields declared in schema.sch from pointers in VMNetDB. These are declared in schema.h. Pr_load derives pointer values from external pkey and fkey values. Struct types SU and IT are similar, but need fewer pointers. Back-ptrs SUid_bpp and ITid_bcp are suppressed if you run ‘chgen -nobp ...’. */

MetaSchema Tables TT and TA

TTid TableName TableAbbrev TableDescription
TT000001 Supplier SU /*a supplier of some items we buy */
TT000002 Warehouse WH /* where vendor stores items */
TT000003 Item IT /* an item supplied by a vendor */

TAid TTid name altName fmt is_key description
TA000001 TT000001 SUid NA c8 1 /* unique pkey of SUpplier */
TA000002 TT000001 name NA c24 0 /* name of company we buy from */
TA000003 TT000001 expDate NA c10 0 /* last date price quotes are */
TA000004 TT000002 WHid NA c8 1 /* unique pkey of WareHouse */
TA000005 TT000002 SUid NA c24 0 /* unique fkey(xref to SUpplier*/
TA000006 TT000002 phone NA c10 0 /*warehouse phone number */
TA000007 TT000002 address NA t80 0 /* address of warehouse */
TA000008 TT000003 ITid NA c8 1 /* unique pkey of ITem */
TA000009 TT000003 WHid NA c8 1 /* unique fkey (xref to WareHou*/
TA000010 TT000003 ourItemNo NA i4 0 /* our integer part identifier*/
TA000011 TT000003 vendorCode NA c12 0 /* vendor’s item code */
TA000012 TT000003 unitPrice NA i4 0 /* vendor price in $/100 */
TA000013 TT000003 unitQuant NA i4 0 /* quantity in packag */
TA000014 TT000003 itemDesc NA i4 0 /* item description */

Each table type is in a row of meta-table TT with its TT-row identifier or pkey TTid.
Each attribute of a table type is in a row of meta-table TA with its TA-row pkey TAid.
Each TA-row has a cross-reference fkey TTid to the TT-row of its containing table type.
The remaining columns of table TA are copied directly from schema.sch.

Meta-tables TT, TA are Self-Describing:

TTid* tableName ttAbbrev ttDescription
TT000001 TableType TT /* meta-table that defines tables */
TT000002 TableAttribute TA /* meta-table that defines attributes */

TAid TTid name altName fmt is_key description
TA000001 TT000001 TTid NA c8 1 /* unique pkey of TT-row */
TA000002 TT000001 tableName NA c32 0 /* Long name of this table */
TA000003 TT000001 ttAbbrev NA c2 0 /* abbreviated name (UCltrs)*/
TA000004 TT000001 ttDescription NA c24 0 /* meaning of table */
/*----------------------------------------------------------------*/
TA000005 TT000002 TAid NA c8 1 /* unique pkey of TA-row */
TA000006 TT000002 TTid NA c8 1 /*unique fkey xref to TableType */
TA000007 TT000002 name NA t80 0 /* name of attribute (field) */
TA000008 TT000002 altName NA c24 0 /* unique pkey of ITem */
TA000009 TT000002 fmt NA c4 0 /* field format: i4,f4,f8,c8,t80..*/
TA000010 TT000002 is_key NA c3 0 /* values: nonkey:0; key:1,-1,s */
TA000011 TT000002 description NA t80 0 /* meaning of table field */

Descriptions of meta-tables TT and TA occupy rows 1 and 2 of meta-table TT.
Meta-attributes of tables TT and TA appear in rows 1..11 of meta-table TA.
Each TA-row includes a foreign key TTid to identify its parent row 1 or 2 of table TT.
Other attributes of tables TT and TA are merely copied from schema tables TT and TA.

Merging Sub-schemas

• A database converter (bde2sch, bde2SM, etc.) must combine an input data model or subschema with an output data model or subschema to define a common schema with separate input and output views.

• Input and output schema.sch files must be concatenated before running chgen, which accepts exactly one schema.sch and emits tables TT + TA.

• Concatenating independent metaschema tables derived from independent schema diagrams (e.g., two TT-tables with same TTid for distinct ttAbbrev’s) would require renaming all TTid keys of one subschema to avoid overlap. (Combining two bde diagram files into one would have the same problem and solution).
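The renaming step can be sketched as a function that shifts the numeric part of a key by a fixed offset. This is an illustration and not part of chgen or the converters: the function name rekey is hypothetical, and keys are assumed to have the 8-character form of two upper-case letters followed by six digits, as in TT000001.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Shift the numeric part of a key such as "TT000002" by `offset`, keeping the
   two-letter prefix, and write the result to out[9].
   Returns 1 on success, 0 if the key is malformed or the result does not fit
   in six digits. */
int rekey(const char *key, long offset, char out[9])
{
    long number = 0;

    assert(key != NULL && out != NULL);
    if (strlen(key) != 8)
        return 0;
    if (!isupper((unsigned char)key[0]) || !isupper((unsigned char)key[1]))
        return 0;
    for (int i = 2; i < 8; i++) {
        if (!isdigit((unsigned char)key[i]))
            return 0;
        number = number * 10 + (key[i] - '0');
    }
    number += offset;
    if (number < 0 || number > 999999L)
        return 0;
    snprintf(out, 9, "%c%c%06ld", key[0], key[1], number);
    return 1;
}
```

When the TTid pkeys of one subschema are shifted, every TTid fkey in its TA-rows must be shifted by the same offset so that each attribute still points at its own table type.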

Constraints on Schema Unions

• Always declare the (constant) tables TT and TA first, then other shared tables, in schema.sch.

• Pkeys must be disjoint and invariant over applications which share tables.

• New tables can be appended at the end of the schema (e.g. when adding new subclasses).

• Pkeys and table abbrevs must remain unique and invariant as the schema grows.

• Table Abbrevs correspond 1:1 to TTid values. Both are type indicators with global scope and should not be altered over the lifetime of an application.

• Table Attribute names can be duplicated since they have local (table) scope.

Compression (Breadth-First)

BB TT000001 3
    Supplier SU /*a supplier of some items we buy */
    Warehouse WH /* where vendor stores items */
    Item IT /* an item supplied by a vendor */
EB TT 3
BB TA000001 TT000001 3
    SUid NA c24 0 /* unique pkey of SUpplier */
    name NA c24 0 /* name of company we buy from */
    expDate NA c10 0 /* last date price quotes are */
EB TA 3
BB TA000004 TT000002 4
    WHid NA c24 0 /* unique pkey of WareHouse */
    SUid NA c24 0 /* unique fkey(xref to SUpplier */
    phone NA c10 0 /*warehouse phone number */
    address NA t80 0 /* address of warehouse */
EB TA 4
BB TA000008 TT000003 7
    ITid NA c24 0 /* unique pkey of ITem */
    WHid NA c24 0 /* unique fkey (xref to WareHou */
    ourItemNo NA i4 0 /* our integer part identi */
    vendorCode NA c12 0 /* vendor’s item code */
    unitPrice NA i4 0 /* vendor price in $/100 */
    unitQuantity NA i4 0 /* quantity in packag */
    itemDesc NA i4 0 /* item description */
EB TA 7

*/

Compression (Depth-First)

BB TT000001 3
    Supplier SU /*a supplier of some items we buy */
    BB TA000001 TT000001 3
        SUid NA c24 0 /* unique pkey of SUpplier */
        name NA c24 0 /* name of company we buy from */
        expDate NA c10 0 /* last date price quotes are */
    EB TA 3
    Warehouse WH /* where vendor stores items */
    BB TA000004 TT000002 4
        WHid NA c24 0 /* unique pkey of WareHouse */
        SUid NA c24 0 /* unique fkey(xref to SUpplier */
        phone NA c10 0 /*warehouse phone number */
        address NA t80 0 /* address of warehouse */
    EB TA000008
    Item IT /* an item supplied by a vendor */
    BB TA000008 TT000003 7
        ITid NA c24 0 /* unique pkey of ITem */
        WHid NA c24 0 /* unique fkey (xref to WareHou */
        ourItemNo NA i4 0 /* our integer part identi */
        vendorCode NA c12 0 /* vendor’s item code */
        unitPrice NA i4 0 /* vendor price in $/100 */
        unitQuantity NA i4 0 /* quantity in packag */
        itemDesc NA i4 0 /* item description */
    EB TA000014
EB TT000003