Notes_ER Model Notes


  • 8/4/2019 Notes_ER Model Notes


    DBMS

    Basic concepts:

    Data can be defined as unprocessed information.

    Information is data that is organized and communicated in a coherent (clear) and meaningful manner.

    Data is converted into information, and information is converted into knowledge.

    Knowledge is information that has been evaluated and organized so that it can be used purposefully.

    Flow of data:

    Disadvantages of File Management System:

    In a File Management System (FMS), data is stored as a collection of operating system files. This approach has many drawbacks, including the following:

    Data redundancy and inconsistency: Multiple file formats and duplication of information in different files.

    Difficulty in accessing data: A new program must be written to carry out each new task.

    Data isolation: Data is scattered across multiple files and formats.

    Integrity problems: Integrity constraints become buried in program code rather than being stated explicitly. It is hard to add new constraints or change existing ones.

    Atomicity of updates: Failures may leave the database in an inconsistent state with partial updates carried out.
    Example: A transfer of funds from one account to another should either complete or not happen at all.

    Concurrent access by multiple users: Concurrent access is needed for performance. Uncontrolled concurrent access can lead to inconsistencies.
    Example: Two people reading a balance and updating it at the same time.

    Limited data sharing and lengthy processing time.

    Main drawback of FMS is storage and memory management.
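    The funds-transfer example under "Atomicity of updates" can be sketched with a DBMS transaction. This is only an illustration using Python's built-in sqlite3 module; the account table, its columns and the transfer function are assumptions made for the example, not part of the notes.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one atomic unit."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (src,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")

# Hypothetical sample data: two accounts in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

transfer(conn, 1, 2, 30)       # succeeds: both updates are committed together
try:
    transfer(conn, 1, 2, 500)  # the check fails, so both updates are rolled back
except ValueError:
    pass

balances = dict(conn.execute("SELECT id, balance FROM account"))
```

    After the failed second transfer the balances are the same as after the first one, so no partial update is left behind.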

    Database: A structured or organized collection of related records or data that is stored in a computer system and can easily be accessed, managed and updated.

    Database applications:

    Banking : All transactions

    Airlines : Reservations and schedules

    (Figure, flow of data: Data → Information → Knowledge → Action)

    Universities : Registration, grades etc

    Sales : Customers, products and purchases

    Online retailers : Order tracking, customized recommendations

    Human resources: Employee records, salaries etc.

    Database Management System: A DBMS is a software system that stores, retrieves and modifies data in a database on request.

    Examples: MS Access, Oracle, SQL Server, FoxPro etc.

    Three views of data / Levels of abstraction:

    Physical Level: Describes how a record is stored.

    Logical Level: Describes the data stored in the database and the relationships among the data.

    View Level: Application programs hide details of data types. Views can also hide information for security purposes.

    Schema: The logical structure of the database.

    Physical Schema: Database design at physical level

    Logical Schema: Database design at logical level.

    Instance: Actual content of the database at a particular point in time.
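    The difference between schema, instance and view can be sketched as follows. The example uses Python's built-in sqlite3 module; the employee table, its columns and the employee_public view are hypothetical names chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical schema: the structure of the data (hypothetical employee table).
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")

# View level: expose only the non-sensitive columns, hiding salary.
conn.execute("CREATE VIEW employee_public AS SELECT emp_id, name FROM employee")

# Instance: the rows present at this particular point in time.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Jones", 52000), (2, "Rama", 61000)])

cursor = conn.execute("SELECT * FROM employee_public ORDER BY emp_id")
public_columns = [d[0] for d in cursor.description]
public_rows = cursor.fetchall()
```

    The schema stays the same while the instance changes with every insert, update or delete. A user who is only given the view never sees the salary column.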

    Advantages of DBMS:

    Reduction of redundancies: In the database approach, data can be stored at a single place or with controlled redundancy under the DBMS, which saves space and prevents inconsistency.

    Shared Data: A DBMS allows the database under its control to be shared by any number of application programs or users.


    Data Independence: In a file system, a change in the structure of data may require alterations to programs. A DBMS separates the description of the data from the application programs, so programs are not affected by such changes. This is called data independence, where details of data are not exposed. The DBMS provides an abstract view and hides details.

    Improved Integrity: Data integrity refers to the validity and consistency of data, that is, the data should be accurate and consistent. This is achieved by providing checks or constraints, which are consistency rules that the database is not permitted to violate. Constraints may apply to data items within a record or to relationships between records.
    Example: The age of every employee must be between 18 and 70 years.

    Efficient Data Access: A DBMS utilizes techniques to store and retrieve data efficiently, even for unforeseen queries. A complex DBMS should be able to provide services to end users so that they can retrieve the data almost immediately.

    Data Security: Data is of vital importance to an organization and may be confidential. Unauthorized persons must not access confidential data. The DBA, who has the ultimate responsibility for the data in the DBMS, can ensure that proper access procedures are followed, including proper authentication schemes for access to the DBMS and additional checks before permitting access to sensitive data. Different levels of security can be implemented for various types of data and operations.

    Backup and Recovery: A DBMS provides facilities for recovering from software and hardware failures. A backup and recovery subsystem is responsible for this. If a program fails, it restores the database to the state it was in before the execution of the program.

    Support for concurrent transactions: A DBMS allows multiple transactions to occur simultaneously.
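    The integrity example given under "Improved Integrity" (employee age between 18 and 70) can be stated explicitly as a constraint instead of being buried in program code. A minimal sketch with Python's sqlite3 module follows; the table layout and the try_insert helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        age    INTEGER NOT NULL CHECK (age BETWEEN 18 AND 70)
    )
""")

def try_insert(emp_id, name, age):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO employee VALUES (?, ?, ?)", (emp_id, name, age))
        return True
    except sqlite3.IntegrityError:
        return False

ok_35 = try_insert(1, "Jones", 35)   # inside the allowed range
ok_16 = try_insert(2, "Rama", 16)    # violates the CHECK constraint
```

    Because the rule lives in the schema, every program that writes to the table is subject to it, and changing the allowed range means changing one declaration.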

    Differences between File Processing System and Database Management System

    #   File Processing System                    Database Management System
    1   Coordinates only physical access to       Coordinates both physical and logical
        the data                                  access to the data
    2   Cheaper                                   Costly
    3   Data dependent                            Data independent
    4   Data redundancy                           Controlled data redundancy
    5   Inconsistent data                         Consistent data
    6   Allows only pre-determined access to      Designed to allow flexibility in what
        data                                      queries give access to the data
    7   Integrity problems                        Improved integrity
    8   Concurrent transactions are not           Supports concurrent transactions
        supported
    9   Much more restrictive in simultaneous     Designed to coordinate and permit
        data access                               multiple users to access data at the
                                                  same time

    Data Models:

    A data model is a collection of tools for describing:
    o Data
    o Data relationships
    o Data semantics
    o Data constraints

    Types of data models:
    1) Relational Model
    2) Entity Relationship Model
    3) Object Based Data Models
       a) Object oriented
       b) Object relational
    4) Semi structured data model (XML)
    5) Network Model
    6) Hierarchical Model

    Database design:
    The database design process is divided into six steps.

    1) Requirement analysis: Requirement analysis is an informal process that involves discussions with user groups, a study of the current operating environment and how it is expected to change, and analysis of available documentation on existing applications.

    2) Conceptual database design: The information gathered in the requirement analysis step is used to develop a high level description of the data to be stored in the database, along with the constraints known to hold over this data. This step is carried out using the ER model.

    3) Logical database design: The task of the logical design step is to convert an ER diagram/ER schema into a relational database schema.

    4) Schema refinement: The fourth step is to analyze the collection of relations in the relational database schema to identify potential problems and refine it.

    5) Physical database design: This step may involve building indexes on some tables and clustering some tables, or it may involve a substantial redesign of parts of the database schema obtained from the earlier design steps.

    6) Application and security design: Here we must identify the entities and processes involved in the application. We must describe the role of each entity in every process that is reflected in some application task, as part of a complete workflow for that task. For each role, we must identify the parts of the database that must be accessible and the parts that must not be accessible. We must take steps to ensure that these access rules are enforced.


    In the implementation phase, we must code each task in an application language, using the DBMS to access data.

    Realistically, although we might begin with the six step process outlined here, a complete database design will probably require a subsequent tuning phase in which all six kinds of design steps are interleaved and repeated until the design is satisfactory.

    Database Users:

    Users are differentiated by the way they expect to interact with the system.

    Application programmers: Interact with the system through DML calls.

    Sophisticated users: Form requests in a database query language.

    Specialized users: Write specialized database applications that do not fit into the traditional data processing framework.

    Naive users: People accessing the database over the web etc.

    Database Administrator:
    The DBA coordinates all the activities of the database system and has a good understanding of the enterprise's information resources and needs.

    The DBA's duties include:
    o Storage structure and access method definition
    o Schema and physical organization modification
    o Granting users authority to access the database
    o Backing up of data
    o Monitoring performance and responding to changes
    o Database tuning

    Entity Relationship Model

    Features of ER Model:
    o The entity relationship model is a high level conceptual data model.
    o It allows us to describe the data involved in a real world enterprise in terms of objects and their relationships.
    o It is widely used to develop an initial design of a database.
    o It provides a set of useful concepts that make it convenient for a developer to move from a basic set of information to a detailed and precise description of information that can be easily implemented in a database system.
    o The ER model describes data as a collection of:
      - Entities / Entity sets
      - Relationship sets
      - Attributes


    Entity:
    o An entity is an object in the real world that is distinguishable from other objects.
      Example: car, table, book etc.

    o An entity need not be a physical object; it can also represent a concept in the real world.
      Example: project, loan etc.

    o An entity represents a class of things, not any one instance.
      Example: The STUDENT entity has instances JONES, RAMA etc.

    o An entity is denoted by a rectangle.

    o Entity type / Entity set: A collection of entities of a similar kind is called an entity set or entity type.

    Attribute:
    o An attribute is a property used to describe a specific feature of an entity.
    o Attributes are denoted by an ellipse.

    Example: The STUDENT entity may be described by the attributes stud_name, age, address, course etc.

    Domain:
    o Each simple attribute of an entity type has a set of possible values that can be assigned to it. This set is called the domain of the attribute.
    o An attribute cannot have a value outside this domain.
    o Example: For the PERSON entity, the person_Id attribute has a specific domain, say integer values from 1 to 100.
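    As a small sketch of the person_Id example, a domain can be treated as a set of allowed values against which every attribute value is checked. The names PERSON_ID_DOMAIN and in_domain are hypothetical and used only for illustration.

```python
# Domain of person_Id: the integers 1 to 100 inclusive.
PERSON_ID_DOMAIN = range(1, 101)

def in_domain(value, domain=PERSON_ID_DOMAIN):
    """Return True only if `value` is a plain integer inside the domain."""
    return type(value) is int and value in domain
```

    For instance, in_domain(42) holds, while in_domain(0), in_domain(101) and in_domain("7") do not.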

    Types of Attributes:
    1) Simple Attribute: An attribute that cannot be further divided into smaller parts and represents a basic meaning is called a simple attribute.
    Example: The name and age attributes of an entity PERSON are simple attributes.


    2) Composite Attribute: An attribute that can be divided into subparts, where each individual unit has a specific meaning.
    Example: An attribute name could be structured as a composite attribute consisting of Firstname and Lastname.

    3) Single Valued Attribute: An attribute having a single value for a particular entity.
    Example: Age is a single valued attribute of a student entity.

    4) Multivalued Attribute: An attribute that can have more than one value for a particular entity. It is represented by a double ellipse.
    Example: Consider an employee entity set with the attribute phone_number. An employee may have zero, one or several phone numbers. This type of attribute is said to be a multivalued attribute.

    5) Stored Attribute: An attribute that is directly stored in the database.
    Example: The Birthdate attribute of a person.

    6) Derived Attribute: An attribute derived from the values of other related attributes or entities. The value of a derived attribute is not stored but is computed when required. It is denoted by a dotted ellipse.
    Example: Age is calculated from date of birth; experience is calculated from DOJ (date of joining).

    7) Null value: An attribute takes a null value when an entity does not have a value for it.

    http://wofford-ecs.org/DataAndVisualization/ermodel/images/fig%204.jpg

    8) Descriptive Attribute: Descriptive attributes are used to record information about a relationship.
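    Several of the attribute types listed above can be seen side by side in a small sketch. The Employee and Name classes, the field names and the sample values are assumptions for illustration; only the attribute categories come from the notes.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class Name:                      # composite attribute: divisible into subparts
    first_name: str
    last_name: str

@dataclass
class Employee:
    emp_id: int                  # simple, single valued attribute
    name: Name                   # composite attribute
    birthdate: date              # stored attribute
    phone_numbers: List[str] = field(default_factory=list)  # multivalued attribute
    spouse_name: Optional[str] = None   # None plays the role of a null value

    def age(self, today: date) -> int:
        """Derived attribute: computed from birthdate, never stored."""
        had_birthday = (today.month, today.day) >= (self.birthdate.month,
                                                    self.birthdate.day)
        return today.year - self.birthdate.year - (0 if had_birthday else 1)

# Hypothetical sample entity.
e = Employee(1, Name("Rama", "Rao"), date(1990, 8, 4), ["555-0101", "555-0102"])
```

    Passing the reference date into age keeps the derived value reproducible: the same stored birthdate gives different ages on different days, which is why age itself is not stored.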

    Relationships:

    A relationship can be defined as:
    o A connection or set of associations
    o A rule for communication among entities
    o An association among several entities

    It is denoted by a diamond symbol.

    Example: Association between student and course.

    Relationship Sets:

    A relationship set is a set of relationships of the same type.

    Example: Consider the relationship OPTS between the two entity sets student and course. The collection of all instances of the relationship OPTS forms a relationship set.

    The degree of a relationship type is the number of participating entity types.

    A relationship between two entities is called a binary relationship.

    STUDENT -- OPTS -- COURSE

    Fig: An ER diagram with a binary relationship.

    A relationship among three entities is called a ternary relationship.

    Fig: ER diagram with a ternary relationship.

    A relationship among n entities is called an n-ary relationship.
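    A relationship set can be pictured as a set of tuples, one tuple per relationship instance, and the degree as the number of positions in each tuple. The sketch below uses the STUDENT OPTS COURSE example; the sample instances and the ternary Supplies relationship are hypothetical.

```python
from typing import NamedTuple

class Opts(NamedTuple):          # one relationship instance: (student, course)
    student: str
    course: str

# The relationship set OPTS is the set of all such instances (sample data).
opts = {
    Opts("JONES", "DBMS"),
    Opts("JONES", "Networks"),
    Opts("RAMA", "DBMS"),
}

class Supplies(NamedTuple):      # hypothetical ternary relationship
    supplier: str
    part: str
    project: str

def degree(relationship_type) -> int:
    """Degree = number of participating entity types."""
    return len(relationship_type._fields)

courses_of_jones = {r.course for r in opts if r.student == "JONES"}
```

    Here degree(Opts) is 2 (binary) and degree(Supplies) is 3 (ternary).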

    Roles:

    Entity sets of a relationship set need not be distinct.

    The function that an entity plays in a relationship is called its role. Roles are normally implicit and not specified.

    They are useful when the meaning of a relationship set needs clarification.

    Roles are indicated in ER diagrams by labeling the lines that connect diamonds to rectangles.

    Role labels are optional and are used to clarify semantics of the relationship.

    Cardinality Constraints:
    Cardinality specifies the number of instances of an entity that can be associated with an instance of another entity participating in a relationship. Based on cardinality, a binary relationship between entity sets A and B can be classified into the following categories:

    One-to-one: An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A.

    Example: The relationship "has" between college and principal (College has Principal). One college can have at most one principal and one principal can be assigned to only one college.


    One-to-many: An entity in A is associated with any number of entities in B. An entity in B is associated with at most one entity in A.

    Example: Relationship between department and faculty. One department can appoint any number of faculty members, but a faculty member is assigned to only one department.

    Many-to-one: An entity in A is associated with at most one entity in B. An entity in B is associated with any number of entities in A.

    Example: Relationship between course and instructor. An instructor can teach various courses, but a course can be taught by only one instructor. Please note that this is an assumption.

    Many-to-many: Entities in A and B are associated with any number of entities from each other.

    Examples:

    "Taught by" relationship between course and faculty. One faculty member can be assigned to teach many courses and one course may be taught by many faculty members.

    Relationship between book and author. One author can write many books and one book can be written by more than one author.
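    The four categories above can be checked mechanically against sample data. The function below is a sketch, not a standard algorithm from the notes: it takes a binary relationship set as (a, b) pairs and reports the most restrictive category that the given instances are consistent with. A real cardinality constraint is a design decision, so data can only confirm or contradict it.

```python
from collections import defaultdict

def classify(pairs):
    """Classify a binary relationship set, given as (a, b) pairs."""
    b_per_a = defaultdict(set)
    a_per_b = defaultdict(set)
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)
    # Does some entity in A relate to several in B, and vice versa?
    a_has_many = any(len(bs) > 1 for bs in b_per_a.values())
    b_has_many = any(len(as_) > 1 for as_ in a_per_b.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"

# Hypothetical instances modeled on the examples in the text.
college_principal = [("C1", "P1"), ("C2", "P2")]
department_faculty = [("CSE", "F1"), ("CSE", "F2"), ("ECE", "F3")]
course_instructor = [("DBMS", "I1"), ("OS", "I1")]
book_author = [("B1", "A1"), ("B1", "A2"), ("B2", "A1")]
```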

    Recursive relationships:

    When the same entity type participates more than once in a relationship type in different roles, the relationship type is called a recursive relationship.
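    A common illustration of a recursive relationship, assumed here because the notes give no example, is an employee reporting to another employee. Both positions of the relationship are filled by the same entity type, so role names are needed to tell them apart.

```python
from typing import NamedTuple

class ReportsTo(NamedTuple):
    # Both roles are played by entities of the same type, EMPLOYEE.
    supervisor: str
    subordinate: str

# Hypothetical relationship instances.
reports_to = {
    ReportsTo(supervisor="Asha", subordinate="Ravi"),
    ReportsTo(supervisor="Asha", subordinate="Meena"),
    ReportsTo(supervisor="Ravi", subordinate="Kiran"),
}

def subordinates_of(employee):
    """Employees for whom `employee` plays the supervisor role."""
    return sorted(r.subordinate for r in reports_to if r.supervisor == employee)

def supervisor_of(employee):
    """Return the supervisor, or None if the employee reports to no one."""
    return next((r.supervisor for r in reports_to if r.subordinate == employee), None)
```

    Ravi appears in both roles: as a subordinate of Asha and as the supervisor of Kiran. Without the role labels the two positions would be indistinguishable.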

    Participation constraints:


    Participation constraints specify whether the existence of an entity depends on its being related to another entity via the relationship type. There are two types of participation constraints:

    Total Participation: When every entity of an entity set participates in a relationship type, the participation is called total.

    Partial Participation: When it is not necessary for all the entities of an entity set to participate in a relationship type, the participation is called partial.

    Example: In an ER diagram where loan and customer are related through borrower, every loan must have a customer associated with it. Therefore the participation of loan in borrower is total, while the participation of customer in borrower is partial, since a customer need not have a loan.
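    The loan and customer example can be checked with a short sketch. The participation function and the sample loans, customers and borrower instances are hypothetical; the function only tests whether the current data is consistent with total participation.

```python
def participation(entity_set, relationship_set, position):
    """Return 'total' if every entity appears in the relationship set at the
    given tuple position, otherwise 'partial'."""
    participating = {r[position] for r in relationship_set}
    return "total" if set(entity_set).issubset(participating) else "partial"

# Hypothetical instances; borrower holds (customer, loan) pairs.
loans = {"L-11", "L-14"}
customers = {"Hayes", "Jones", "Smith"}
borrower = {("Hayes", "L-11"), ("Jones", "L-14"), ("Hayes", "L-14")}

loan_participation = participation(loans, borrower, 1)
customer_participation = participation(customers, borrower, 0)
```

    Every loan appears in borrower, so its participation is total. Smith has no loan, so the participation of customer is partial.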

    Strong entity set: Entity types containing a key attribute are called strong entity types or regular entity types.

    Example: The Student entity has a key attribute RollNo which uniquely identifies it, hence it is a strong entity.

    Weak Entity Set:

    Entity types that do not contain any key attribute, and hence cannot be identified independently, are called weak entity types.

    A weak entity can be identified uniquely only by considering some of its attributes in conjunction with the primary key attributes of another entity, which is called the identifying owner entity.

    A partial key is attached to a weak entity type and is used for unique identification of the weak entities related to a particular owner entity.

    The following restrictions must hold:

    o The owner entity set and the weak entity set must participate in a one-to-many relationship set. This relationship set is called the identifying relationship set of the weak entity set.
    o The weak entity set must have total participation in the identifying relationship.
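    When a weak entity set is turned into a table, its primary key is usually the owner's key combined with the partial key. The sketch below uses Python's sqlite3 module with a hypothetical loan (owner) and payment (weak) pair; the table and column names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE loan (                     -- identifying owner entity set
        loan_no TEXT PRIMARY KEY,
        amount  INTEGER NOT NULL
    );
    CREATE TABLE payment (                  -- weak entity set
        loan_no    TEXT NOT NULL            -- owner key; NOT NULL gives total participation
                   REFERENCES loan(loan_no) ON DELETE CASCADE,
        payment_no INTEGER NOT NULL,        -- partial key
        paid       INTEGER NOT NULL,
        PRIMARY KEY (loan_no, payment_no)
    );
    INSERT INTO loan VALUES ('L-11', 900), ('L-14', 1500);
    INSERT INTO payment VALUES ('L-11', 1, 100), ('L-14', 1, 250), ('L-11', 2, 100);
""")

# The partial key alone is not unique: payment number 1 exists for both loans.
same_partial_key = conn.execute(
    "SELECT COUNT(*) FROM payment WHERE payment_no = 1").fetchone()[0]

# Deleting the owner removes its dependent weak entities as well.
conn.execute("DELETE FROM loan WHERE loan_no = 'L-11'")
conn.commit()
remaining = conn.execute("SELECT loan_no, payment_no FROM payment").fetchall()
```

    The cascade reflects the existence dependency: a payment cannot outlive the loan that identifies it.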


    Enhanced Entity Relationship Model (EER)

    The original ER model extended with additional semantic concepts is called the EER model.

    Examples of additional concepts of the EER model are:
    a) Specialization
    b) Generalization
    c) Aggregation

    Superclass: An entity type that includes one or more distinct subgroups of its occurrences.

    Subclass: A distinct subgrouping of occurrences of an entity type.

    The superclass/subclass relationship is one-to-one.

    A superclass may contain overlapping or distinct subclasses.

    Not all members of a superclass need to be members of a subclass.

    Specialization:
    1) Specialization is the process of identifying subsets of an entity set (the superclass) that share some distinguishing characteristics.
    2) The superclass is defined first, the subclasses are defined next, and subclass attributes and relationship sets are then added.
    3) It is a top-down design process: we designate subgroupings within an entity set that are distinctive from other entities in the set.
    4) These subgroupings become lower-level entity sets that have attributes or participate in relationships that do not apply to the higher-level entity set.
    5) It is depicted by a triangle component labeled ISA (e.g. customer "is a" person).
    6) Attribute inheritance: a lower-level entity set inherits all the attributes and relationship participation of the higher-level entity set to which it is linked.

    Example:

    In the ER diagram below, Person is an entity set with the attributes Person_id, Name, Street and City. Person can be further classified as:
    o Customer
    o Employee

    The Customer entity is described by the attribute credit_rating. The Employee entity is described by the attribute salary.

    Employee can be further classified as:
    o Officer
    o Teller
    o Secretary

    The process of designating subgroupings within an entity set is called specialization.
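    Attribute inheritance in the Person example maps naturally onto class inheritance. The sketch below is only an analogy: the class layout follows the example, while office_number is a hypothetical attribute added so that Officer has something of its own.

```python
from dataclasses import dataclass, fields

@dataclass
class Person:                    # higher-level entity set (superclass)
    person_id: int
    name: str
    street: str
    city: str

@dataclass
class Customer(Person):          # Customer ISA Person
    credit_rating: int

@dataclass
class Employee(Person):          # Employee ISA Person
    salary: int

@dataclass
class Officer(Employee):         # Officer ISA Employee
    office_number: int           # hypothetical attribute, not in the notes

def attribute_names(entity_type):
    """List the attributes of an entity type, inherited ones first."""
    return [f.name for f in fields(entity_type)]

officer = Officer(7, "Jones", "Main St", "Pune", 50000, 12)
```

    An Officer carries the attributes of Employee and Person without redeclaring them, and every Officer is also a Person.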

    Generalization:

    1) Generalization is a bottom-up design process.
    2) Generalization consists of identifying some common characteristics of a collection of entity sets and creating a new entity set that contains entities possessing these common characteristics.
    3) The subclasses are defined first, the superclass is defined next, and relationship sets involving the superclass are then defined.
    4) Specialization and generalization are simple inversions of each other; they are represented in an E-R diagram in the same way.

    Design Constraints on a Specialization/Generalization

    A. Constraint on which entities can be members of a given lower-level entity set.


    Should a concept be modeled as an entity or an attribute?

    Suppose we want to add address information to the Employees entity set. We might choose to add a single attribute address to the entity set. Alternatively, we could introduce a new entity set, Addresses, and then a relationship associating employees with addresses. What are the pros and cons?

    Adding a new entity set makes the model more complex. It should only be done when there is a need for the complexity. For example, if some employees have multiple addresses to be associated with them, then the more complex model is needed. Also, representing addresses as a separate entity would allow a further breakdown, for example by zip code or city.

    What if we wanted to modify the Works_In relationship to have both a start and an end date, rather than just a start date? We could add one new attribute for the end date; alternatively, we could create a new entity set Duration which represents intervals, and then the Works_In relationship can be made ternary (associating an employee, a department and an interval). What are the pros and cons?

    If the duration is described through descriptive attributes, only a single such duration can be modeled. That is, we could not express an employment history involving someone who left the department yet later returned.

    Should a concept be modeled as an entity or a relationship?

    Consider a situation in which a manager controls several departments. Let's presume that a company budgets a certain amount (budget) for each department. Yet it also wants managers to have access to some discretionary budget (dbudget). There are two corporate models. A discretionary budget may be created for each individual department; alternatively, there may be a discretionary budget for each manager, to be used as she desires.

    Which scenario is represented by the following ER diagram? If you want the alternate interpretation, how would you adjust the model?

    Should we use binary or ternary relationships?

    Consider the following ER diagram, representing insurance policies owned by employees at a company. Each employee can own several policies, each policy can be owned by several employees, and each dependent can be covered by several policies.

    What if we wish to model the following additional requirements:

    o A policy cannot be owned jointly by two or more employees.
    o Every policy must be owned by some employee.
    o Dependents is a weak entity set, and each dependent entity is uniquely identified by taking pname in conjunction with the policyid of a policy entity (which, intuitively, covers the given dependent).

    The best way to model this is to switch away from the ternary relationship set and instead use two distinct binary relationship sets.
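    One way to see why two binary relationship sets capture these requirements is to write the resulting tables down. The sketch below uses Python's sqlite3 module; pname and policyid come from the text, while the ssn, name, cost and age columns and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE employees (ssn TEXT PRIMARY KEY, name TEXT);

    -- Binary relationship 1 (employee owns policy), folded into policies:
    -- a single ssn column means no joint ownership, NOT NULL means every
    -- policy has an owner.
    CREATE TABLE policies (
        policyid INTEGER PRIMARY KEY,
        cost     INTEGER,
        ssn      TEXT NOT NULL REFERENCES employees(ssn)
    );

    -- Binary relationship 2 (policy covers dependent): dependents is weak,
    -- identified by pname together with the policyid of its policy.
    CREATE TABLE dependents (
        pname    TEXT NOT NULL,
        age      INTEGER,
        policyid INTEGER NOT NULL REFERENCES policies(policyid) ON DELETE CASCADE,
        PRIMARY KEY (pname, policyid)
    );

    INSERT INTO employees VALUES ('111', 'Asha');
    INSERT INTO policies VALUES (1, 500, '111');
    INSERT INTO dependents VALUES ('Ravi', 12, 1);
""")

def policy_without_owner_is_rejected():
    """Try to insert a policy with no owner; the NOT NULL rule should refuse it."""
    try:
        with conn:
            conn.execute("INSERT INTO policies VALUES (2, 300, NULL)")
        return False
    except sqlite3.IntegrityError:
        return True

rejected = policy_without_owner_is_rejected()
```

    With a single ternary relationship these key and participation constraints cannot all be stated at once, which is why the binary design is preferred here.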

    Should we use aggregation?


    Consider again the following ER diagram:

    If we did not need the until or since attributes, we could model the identical setting using the following ternary relationship: