Data Base Theory Assignment1

NAME: KISHORE SAMA, STUDENT ID: 109004240

NAME: THOTA VINAY KUMAR, STUDENT ID: 109004052

DATABASE THEORY ASSIGNMENT 1 CSCI507 FA-09

1. What is a relational database, and how does its logical structure differ from that of the network and hierarchical databases?

ANSWER:

RELATIONAL DATABASE: A relational database is a database where all data visible to the user is organized strictly as tables of data values, and where all database operations work on these tables.

A relational DBMS can represent parent/child relationships, but they are visible only through the data values contained in the database tables.

A relational database is a collection of data items organized as a set of formally-described tables from which data can be accessed or reassembled in many different ways without having to reorganize the database tables. The relational database was invented by E. F. Codd at IBM in 1970.

In addition to being relatively easy to create and access, a relational database has the important advantage of being easy to extend. After the original database creation, a new data category can be added without requiring that all existing applications be modified.

There are three types of record-based logical models:

The relational model

The hierarchical model

The network model

The relational model proposed by Codd was an attempt to simplify database structure. It eliminated the explicit parent/child structures from the database and instead represented all data in the database as simple row/column tables of data values.

The structure of a relational database differs from that of hierarchical and network databases in how parent/child relationships are represented. The hierarchical model has explicit parent/child relationships in which each child has one parent, the network model allows a child to have multiple parents, and the relational model eliminates explicit parent/child structures altogether. In the hierarchical and network models, each record represents a specific part of the database.

Differences Among the Models

The relational model differs from the network and hierarchical models in that it does not use pointers or links. Instead, the relational model relates records by the values that they contain. This freedom from the use of pointers allows a formal mathematical foundation to be defined.

2. What is a relational database table, and what are its components?

ANSWER:

Relational Databases and Their Components

The relational database model was conceived by E. F. Codd in 1969. The model is based on branches of mathematics called set theory and predicate logic. The main idea is that a database consists of various unordered tables called relations that can be modified by the user. Relational databases were a major improvement to traditional database systems, which were not as flexible and sometimes hardware-dependent.

The organizing principle in a relational database is the table, a rectangular row/column arrangement of data values. Each table in a database has a unique table name that identifies its contents.

Each row of a table contains exactly one data value in each column.

For each column of a table, all of the data values in that column hold the same type of data. Each column in a table has a column name, which is usually written as a heading at the top of the column. The columns of a table must all have different names, but there is no prohibition against two columns in two different tables having identical names.

The columns of a table have a left-to-right order, which is defined when the table is first created. A table always has at least one column. The ANSI/ISO SQL standard does not specify a maximum number of columns in a table.

Unlike the columns, the rows in a table do not have any particular order. In fact, if you use two consecutive database queries to display the contents of a table, there is no guarantee that the rows will be listed in the same order twice.

Relational databases consist of various components. They are:

A table is made up of columns and rows.

A column is a set of values of the same datatype; a character column, for example, contains character strings and an integer column contains integers.

A row is a sequence of values such that the nth value of the row corresponds to the nth column of the table.

Each row is typically identified by a unique value known as its primary key. (It is possible, although not generally useful, to create a table without a primary key column.)

A base table is a table created with a CREATE TABLE statement. A base table persists in the database until it is removed with a DROP TABLE statement.

A result table is returned by a SELECT statement.

A temporary table is a table that is accessible only during the session in which it is created. A temporary table persists in the database only for the duration of that session or until it is removed with a DROP TABLE statement.
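The three kinds of table described above can be tried out in a few lines. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the `product` and `cheap` tables and their rows are hypothetical and not part of the assignment.

```python
import sqlite3


def demo_table_kinds():
    """Create a base table, read a result table, and build a temporary table."""
    conn = sqlite3.connect(":memory:")

    # Base table: created with CREATE TABLE, persists until DROP TABLE.
    conn.execute(
        "CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, price REAL)"
    )
    conn.executemany(
        "INSERT INTO product VALUES (?, ?, ?)",
        [(1, "pen", 1.5), (2, "ink", 4.0), (3, "pad", 2.25)],
    )

    # Result table: the rows returned by a SELECT statement.
    # Rows have no inherent order, so ORDER BY is needed for a repeatable listing.
    result = conn.execute(
        "SELECT name, price FROM product ORDER BY product_id"
    ).fetchall()

    # Temporary table: visible only to this session (connection).
    conn.execute("CREATE TEMP TABLE cheap AS SELECT * FROM product WHERE price < 3")
    temp_rows = conn.execute("SELECT name FROM cheap ORDER BY name").fetchall()

    # Dropping the base table removes it from the main schema catalog.
    conn.execute("DROP TABLE product")
    remaining = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()

    conn.close()
    return result, temp_rows, remaining
```

Calling `demo_table_kinds()` returns the ordered result table, the rows copied into the temporary table, and the list of base tables left after the DROP TABLE statement.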

3. What are keys? What types of keys may be found within a relational database, and

what are their functions?

ANSWER:

Definition of a key: A key consists of one or more attributes that determine other attributes.

Determination: Given two attributes, A and B, the statement that A determines B means that knowing the value of A lets you determine the value of attribute B.

1. Super key: An attribute, or combination of attributes, that uniquely identifies each entity in a table.

2. Candidate Key: Similar to a super key, but does not contain a subset of attributes that is itself a super key.

3. Primary Key (PK): Candidate Key selected to uniquely identify all other attribute values in any given row. A primary key cannot contain null values.

"Every table must have a primary key, an attribute or combination of attributes that are guaranteed to be unique and not null.

The entity integrity rule states that for every instance of an entity, the value of the primary key must exist, be unique, and cannot be null."

In a well-designed relational database, every table has some column or combination of columns whose values uniquely identify each row in the table. This column (or columns) is called the primary key of the table.

The primary key has a different unique value for each row in a table, so no two rows of a table with a primary key are exact duplicates of one another. A table where every row is different from all other rows is called a relation in mathematical terms. The name relational database comes from this term, because relations (tables with distinct rows) are at the heart of a relational database.

Although primary keys are an essential part of the relational data model, early relational database management systems (System/R, DB2, Oracle, and others) did not provide explicit support for primary keys. Database designers usually ensured that all of the tables in their databases had a primary key, but the DBMS itself did not provide a way to identify the primary key of a table.

Every table should have a primary key. A name column, for example, would be a useful primary key only if the names are unique. Primary keys have to be fields that contain unique values; a primary key is the identifier of a record (row).

Keys have a significant impact on performance, but are also needed to guarantee data integrity.
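The difference between a super key and a candidate key can be illustrated with a short check over sample rows. This is a sketch only: the function names `is_superkey` and `candidate_keys` and the sample data are hypothetical, and a key is really a property of the table's design, so a test against a handful of rows can only show which attribute sets are consistent with being a key.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """Return True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, all_attrs):
    """Return the minimal superkeys: those with no proper subset that is a superkey."""
    found = []
    # Smaller attribute sets are tried first, so any superset of a key
    # that was already found can be skipped as non-minimal.
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            if any(set(k) <= set(attrs) for k in found):
                continue
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found


# Hypothetical sample rows, for illustration only.
sample_rows = [
    {"emp_id": 1, "email": "a@x.org", "dept": "sales"},
    {"emp_id": 2, "email": "b@x.org", "dept": "sales"},
    {"emp_id": 3, "email": "c@x.org", "dept": "ops"},
]
```

Here `("emp_id", "dept")` is a super key but not a candidate key, because `emp_id` alone already identifies each row. The designer would pick one of the candidate keys as the primary key.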

A simple table with four columns.

4. Secondary Key: An attribute or combination of attributes used strictly for data retrieval purposes.

5. Foreign Key (FK): An attribute or combination of attributes in one table whose values must either match the primary key (PK) in another table or be null.

A foreign key exists in a table to identify a primary key in another. Two relations are joined on the foreign key, and with the referential integrity rule applied, the join gives reliable navigation between relations and preserves data integrity.

The referential integrity rule states that every foreign key value must match a primary key value in an associated table.

A column in one table whose value matches the primary key in some other table is called a foreign key. Just as a combination of columns can serve as the primary key of a table, a foreign key can also be a combination of columns. In fact, the foreign key will always be a compound (multicolumn) key when it references a table with a compound primary key. Obviously, the number of columns and the data types of the columns in the foreign key and the primary key must be identical to one another.

A table can contain more than one foreign key if it is related to more than one other table.

Foreign keys are keys "taken" from a different table. Imagine a database with two tables. In one table, we store information about companies, such as the name and the location of each company. In the second table, we store information about the employees of the companies stored in the first table. We use a foreign key to make sure that the second table cannot contain information about employees who do not work for one of the companies listed in the first table. The behavior of PostgreSQL when dealing with foreign keys can be defined for every table. It can be defined, for instance, that all employees in the second table are removed when a company is removed from the first table. Rules defining PostgreSQL's behavior are called integrity constraints.

Foreign keys are extremely useful when working with complex data models and are usually used to protect data integrity.
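The company/employee scenario described above can be sketched as follows. The text refers to PostgreSQL; this example uses Python's built-in sqlite3 module as a stand-in, which is an assumption made so that the code runs without a server. The table and column names are hypothetical. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3


def demo_foreign_key():
    """Show a rejected orphan row and a cascading delete."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless it is requested.
    conn.execute("PRAGMA foreign_keys = ON")

    conn.execute(
        "CREATE TABLE company ("
        " company_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " location TEXT)"
    )
    # ON DELETE CASCADE: removing a company also removes its employees.
    conn.execute(
        "CREATE TABLE employee ("
        " employee_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " company_id INTEGER REFERENCES company(company_id) ON DELETE CASCADE)"
    )

    conn.execute("INSERT INTO company VALUES (1, 'Acme', 'Vienna')")
    conn.execute("INSERT INTO employee VALUES (10, 'Ann', 1)")

    # An employee of a company that is not listed violates referential integrity.
    try:
        conn.execute("INSERT INTO employee VALUES (11, 'Bob', 99)")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True

    # Deleting the company cascades to its employees.
    conn.execute("DELETE FROM company WHERE company_id = 1")
    employees_left = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

    conn.close()
    return rejected, employees_left
```

Replacing `ON DELETE CASCADE` with the default behavior would instead make the DELETE fail while employees still reference the company, which is the other enforcement option mentioned in the text.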

4. How is database integrity assured within the relational database environment?

ANSWER:

Database integrity ensures that data entered into the database is accurate, valid, and consistent. Any applicable integrity constraints and data validation rules must be satisfied before permitting a change to the database.

Three basic types of database integrity constraints are:

Entity integrity, allowing no two rows to have the same identity within a table.

Domain integrity, restricting data to predefined data types e.g.: dates.

Referential integrity, requiring the existence of a related row in another table, e.g. a customer for a given customer ID.

In the relational data model, entity integrity is one of the three inherent integrity rules.

Entity integrity is an integrity rule which states that every table must have a primary key and that the column or columns chosen to be the primary key should be unique and not null.

A direct consequence of this integrity rule is that duplicate rows are forbidden in a table. If each value of a primary key must be unique no duplicate rows can logically appear in a table. The NOT NULL characteristic of a primary key ensures that a value can be used to identify all rows in a table.

Within relational databases using SQL, entity integrity is enforced by adding a primary key clause to a schema definition. The system enforces entity integrity by not allowing operations (INSERT, UPDATE) to produce an invalid primary key. Any operation that would create a duplicate primary key or one containing nulls is rejected. Entity integrity ensures that the data you store remains in the proper format and comprehensible.
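The rejection of invalid primary keys can be observed directly. The sketch below uses Python's built-in sqlite3 module; the `student` table and its rows are hypothetical. NOT NULL is written out explicitly because SQLite, for historical reasons, does not imply it for primary keys that are not of type INTEGER.

```python
import sqlite3


def try_insert(conn, row):
    """Attempt an INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


def demo_entity_integrity():
    conn = sqlite3.connect(":memory:")
    # The PRIMARY KEY clause is what asks the DBMS to enforce entity integrity.
    conn.execute(
        "CREATE TABLE student (student_id TEXT PRIMARY KEY NOT NULL, name TEXT)"
    )
    outcomes = [
        try_insert(conn, ("S1", "Ann")),   # accepted
        try_insert(conn, ("S1", "Bob")),   # rejected: duplicate primary key
        try_insert(conn, (None, "Cy")),    # rejected: null primary key
        try_insert(conn, ("S2", "Dee")),   # accepted
    ]
    conn.close()
    return outcomes
```

`demo_entity_integrity()` returns one boolean per attempted INSERT, so the duplicate and the null key show up as `False`.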

Domain integrity: A data domain refers to all the unique values which a data element may contain. The rule for determining the domain boundary may be as simple as a data type with an enumerated list of values.

For example, a database table that has information about people, with one record per person, might have a "gender" column. This gender column might be declared as a string data type, and allowed to have one of two known code values: "M" for male, "F" for female, and NULL for records where gender is unknown or not applicable. The data domain for the gender column is: "M", "F".

In a normalized data model, the reference domain is typically specified in a reference table.

Less simple domain boundary rules, if database-enforced, may be implemented through a check constraint or, in more complex cases, in a database trigger. For example, a column requiring positive numeric values may have a check constraint declaring the values must be greater than zero.

This definition combines the concepts of domain as an area over which control is exercised and the mathematical idea of a set of values of an independent variable for which a function is defined.
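The two domain rules mentioned above, an enumerated list of values and a positive-number rule, can both be written as check constraints. This is a minimal sketch with Python's built-in sqlite3 module; the `person` table and the `amount` column are hypothetical names chosen for illustration.

```python
import sqlite3


def demo_domain_checks():
    """Return the ids of the rows that satisfy the declared domains."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person ("
        " person_id INTEGER PRIMARY KEY,"
        # Enumerated domain: only 'M' or 'F'. A NULL still passes the check,
        # which matches the 'unknown or not applicable' case in the text.
        " gender TEXT CHECK (gender IN ('M', 'F')),"
        # Range domain: values must be greater than zero.
        " amount REAL CHECK (amount > 0))"
    )
    attempts = [
        (1, "M", 100.0),   # valid
        (2, None, 50.0),   # valid: gender unknown
        (3, "X", 50.0),    # invalid: 'X' is outside the gender domain
        (4, "F", -1.0),    # invalid: amount is not positive
    ]
    accepted = []
    for row in attempts:
        try:
            conn.execute("INSERT INTO person VALUES (?, ?, ?)", row)
            accepted.append(row[0])
        except sqlite3.IntegrityError:
            pass  # the check constraint rejected this row
    conn.close()
    return accepted
```

Rules that cannot be expressed as a single-row condition are the ones that would need a trigger instead, as the text notes.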

Referential integrity is a property of data which, when satisfied, requires every value of one attribute (column) of a relation (table) to exist as a value of another attribute in a different (or the same) relation (table).

Less formally, and in relational databases: For referential integrity to hold, any field in a table that is declared a foreign key can contain only values from a parent table's primary key or a candidate key. For instance, deleting a record that contains a value referred to by a foreign key in another table would break referential integrity. Some relational database management systems (RDBMS) can enforce referential integrity, normally either by deleting the foreign key rows as well to maintain integrity, or by returning an error and not performing the delete. Which method is used may be determined by a referential integrity constraint defined in a data dictionary.

5. Why would you want to have a data dictionary, and how would you use it?

ANSWER:

DATA DICTIONARY:

"Dictionaries are like watches; the worst is better than none, and the best cannot be expected to go quite true." (Mrs. Piozzi, Anecdotes of Samuel Johnson, 1786)

The importance of a data dictionary is often lost on many adults, for they have not used a dictionary for 10 or 20 years. Try to think back to your elementary school days, when you were constantly besieged with new words in your schoolwork. Think back also to your foreign language courses, particularly the ones that required you to read books and magazines. Without a dictionary, you would have been lost. The same is true of a data dictionary in systems analysis: without it, you will be lost, and the user won't be sure you have understood the details of the application.

The phrase data dictionary is almost self-defining. The data dictionary is an organized listing of all the data elements that are pertinent to the system, with precise, rigorous definitions so that both user and systems analyst will have a common understanding of all inputs, outputs, components of stores, and intermediate calculations. The data dictionary defines the data elements by doing the following:

It gets rather tedious describing the composition of data elements in a rambling narrative form. We need a concise, compact notation, just as a standard dictionary like Webster's has a compact, concise notation for defining the meaning of ordinary words.

Even though the data dictionary correctly cross-references the aliases to the primary data name, you should avoid using aliases whenever possible.

Structure of the Data Dictionary

The data dictionary consists of the following:

Base Tables: The underlying tables that store information about the associated database. Only Oracle should write to and read these tables. Users rarely access them directly because they are normalized, and most of the data is stored in a cryptic format.

User-Accessible Views: The views that summarize and display the information stored in the base tables of the data dictionary. These views decode the base table data into useful information, such as user or table names, using joins and WHERE clauses to simplify the information. Most users are given access to the views rather than the base tables.

SYS, Owner of the Data Dictionary

The Oracle user SYS owns all base tables and user-accessible views of the data dictionary. No Oracle user should ever alter (UPDATE, DELETE, or INSERT) any rows or schema objects contained in the SYS schema, because such activity can compromise data integrity. The security administrator must keep strict control of this central account.

How the Data Dictionary Is Used:

The data dictionary has three primary uses:

Oracle accesses the data dictionary to find information about users, schema objects, and storage structures.

Oracle modifies the data dictionary every time that a data definition language (DDL) statement is issued.

Any Oracle user can use the data dictionary as a read-only reference for information about the database.

How Oracle Uses the Data Dictionary:

Data in the base tables of the data dictionary is necessary for Oracle to function. Therefore, only Oracle should write or change data dictionary information. Oracle provides scripts to modify the data dictionary tables when a database is upgraded or downgraded.

During database operation, Oracle reads the data dictionary to ascertain that schema objects exist and that users have proper access to them. Oracle also updates the data dictionary continuously to reflect changes in database structures, auditing, grants, and data.
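The idea that the DBMS keeps the dictionary up to date on every DDL statement, while users read it as a reference, can be observed in any relational system. The sketch below does not use Oracle; as an assumption made for the sake of a runnable example, it uses SQLite, whose `sqlite_master` catalog table and `PRAGMA table_info` play a comparable read-only role. The `region` table and its `descript` column are hypothetical.

```python
import sqlite3


def catalog_tables(conn):
    """Read the names of all base tables from SQLite's schema catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def demo_catalog():
    conn = sqlite3.connect(":memory:")
    before = catalog_tables(conn)

    # A DDL statement: the DBMS itself records the new table in the catalog.
    conn.execute(
        "CREATE TABLE region (region_code INTEGER PRIMARY KEY, descript TEXT)"
    )
    after_create = catalog_tables(conn)

    # Column metadata is also available as a read-only reference;
    # the second field of each table_info row is the column name.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(region)")]

    # Another DDL statement: the catalog entry disappears again.
    conn.execute("DROP TABLE region")
    after_drop = catalog_tables(conn)

    conn.close()
    return before, after_create, columns, after_drop
```

At no point does the program write to the catalog directly; it only issues DDL and then reads the catalog, which mirrors the division of roles described above.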

6. What are relational operators, and what is the purpose of having them?

ANSWER:

RELATIONAL OPERATORS:

A relational database supports basic database operations that provide useful means for retrieving and manipulating data in tables. Because the relational model has its mathematical basis in relational theory (treating tables as sets, or relations), the supported database operators conform to the operators of relational algebra. In fact, a relational database software implementation, called a DBMS, is said to have a higher degree of relational completeness depending on the extent to which it supports the relational algebra operators. In total, eight operators are found in relational theory, namely:

1. SELECT: The relational SELECT operator (also called RESTRICT) returns the rows of a table that satisfy a given condition. The SQL SELECT statement returns a result set of records from one or more tables.

2. PROJECT: The operation of projection consists of selecting the names of the columns of the table(s) that one wishes to see in the answer. In SQL, the columns are listed after the SELECT keyword; to display all the columns, "*" should be used.

3. JOIN

The JOIN keyword is used in an SQL statement to query data from two or more tables, based on a relationship between certain columns in these tables.

Tables in a database are often related to each other with keys.

A primary key is a column (or a combination of columns) with a unique value for each row. Each primary key value must be unique within the table. The purpose is to bind data together, across tables, without repeating all of the data in every table.

4. INTERSECT: INTERSECT operates on the results of two SQL statements and returns only the rows that appear in both.

5. UNION: The purpose of the SQL UNION command is to combine the results of two queries together. In this respect, UNION is somewhat similar to JOIN in that they are both used to relate information from multiple tables. One restriction of UNION is that all corresponding columns need to be of the same data type. Also, when using UNION, only distinct values are selected (similar to SELECT DISTINCT).

6. DIFFERENCE: It displays all rows that appear in one table but not in the other.

7. PRODUCT: A Cartesian join will get you a Cartesian product. A Cartesian join is when you join every row of one table to every row of another table. You can also get one by joining every row of a table to every row of itself.

8. DIVIDE.

Minimally speaking, a DBMS implementation is said to be relational if it supports at least the key relational operators, namely SELECT, PROJECT, and JOIN. Very few DBMSs are capable of supporting all eight relational operators. Applying relational algebra operators to existing tables (relations) produces results that are themselves new relations. This characteristic lets the user apply the operators recursively to the output of other operators.

Relational (comparison) operators such as = are also used in SQL.
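Seven of the eight operators listed above have a direct SQL counterpart and can be compared side by side. The sketch below uses Python's built-in sqlite3 module; the tables `a` and `b` and their rows are hypothetical. DIVIDE is left out because SQL has no single keyword for it.

```python
import sqlite3


def demo_relational_operators():
    """Run one query per operator and collect the result tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE a (id INTEGER, name TEXT);
        CREATE TABLE b (id INTEGER, name TEXT);
        INSERT INTO a VALUES (1, 'x'), (2, 'y'), (3, 'z');
        INSERT INTO b VALUES (2, 'y'), (3, 'z'), (4, 'w');
        """
    )

    def q(sql):
        return conn.execute(sql).fetchall()

    results = {
        # SELECT (restrict): keep only the rows that meet a condition.
        "select": q("SELECT * FROM a WHERE id > 1 ORDER BY id"),
        # PROJECT: keep only the named columns.
        "project": q("SELECT DISTINCT name FROM a ORDER BY name"),
        # JOIN: combine rows from two tables on matching column values.
        "join": q("SELECT a.id, b.name FROM a JOIN b ON a.id = b.id ORDER BY a.id"),
        # UNION: rows in either table, duplicates removed.
        "union": q("SELECT id FROM a UNION SELECT id FROM b ORDER BY id"),
        # INTERSECT: rows present in both tables.
        "intersect": q("SELECT id FROM a INTERSECT SELECT id FROM b ORDER BY id"),
        # DIFFERENCE: rows in a that are not in b (EXCEPT in SQL).
        "difference": q("SELECT id FROM a EXCEPT SELECT id FROM b ORDER BY id"),
        # PRODUCT: every row of a paired with every row of b.
        "product_size": q("SELECT COUNT(*) FROM a CROSS JOIN b")[0][0],
    }
    conn.close()
    return results
```

Each query returns a result table, which is what allows the output of one operator to be fed into another.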

Use the database shown in Figure 2 below to answer problems 7 through 15.

FIGURE 2: The Database for Problems 7 - 15

Table name: EMPLOYEE

Database name: CH2_STORE_CO

Table name: STORE

Table name: REGION

7. For each table, identify the primary key and the foreign key(s). If a table does not have a foreign key, write NONE in the assigned space.

ANSWER:

Table       Primary key    Foreign key(s)
EMPLOYEE    EMP_CODE       STORE_CODE
STORE       STORE_CODE     REGION_CODE, EMP_CODE
REGION      REGION_CODE    NONE

8. Do the tables exhibit entity integrity? Answer Yes or No, then explain your answer.

ANSWER:

Table       Entity Integrity?    Explanation
EMPLOYEE    YES                  Each EMP_CODE value is unique and there are no nulls.
STORE       YES                  Each STORE_CODE value is unique and there are no nulls.
REGION      YES                  Each REGION_CODE value is unique and there are no nulls.

9. Do the tables exhibit referential integrity? Answer Yes or No, then explain your answer. Write NA (Not Applicable) if the table does not have a foreign key.

ANSWER:

Table       Referential Integrity?    Explanation
EMPLOYEE    YES                       Each STORE_CODE value in EMPLOYEE points to an existing STORE_CODE value in STORE.
STORE       YES                       Each REGION_CODE value in STORE points to an existing REGION_CODE value in REGION and each EMP_CODE value in STORE points to an existing EMP_CODE value in EMPLOYEE.
REGION      NA                        The table does not have a foreign key.

10. Describe the type(s) of relationship(s) between STORE and REGION.

ANSWER: The REGION table is represented in the STORE table by the foreign key REGION_CODE. The STORE table has referential integrity because the information in the foreign key REGION_CODE is valid. The relationship between STORE and REGION is M:1, because many stores can belong to one region.

11. Draw the Entity Relationship diagram for the relationship between STORE and REGION.

ANSWER:

12. Draw the Relational Schema for the relationship between STORE and REGION.

ANSWER:

13. Describe the type(s) of relationship(s) between EMPLOYEE and STORE. (Hint: Each store employs many employees, one of whom manages the store.)

ANSWER: There are TWO relationships between EMPLOYEE and STORE. The first relationship, expressed by STORE employs EMPLOYEE, is a 1:M relationship, because one store can employ many employees and each employee is employed by one store. The second relationship, expressed by EMPLOYEE manages STORE, is a 1:1 relationship, because each store is managed by one employee and an employee manages only one store.

14. Draw the Entity Relationship diagram to show the relationships among EMPLOYEE, STORE, and REGION.

ANSWER:

15. Draw the Relational Schema to show the relationships between EMPLOYEE, STORE, and REGION.

ANSWER:

[Diagrams not reproduced. Surviving labels: entities REGION, STORE, and EMPLOYEE; relationship label "has".]
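Since the diagrams did not survive, the relationships identified in problems 7 through 13 can be written down as table definitions instead. This is a sketch under stated assumptions, not the assignment's own schema: only the key columns named in the answers (EMP_CODE, STORE_CODE, REGION_CODE) are included, the data types and sample values are hypothetical, and Python's built-in sqlite3 module stands in for the original DBMS.

```python
import sqlite3


def build_store_schema():
    """Create REGION, STORE, and EMPLOYEE with the keys named in the answers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only on request
    conn.executescript(
        """
        CREATE TABLE REGION (
            REGION_CODE INTEGER PRIMARY KEY
        );
        CREATE TABLE STORE (
            STORE_CODE  INTEGER PRIMARY KEY,
            -- M:1, many stores belong to one region
            REGION_CODE INTEGER NOT NULL REFERENCES REGION(REGION_CODE),
            -- 1:1, one manager per store; UNIQUE keeps an employee from
            -- managing two stores; nullable so a store can be inserted
            -- before its manager exists
            EMP_CODE    INTEGER UNIQUE REFERENCES EMPLOYEE(EMP_CODE)
        );
        CREATE TABLE EMPLOYEE (
            EMP_CODE   INTEGER PRIMARY KEY,
            -- 1:M, one store employs many employees
            STORE_CODE INTEGER NOT NULL REFERENCES STORE(STORE_CODE)
        );
        """
    )
    return conn


def load_sample(conn):
    """Insert hypothetical rows in an order that satisfies both foreign keys."""
    conn.execute("INSERT INTO REGION VALUES (1)")
    conn.execute("INSERT INTO STORE VALUES (100, 1, NULL)")
    conn.execute("INSERT INTO EMPLOYEE VALUES (7, 100)")
    conn.execute("UPDATE STORE SET EMP_CODE = 7 WHERE STORE_CODE = 100")


def violates(conn, sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True
```

Because STORE and EMPLOYEE reference each other, the manager column is left null when the store is first inserted and filled in once the employee row exists; deferred constraints would be another way to handle the circular reference.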