MELJUN CORTES Computer Information Processing Chapter9

  • 7/31/2019 MELJUN CORTES Computer Information Processing Chapter9

    CHAPTER 9: DATABASE

    OBJECTIVES

    After completing this chapter, you will be able to:

    Explain why data and information are important to an organization

    Differentiate between file processing and databases

    Discuss the advantages of using a database management system (DBMS)

    Explain how to use a query language

    Describe characteristics of relational and object-oriented databases

    Discuss the responsibilities of the data and database administrators

    CHAPTER OVERVIEW

    Information is one of an organization's most valuable assets. In this chapter, students examine databases and information management. Data and information are defined, and two critically important aspects, data integrity and data security, are considered. File processing systems are contrasted with the database approach. Students explore features of database management systems, including the data dictionary, data maintenance and retrieval, data security, and backup and recovery. Two popular database models, relational databases and object-oriented databases, are described. The roles of the data and database administrators also are discussed.

    9.1. DATA AND INFORMATION

    Data is a collection of items such as words, numbers, images, and sounds that are not organized and have little meaning individually. Data is processed into information.

    Information is data that is organized, has meaning, and, therefore, is useful. A student grade report (information), for example, contains several data items, including student last name, student first name, student identification (ID) number, course numbers, course titles, credit hours, and course grades.

    A database is a collection of data organized in a manner that allows access, retrieval, and use of that data. Database software, also called a database management system (DBMS), allows you to create a computerized database; add, change, and delete data; sort and retrieve data from the database; and create forms and reports using the data.

    Most organizations realize that data is among their more valuable assets. Without data and information, an organization could not complete many business activities. Information accumulated on sales trends, competitors' products and services, production processes, and even employee skills, for example, allows a company to make decisions and develop, create, and distribute products and services. This information is a valuable resource that would be difficult, if not impossible, to replace. Because information cannot be generated without data, an organization must manage, maintain, and protect its data resources just as it would any other resource. Two critically important aspects of this include ensuring that data has integrity and is kept secure.

    9.1.1 Data Integrity

    For a computer to output accurate information, the data used to create the information must have integrity. Data integrity is the degree to which data is accurate. If your name is misspelled in a student database, for example, the data is inaccurate and is considered a violation of data integrity. Although accurate data does not guarantee accurate information, it is impossible to produce accurate information from erroneous data. This computing principle, often referred to as garbage in, garbage out (GIGO), means that, if you enter erroneous data into a computer (garbage in), the computer will produce inaccurate information (garbage out). Data integrity is critical because computers and individuals use information generated from data to make decisions and take actions. When you place an order with a mail-order company, for example, a sales representative enters the product number for each item you order. If he or she enters an incorrect product number, the warehouse likely will package and ship an item you did not intend to order.

    9.1.2 Data Security

    Data security involves protecting data so it is not misused or lost. Most schools, for example, have procedures that allow only authorized personnel to access confidential student data. The school also performs backup procedures to protect against the loss of data. Backup refers to making duplicate copies of data, files, programs, or disks, which can be used if the originals are lost, damaged, or destroyed. Making backup copies ensures that you can recover data in a timely manner and that processing can continue.

    9.2 FILE PROCESSING VERSUS DATABASES

    9.2.1 File Processing Systems

    In the past, many organizations stored data in files on tape or disk and managed the data using file processing systems. In a typical file processing system, each department within an organization has its own set of files, designed specifically for its own applications, and the records in one file are not related to the records in any other file. Figure 9-1 illustrates an example of how a school might use a file processing system. The admissions department has its own files to admit a student into the school, and the registration department has its own set of files to register students for classes.

    Figure 9-1 In a school that uses file processing, the registration and admissions departments have their own files that are designed specifically for their applications.

    Although organizations have used file processing systems for many years, these systems do have major disadvantages. Two of these disadvantages are data redundancy and isolated data.

    Data Redundancy

    Because each department has its own files in a file processing system, the files often must store the same (redundant) data. Both the admissions department files and the registration department files, for example, must store a student's name and address. Data redundancy wastes resources such as storage space and employee time. Storing the same data in more than one file requires increased storage capacity. When data is added or changed, data maintenance takes more time because employees must update more than one file. Data redundancy also compromises data integrity. If a student relocates, for example, the school must update the student's Address field in the admissions and registration files, as well as in the files of any other department throughout the school that stores the student's Address field. If the field is not changed in all the files, then discrepancies among the files will exist.
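
    The discrepancy described above can be illustrated with a minimal sketch. The two dictionaries stand in for the admissions and registration files; the student ID, name, addresses, and the helper find_discrepancies are hypothetical and are not taken from the chapter's figures.

```python
# Hypothetical departmental files; each keeps its own copy of the address.
admissions_file = {"S1001": {"name": "Ana Cruz", "address": "12 Oak St"}}
registration_file = {"S1001": {"name": "Ana Cruz", "address": "12 Oak St"}}


def find_discrepancies(file_a, file_b, field):
    """Return the IDs whose value for `field` differs between two files."""
    shared_ids = file_a.keys() & file_b.keys()
    return sorted(
        sid for sid in shared_ids if file_a[sid][field] != file_b[sid][field]
    )


# The student relocates, but only the admissions file is updated.
admissions_file["S1001"]["address"] = "98 Pine Ave"

# The redundant copy in the registration file is now out of date.
mismatched_ids = find_discrepancies(admissions_file, registration_file, "address")
print(mismatched_ids)
```

    Because the address is stored twice, a single missed update is enough to make the two files disagree.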

    Isolated Data

    When data is stored in multiple files in multiple departments, often it is difficult to access the data. For example, to generate a report listing the majors of a particular class of students, you would need to access and use data in both the admissions department files and the registration department files because the admissions files store the student's major and the registration files store the class rosters. Sharing data from multiple, separate files to generate such a list often is a complicated procedure and typically requires the expertise of an experienced computer programmer.

    To overcome these and other problems associated with file processing systems, many companies use the database approach for managing data.

    9.2.2 The Database Approach

    As previously described, a database is a shared collection of data. With the database approach, many application programs in an organization could use the data in this single, shared database. A school's database, for example, would contain data about students and courses. As shown in Figure 9-2, departments within the school, such as admissions and registration, would share the data in this database.

    Figure 9-2 Departments within the school, such as admissions and registration, share the data in this database.

    Users access the data in the database using database software, which often is called a database management system (DBMS). As noted at the beginning of this chapter, a database management system (DBMS) is a software program designed to control access to the database and manage the data resources efficiently. While a user is working with the database, the DBMS resides in the memory of the computer. The next section presents the features of a DBMS in detail.

    The database approach overcomes many of the limitations of file processing systems by reducing data redundancy and allowing for sharing of data. These and other advantages of the database approach are presented next.

    Reduced Data Redundancy

    Using the database approach, all data is stored together, which greatly reduces data redundancy. A school database, for example, would store a student's name and address only once. When student data is entered or changed, one employee makes the change once. Figure 9-3 contrasts a database application to a file processing application with respect to data redundancy.

    Figure 9-3 A database application contrasted with a file processing application with respect to data redundancy.

    Improved Data Integrity

    Because most data is stored in only one location, the database approach increases the data's integrity by reducing the likelihood of introducing inconsistencies. When data in the database is changed, all applications have access to the same updated data.

    Shared Data

    Whereas each application in a file processing environment has its own set of files, the data in a database environment belongs to and is shared by the entire organization. Figure 9-4 compares how a database application stores data versus a file processing application. Organizations using databases typically set up controls to define who can access, add, change, and delete the data in a database.

    Figure 9-4 In a file processing environment, the admissions department, registration department, and academic departments all have their own files. In a database environment, the departments all share the same files.

    A database organizes data more efficiently than a file processing system; thus, it often is easier and faster to develop programs that use this data. Many database management systems also provide several tools to assist in program development, thus further reducing the development time. The next section discusses these and other DBMS features.

    Easier Reporting

    The database approach allows non-technical users to access and manipulate data. Although computer professionals typically develop larger databases and their associated programs, many computer users are developing smaller databases themselves, without professional assistance.

    While it has many advantages, the database approach does have the disadvantages of cost and vulnerability. For one, the initial investment in the hardware and software required for a database usually is high. A large database also can be more complex than a file processing system and thus requires trained individuals to develop its applications. These larger databases also require more memory, storage, and processing power than file processing systems.

    A second disadvantage of the database approach is its increased vulnerability. Because all data is stored in a single location and shared by application programs, many users depend on the data in the database. If the database is not operating properly or is damaged or destroyed, many users will not be able to perform their jobs. In some cases, certain application programs may cease to operate. To protect their valuable database resource, individuals and organizations should establish and follow security procedures.

    Despite these limitations, many business and home users work with databases because of their tremendous advantages.

    9.3 DATABASE MANAGEMENT SYSTEMS

    As previously discussed, a database management system (DBMS) is a software program or set of programs designed to control access to the database and manage the data resources efficiently. While a user is working with the database, the DBMS resides in the memory of the computer.

    Database management systems are available for many sizes and types of computers (Figure 9-5). Whether designed for a mainframe or a personal computer, every database management system has a number of common features. These features include a data dictionary, and functions such as data maintenance and retrieval, data security, and backup and recovery. The following pages discuss these features of a DBMS.

    Figure 9-5 Database management systems are available for many sizes and types of computers.

    9.3.1 Data Dictionary

    A data dictionary stores data about each file in the database and each field within those files. For each file, a data dictionary stores data including the file name, description, and the file's relationship to other files. For each field, a data dictionary stores data including the field name, field size, description, type of data (e.g., text, numeric, or date/time), default value, validation rules, and the field's relationship to other fields. Figure 9-6 shows how a data dictionary might list data about the files and fields in a Student database.

    Figure 9-6 How a data dictionary might list data about the files and fields in a Student database.

    A DBMS uses the data dictionary to perform validation checks and maintain the integrity of the data. When you enter data, the data dictionary verifies that the entered data matches the field's data type. The HS Graduation Date field, for example, must contain a number. A data dictionary also allows you to specify a default value for certain fields. A default value is a value that the DBMS automatically displays in a field. For example, if most students who attend a school live in Indiana, then the DBMS could display IN as the default value of the State field. Displaying a default value minimizes the possibility of errors, because commonly used data items are entered for you. Usually, you can override a default value if it does not apply for a certain record. For example, you can change the value, IN, to MI if you need to add a student who lives in Michigan.
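
    The data type check and default value described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table and column names are hypothetical, the graduation date is modeled as a whole-number year, and SQLite is used simply because it ships with Python, not because the chapter refers to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical Student table. The column definitions play the role of
# data dictionary entries: data type, default value, and validation rule.
conn.execute(
    """
    CREATE TABLE student (
        student_id   INTEGER PRIMARY KEY,
        last_name    TEXT NOT NULL,
        state        TEXT NOT NULL DEFAULT 'IN' CHECK (length(state) = 2),
        hs_grad_year INTEGER CHECK (typeof(hs_grad_year) = 'integer')
    )
    """
)

# State is omitted, so the DBMS supplies the default value IN.
conn.execute("INSERT INTO student (last_name, hs_grad_year) VALUES ('Reyes', 2018)")
# The default is overridden for a student who lives in Michigan.
conn.execute(
    "INSERT INTO student (last_name, state, hs_grad_year) VALUES ('Lopez', 'MI', 2019)"
)

# A non-numeric graduation year fails the validation rule and is rejected.
try:
    conn.execute("INSERT INTO student (last_name, hs_grad_year) VALUES ('Tan', 'soon')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

states = [row[0] for row in conn.execute("SELECT state FROM student ORDER BY student_id")]

# The catalog can be read back like a simple data dictionary:
# (field name, declared type, default value) for each column.
dictionary = [(r[1], r[2], r[4]) for r in conn.execute("PRAGMA table_info(student)")]
```

    Here the first record receives IN automatically, the second overrides it with MI, and the third is refused because its graduation year is not a number.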

    9.3.2 Data Maintenance and Retrieval

    A DBMS provides several facilities to enable users and programs to maintain data in and retrieve data from the database. As you have learned, data maintenance involves adding new records, changing data in existing records, and removing unwanted records from the database. Retrieving data from a database, a process called a query, involves extracting specific data from the database and displaying, printing, or storing it. The capability of retrieving (selecting) database information based on an instruction, called criteria, specified by the user is one of the more powerful features of a database.

    Often, a variety of users, from experienced professionals to non-technical users, need to maintain and retrieve the data in a database. A DBMS thus provides several methods of accessing data, each of which requires a different level of database expertise. Of these, query languages, forms, and report generators provide a user-friendly means to maintain and retrieve data from the database. Each of these methods is described in the following paragraphs.

    A query language consists of simple, English-like statements that allow you to specify the data you want to display, print, or store. Although each query language has its own grammar and vocabulary, a person without a programming background usually can learn these languages in a short time. Although you can maintain data with a query language, most users utilize a query language only to retrieve data. To simplify the process, many DBMS provide wizards to guide a user through the steps of building a query.

    Most database management systems also include a feature called query-by-example. Instead of learning the grammar and vocabulary associated with a query language, you can use query-by-example (QBE) to extract data from the database. QBE has a graphical user interface that assists you with retrieving data.

    A form, sometimes called a data entry form, is a window on the screen that provides areas for entering or changing data in a database. You use forms to retrieve and maintain the data in a database. Forms usually provide a means for validating data so as to reduce data entry errors. When designing a form using a DBMS, you can make the form attractive and easy to use by incorporating color shading, lines, and boxes; varying the fonts and font styles; and using other formatting features.

    A report generator, also called a report writer, allows you to design or lay out a report on the screen, extract data into the report layout, and then display or print the report. Unlike a form, a report generator is used only to retrieve data. Report generators usually allow you to format report page numbers and dates; report titles and column headings; subtotals and totals; and fonts, font sizes, color, and shading.

    9.3.3 Data Security

    To ensure that the data in a database is not misused, a DBMS provides mechanisms so that only authorized users can access data at permitted times. In addition, most database management systems allow you to specify different levels of access privileges for each field in the database. These access privileges define the activities allowed by a specific user or group of users.

    Access privileges for data involve establishing who can enter new data, change existing data, delete unwanted data, and retrieve data. In the school database, for example, a faculty advisor might have read-only privileges for student transcripts; that is, the advisor could retrieve the transcript data, but cannot change it. The school's registrar, by contrast, would have full-update privileges to transcript data, meaning that the registrar can retrieve and change the data. Finally, a student would have no access privileges to the transcript data and can neither retrieve nor change the data.
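
    The three privilege levels in the transcript example can be sketched as a simple lookup table. The role names, activity names, and the function is_allowed are hypothetical; a real DBMS defines and enforces privileges with its own facilities.

```python
# Hypothetical privilege table for the transcript data described above.
TRANSCRIPT_PRIVILEGES = {
    "advisor": {"retrieve"},              # read-only privileges
    "registrar": {"retrieve", "change"},  # full-update privileges
    "student": set(),                     # no access privileges
}


def is_allowed(role, activity):
    """Return True if `role` may perform `activity` on transcript data."""
    # Unknown roles receive no privileges at all.
    return activity in TRANSCRIPT_PRIVILEGES.get(role, set())


checks = {
    ("advisor", "retrieve"): is_allowed("advisor", "retrieve"),
    ("advisor", "change"): is_allowed("advisor", "change"),
    ("registrar", "change"): is_allowed("registrar", "change"),
    ("student", "retrieve"): is_allowed("student", "retrieve"),
}
```

    Defaulting unknown roles to an empty set means that access is denied unless a privilege has been granted explicitly.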

    9.3.4 Backup and Recovery

    Occasionally, a database is damaged or destroyed because of hardware failure, a problem with the software, human error, or a catastrophe such as fire or flood. A DBMS provides a variety of techniques to restore the database to a usable form in case it is damaged or destroyed.

    On a regular basis, you should make a backup, or copy, of the entire database. Some DBMS include backup utilities, while others rely on the backup utilities included with operating systems or those purchased separately.

    More sophisticated DBMS maintain a log, or listing, of activities that have affected the database. If you change a student address, for example, the change appears in the log. In this situation, the DBMS places the following in the log: a copy of the student record prior to the change, called the before image; the actual change of address data; and a copy of the student record after the change, called the after image.

    A DBMS that maintains a log often also provides a recovery utility that uses the logs to restore a database in the event it is damaged or destroyed. Depending on the type of failure, the recovery utility restores the database using rollback or roll-forward techniques. In a rollback, also called backward recovery, the log is used to reverse or undo any changes made to the database during a certain period of time, such as an hour. Once the database is restored, you must re-enter the transactions entered during this period of time. In a roll-forward, also called forward recovery, the log is used to re-enter changes automatically since the last database save or backup. Some database recovery utilities use a combination of both techniques.
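
    The before image, after image, rollback, and roll-forward ideas above can be sketched in a few lines. This is a simplified in-memory illustration, not how any particular DBMS implements recovery; the record contents and the function names change_record, rollback, and roll_forward are hypothetical.

```python
import copy

database = {"S1001": {"address": "12 Oak St"}}
log = []  # each entry holds a before image and an after image


def change_record(db, key, new_values):
    """Apply a change and record its before and after images in the log."""
    before = copy.deepcopy(db[key])
    db[key].update(new_values)
    log.append({"key": key, "before": before, "after": copy.deepcopy(db[key])})


def rollback(db, entries):
    """Backward recovery: undo logged changes, newest first."""
    for entry in reversed(entries):
        db[entry["key"]] = copy.deepcopy(entry["before"])


def roll_forward(db, entries):
    """Forward recovery: reapply logged changes, oldest first."""
    for entry in entries:
        db[entry["key"]] = copy.deepcopy(entry["after"])


backup = copy.deepcopy(database)  # regular full backup
change_record(database, "S1001", {"address": "98 Pine Ave"})

# Rollback: start from the current data and reverse the logged change.
rolled_back = copy.deepcopy(database)
rollback(rolled_back, log)

# Roll-forward: start from the last backup and reapply the logged change.
restored = copy.deepcopy(backup)
roll_forward(restored, log)
```

    After the rollback, the record again holds the old address; after the roll-forward, the copy restored from the backup matches the current database.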

    9.4 RELATIONAL AND OBJECT-ORIENTED DATABASES

    Every database and database management system is based on a particular data model. A data model consists of rules and standards that define how data is organized in a database. Five data models are hierarchical, network, relational, object-oriented, and object-relational.

    In the past, databases often were organized according to the hierarchical or network data model. In a hierarchical database, data is organized in a series like a family tree or organization chart. As with a family tree, the hierarchical database has branches made up of parent and child records. Each parent record can have multiple child records. Each child record, however, can have only one parent. A network database is similar to a hierarchical database except that each child record can have more than one parent.

    Because hierarchical and network databases offer only limited data access and lack flexibility, database developers prefer two other database models: relational and object-oriented. A newer data model, the object-relational data model, combines features of the relational and object-oriented data models. The following sections discuss the features of relational and object-oriented data models and the databases based on them.

    9.4.1 Relational Databases

    Today, the most commonly used database, the relational database, is based on the relational data model. A relational database stores data in tables that consist of rows and columns. Each row has a primary key and each column has a unique name.

    A relational database uses terminology different from a file processing environment to represent data. A relational database developer, for example, refers to a file as a relation, a record as a tuple, and a field as an attribute. A user of a relational database, however, refers to a file as a table, a record as a row, and a field as a column (Figure 9-7). This chapter uses the terms table, row, and column when discussing relational databases.

    Figure 9-7 In this data terminology table, the first column identifies the terms used by developers in a file processing environment. The second column presents the terms used by developers of a relational database. The third column indicates the terms to which the users of a relational database refer.

    In addition to storing data, a relational database also stores any associations among the data, which are called relationships. With a relational database, you can establish a relationship between tables at any time, provided the tables have a common column (field). You would relate the Student table and the Schedule table, for example, using the Student ID column. Figure 9-8 illustrates these relational database concepts. In a relational database, the only data redundancy exists in the common columns (fields) that establish relationships.

    Figure 9-8 The Student table is linked to the Schedule table through the Student ID column. The Schedule table is linked to the Course table through the Course Code column.

    RELATIONAL ALGEBRA

    Relational databases often use relational algebra to manipulate data. Relational algebra uses variables and operations to build a new relation. Three commonly used relational operations are projection, selection, and join operations.

    To understand the function of the three operations, consider a query that uses these three operations to retrieve a list of students enrolled in ENGL 104 (Figure 9-10). First, the projection operation extracts columns (fields) from a relation, that is, a vertical subset of a table. In the example, the projection operation retrieves the Student ID, First Name, and Last Name columns from the Student table. The selection operation then retrieves certain rows (records) based on the criteria you specify, that is, a horizontal subset of a table. In the example, the selection operation retrieves all rows containing students in ENGL 104 in the Schedule table. The join operation then combines the data from the two queries based on a common column. In the example, the join operation uses the Student ID to combine the data retrieved from the Student and the Schedule tables.

    Figure 9-10 The selection, projection, and join operations are used to produce a response to the query.
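
    The three relational operations can be sketched in plain Python, with each relation held as a list of rows. The sample rows and the functions project, select, and join are hypothetical and stand in for the tables shown in the figure; this simplified projection also does not remove duplicate rows, as a strict relational projection would.

```python
# Hypothetical Student and Schedule relations (lists of rows).
student = [
    {"student_id": 1, "first_name": "Ana", "last_name": "Cruz"},
    {"student_id": 2, "first_name": "Ben", "last_name": "Lim"},
    {"student_id": 3, "first_name": "Cara", "last_name": "Diaz"},
]
schedule = [
    {"student_id": 1, "course_code": "ENGL 104"},
    {"student_id": 2, "course_code": "MATH 120"},
    {"student_id": 3, "course_code": "ENGL 104"},
]


def project(relation, columns):
    """Projection: a vertical subset that keeps only the named columns."""
    return [{c: row[c] for c in columns} for row in relation]


def select(relation, predicate):
    """Selection: a horizontal subset of rows that satisfy the criteria."""
    return [row for row in relation if predicate(row)]


def join(left, right, column):
    """Join: combine rows from two relations that share a column value."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]


names = project(student, ["student_id", "first_name", "last_name"])
enrolled = select(schedule, lambda row: row["course_code"] == "ENGL 104")
result = join(names, enrolled, "student_id")
```

    Each function returns a new relation, so the operations can be chained in the same order as in the description: projection, then selection, then join.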

    STRUCTURED QUERY LANGUAGE

    Relational databases use a query language called Structured Query Language (SQL) to manipulate and retrieve data. SQL includes keywords and rules used to implement relational algebra operations. The SQL statement in Figure 9-11, for example, would execute a query that retrieves students enrolled in ENGL 104. This SQL statement would generate the relation shown in Figure 9-10.

    Figure 9-11 This SQL statement would generate the relation shown in Figure 9-10.

    Most relational database products for minicomputers and mainframes support Structured Query Language (SQL). Many personal computer database system vendors also have developed or modified existing packages to support SQL.
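
    A query of this kind can be tried with Python's built-in sqlite3 module. The statement below is an assumed example, not the statement from the chapter's figure, and the table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical Student and Schedule tables related by student_id.
conn.executescript(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name  TEXT
    );
    CREATE TABLE schedule (
        student_id  INTEGER REFERENCES student(student_id),
        course_code TEXT
    );
    INSERT INTO student VALUES (1, 'Ana', 'Cruz'), (2, 'Ben', 'Lim'), (3, 'Cara', 'Diaz');
    INSERT INTO schedule VALUES (1, 'ENGL 104'), (2, 'MATH 120'), (3, 'ENGL 104');
    """
)

# The SELECT list acts as the projection, WHERE as the selection,
# and JOIN ... ON as the join on the common column.
query = """
    SELECT student.student_id, student.first_name, student.last_name
    FROM student
    JOIN schedule ON student.student_id = schedule.student_id
    WHERE schedule.course_code = ?
    ORDER BY student.last_name
"""
rows = conn.execute(query, ("ENGL 104",)).fetchall()
```

    Passing the course code as a parameter (the ? placeholder) keeps the criteria separate from the statement text, so the same query can be reused for other courses.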

    9.4.2 Object-Oriented Databases

    An object-oriented database is based on an object-oriented data model and, thus, maintains objects. An object is an item that can contain both data and the activities that read or manipulate the data. A Student object, for example, might contain data about a student (Student ID, First Name, Last Name, Address, and so on) and instructions on how to print the student record or the formula required to calculate a student's tuition rate. A record in a relational database, by contrast, would contain only data about a student.
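
    The idea of an object that holds both data and activities can be sketched as a small Python class. The fields follow the Student example above, while the credit_hours field, the per-credit-hour rate, and the method names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass
class Student:
    """A Student object bundles data with the activities that use it."""

    student_id: str
    first_name: str
    last_name: str
    address: str
    credit_hours: int

    # Hypothetical flat rate; a real school would define its own formula.
    RATE_PER_CREDIT_HOUR: ClassVar[float] = 250.0

    def tuition(self) -> float:
        """Calculate tuition from the enrolled credit hours."""
        return self.credit_hours * self.RATE_PER_CREDIT_HOUR

    def format_record(self) -> str:
        """Return the student record as a printable line of text."""
        return f"{self.student_id}: {self.last_name}, {self.first_name} ({self.address})"


ana = Student("S1001", "Ana", "Cruz", "12 Oak St", credit_hours=15)
record_line = ana.format_record()
tuition_due = ana.tuition()
```

    A row in a relational table would hold only the five data values; here the tuition formula and the record formatting travel with the data.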

    Two advantages of object-oriented databases, relative to relational databases, are that they can store more types of data and access this data faster. With an object-oriented database, you can store unstructured data such as photographs, video clips, audio clips, and documents more efficiently than in a relational database. Further, if you run a query to extract data from an object-oriented database, the object-oriented database often returns results more quickly than the same query of a relational database. Examples of applications appropriate for an object-oriented database include the following:

    A multimedia database that stores images, audio clips, and/or video clips. A geographic information system (GIS) database, for example, stores maps; a voice mail system stores audio messages; and a television broadcast database stores audio and video clips.

    A groupware database that stores documents such as schedules, calendars, manuals, memos, and reports. Users can perform queries to search the document contents. One query, for example, might search the schedules for available meeting times.

    A computer-aided design (CAD) database that stores data about engineering, architectural, and scientific designs. The data in the database includes a list of components of the item being designed, the relationship among the components, and archived versions of the design drafts.

    A hypertext database contains text links to other documents, and a hypermedia database also contains graphics, video, and sound. A variety of hypertext and hypermedia databases are accessible via the Web. You can search one of these databases for items such as documents, graphics, audio and video clips, and links to Web pages.

    Some companies also are developing object-relational databases to take advantage of features of both the relational and object-oriented data models.

    OBJECT QUERY LANGUAGE

    Object-oriented and object-relational databases often use a query language called object query language (OQL) to manipulate and retrieve data. OQL is similar to SQL in that it uses many of the same rules, grammar, and vocabulary. Because OQL is a relatively new standard query language, however, not all object databases support it.

    9.5 DATABASE ADMINISTRATION

    Keeping an organization's data centralized in a database requires a great deal of cooperation and coordination on the part of the database users. In file processing systems, if you wanted to track or store data, typically you would just create another file, often duplicating data already stored by someone else in another file. In a database environment, if you want to track or store data, first you check to see if some or all of the data already is in the database or, if not, how you can add the data to the database. The role of coordinating the use of the database belongs to the data and database administrators.

    9.5.1 Role of the Data and Database Administrators

    The data and database administrators are responsible for managing and coordinating all database activities. The data administrator (DA) is responsible for designing the database; that is, the DA determines the proper placement of fields, defines relationships among data, and outlines users' access privileges. The database administrator (DBA) is responsible for creating and maintaining the data dictionary, establishing and monitoring security of the database, monitoring the performance of the database, and implementing and testing backup and recovery procedures.

    In small organizations, one person often serves as both the data administrator and the database administrator. In larger organizations, the responsibilities of the data and database administrators are split among two or more persons.

    9.5.2 Role of the User

    One of the user's first responsibilities is to become familiar with the data in the existing database. First-time database users often are amazed at the wealth of information available to help them perform their jobs more effectively.

    Another responsibility of the user is to take an active role in specifying additions to the database. The maintenance of an organization's database is an ongoing task that organizations should measure constantly against their overall goals. Users thus should participate in designing the database that will help them achieve those goals.

    9.5.3 Database Design Guidelines

    A carefully designed database makes it easier for a user to query the database, modify the data, and create reports. Certain database design guidelines, including those shown in Figure 9-12, apply to databases of all sizes.

    Figure 9-12 Guidelines for developing a database.
