ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Page 1: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

ITEC 1010 Information and Organizations

Chapter 5

Organizing Data and Information

Page 2: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Data

Data
• A necessity for almost any enterprise to carry out its business
• Consists of raw facts which, when organized, may be transformed into information

Database
• A collection of data organized to meet users’ needs

Database management system (DBMS)
• A group of programs that manipulate the database and provide an interface between the database and the user of the database or other application programs

Page 3: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

DBMS ‘Discussion’ (1)

A collection of programs that enables you to store, modify, and extract information from a database. There are many different types of DBMSs, ranging from small systems that run on personal computers to huge systems that run on mainframes. The following are examples of database applications:

• computerized library systems
• automated teller machines
• flight reservation systems
• computerized parts inventory systems

From a technical standpoint, DBMSs can differ widely. The terms

Page 4: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

DBMS ‘Discussion’ (2)

relational, network, flat, and hierarchical all refer to the way a DBMS organizes information internally. The internal organization can affect how quickly and flexibly you can extract information.

Requests for information from a database are made in the form of a query, which is a stylized question. For example, the query

SELECT ALL WHERE NAME = "SMITH" AND AGE > 35

requests all records in which the NAME field is SMITH and the AGE

Page 5: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

DBMS ‘Discussion’ (3)

field is greater than 35. The set of rules for constructing queries is known as a query language. Different DBMSs support different query languages, although there is a semi-standardized query language called SQL (structured query language). Sophisticated languages for managing database systems are called fourth-generation languages, or 4GLs for short.
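The stylized query above is not valid SQL as written. As a rough sketch, assuming a hypothetical table named people with name and age columns (the table and its rows are invented here, not taken from the slides), the same request could be expressed in SQL and run with Python's built-in sqlite3 module:

```python
import sqlite3

# Hypothetical table and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("SMITH", 42), ("SMITH", 30), ("JONES", 50)],
)

# SQL form of the stylized query: NAME is SMITH and AGE is greater than 35.
rows = conn.execute(
    "SELECT * FROM people WHERE name = ? AND age > ?", ("SMITH", 35)
).fetchall()
print(rows)  # [('SMITH', 42)]
```

Only the one record that satisfies both conditions is returned.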

The information from a database can be presented in a variety of formats. Most DBMSs include a report writer program that enables you to output data in the form of a report. Many DBMSs also include

Page 6: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

DBMS ‘Discussion’ (4)

a graphics component that enables you to output information in the

form of graphs and charts.

Page 7: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Hierarchy of Data

Schematic

Page 8: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Hierarchy of data Example

Database: Personnel file, Department file, Payroll file (Project database)

Files: (Personnel file)
  005-10-6321 Johns Francine 10-7-65
  549-77-1001 Buckley Bill 2-17-79
  098-40-1370 Fiske Steven 1-5-85

Records: 098-40-1370 Fiske Steven 1-5-85 598 (Record containing SSN, last name, first name, date of hire)

Fields: Fiske (Last name field)

Characters (bytes): 1000110 (Letter ‘F’ in ASCII)
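The hierarchy can be walked from the bottom up in a few lines of Python. This is only an illustrative sketch: the dictionary keys (ssn, last_name, and so on) are names chosen here, not part of the slides. Note that the 7-bit ASCII code for the letter F is 1000110 (decimal 70).

```python
# Character: the basic building block, stored as a byte.
char = "F"
bits = format(ord(char), "07b")  # 7-bit ASCII code of the character

# Field: a group of characters.
field = "Fiske"

# Record: a collection of related fields (key names are illustrative).
record = {
    "ssn": "098-40-1370",
    "last_name": field,
    "first_name": "Steven",
    "hire_date": "1-5-85",
}

# File: a collection of related records.
personnel_file = [record]

# Database: a collection of related files.
project_database = {"personnel": personnel_file}

print(bits)  # 1000110
```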

Page 9: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Terminology

Database
• A collection of integrated and related files

File
• A collection of related records

Record
• A collection of related fields

Field
• A group of characters

Character
• Basic building block of information, represented by a byte

Page 10: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Data Entities, Attributes, and Keys

Entity
• A generalized class of people, places, or things (objects) for which data are collected, stored, and maintained
• E.g., Customer, Employee

Attribute
• A characteristic of an entity; something the entity is identified by
• E.g., Customer name, Employee name

Keys
• A field or set of fields in a record that is used to identify the record
• E.g., a primary key: a field or set of fields that uniquely identifies the record

Page 11: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Keys and Attributes

Key field: Employee #. The columns are attributes (fields); the rows are entities (records).

Employee #    Last name   First name   Hire date   Dept. #
005-10-6321   Johns       Francine     10-7-65     257
549-77-1001   Buckley     Bill         2-17-79     650
098-40-1370   Fiske       Steven       1-5-85      598
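One way to see what a key field does is to declare it as a primary key and try to store two records with the same key value. The sketch below uses Python's sqlite3 module; the column names are illustrative, and the second employee ("Doe, Pat") is a made-up record that is not in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# employee_no plays the role of the key field; column names are illustrative.
conn.execute(
    "CREATE TABLE employee (employee_no TEXT PRIMARY KEY, last_name TEXT, "
    "first_name TEXT, hire_date TEXT, dept_no INTEGER)"
)
conn.execute(
    "INSERT INTO employee VALUES ('005-10-6321', 'Johns', 'Francine', '10-7-65', 257)"
)

try:
    # Hypothetical second record that reuses an existing key value.
    conn.execute(
        "INSERT INTO employee VALUES ('005-10-6321', 'Doe', 'Pat', '1-1-90', 257)"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # The DBMS refuses the insert because the key must identify one record only.
    duplicate_rejected = True

print(duplicate_rejected)  # True
```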

Page 12: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

The Traditional Approach

The traditional approach
• Separate files are created and stored for each application program

Schematic

Page 13: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

[Schematic: the traditional approach, in three columns: Data files, Application programs, Users. Each application (Payroll, Invoicing, Inventory control, Management inquiries) has its own data file, its own programs, and its own reports.]

Page 14: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Drawbacks

Data redundancy
• Duplication of data in separate files

Lack of data integrity
• Data integrity is the degree to which the data in any one file is accurate

Program-data dependence
• A situation in which programs and data organized for one application are incompatible with programs and data organized differently for another application

Page 15: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Database Approach

The database approach
• A pool of related data is shared by multiple application programs
• Rather than having separate data files, each application uses a collection of data that is either joined or related in the database

Schematic

Page 16: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

[Schematic: the database approach, in four columns: Database, Interface, Application programs, Users. Payroll data, Inventory data, Invoicing data, and Other data form the database; the database management system is the interface; the Payroll, Inventory, Invoicing, and Other programs each produce Reports for users.]

Page 17: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Advantages

• Improved strategic use of corporate data
• Reduced data redundancy
• Improved data integrity
• Easier modification and updating
• Data and program independence
• Better access to data and information
• Standardization of data access
• A framework for program development
• Better overall protection of the data
• Shared data and information resources

Page 18: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Disadvantages

• Relatively high cost of purchasing and operating a DBMS in a mainframe operating environment
• Increased cost of specialized staff
• Increased vulnerability

Page 19: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Data Modeling and Database Models (1)

Planned data redundancy
• A way of organizing data in which the logical database design is altered so that certain data entities are combined
• Summary totals are carried in the data records rather than calculated from elemental data
• Some data attributes are repeated in more than one data entity to improve database performance
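The idea of carrying summary totals in the records can be sketched with a few lines of Python. The order data below is hypothetical and exists only to contrast the two designs: computing a total from elemental data on every request versus storing the total redundantly in the record.

```python
# Hypothetical order lines (the elemental data), invented for illustration.
order_lines = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 1, "amount": 5.5},
    {"order_id": 2, "amount": 12.0},
]

def total_from_lines(order_id):
    """Without planned redundancy: recompute the total from elemental data."""
    return sum(line["amount"] for line in order_lines if line["order_id"] == order_id)

# With planned redundancy: the summary total is stored in the order record.
# It is faster to read, but must be updated whenever an order line changes.
orders = {1: {"total": 25.5}, 2: {"total": 12.0}}

print(total_from_lines(1), orders[1]["total"])  # 25.5 25.5
```

The trade-off is the usual one: reads become cheaper, while every update has to keep the stored total consistent with the underlying lines.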

Page 20: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Data Modeling and Database Models (2)

Data model
• A map or diagram of entities and their relationships

Enterprise data modeling
• Data modeling done at the level of the entire organization

Entity-relationship (ER) diagrams
• A data model that uses basic graphical symbols to show the organization of and relationships between data

Page 21: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Example: Entity-Relationship (ER) Diagram for a Customer Ordering Database

Schematic

Page 22: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

[Schematic: ER diagram. Entities: Customer, Order, Product. Relationship shown: 1:N (one-to-many). Attributes: Identification number, Last name, First name; Identification number, Name, Colour.]

Page 23: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Hierarchical Database Model

Hierarchical database model
• A data model in which data are organized in a top-down, or inverted tree, structure

Schematic

Page 24: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

[Schematic: inverted tree with Project 1 at the top, Departments A, B, and C beneath it, and Employees 1 to 6 at the bottom level.]

Page 25: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Network Data Model

Network data model
• An expansion of the hierarchical database model with an owner-member relationship in which a member may have many owners

[Schematic: Projects 1 and 2 as owners above Departments A, B, and C as members.]
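The difference between the two models can be sketched with plain Python dictionaries. The assignment of departments to projects below is assumed purely for illustration; the slides do not say which department belongs to which project.

```python
# Hierarchical model: every member has exactly one owner (one parent).
hierarchical_parent = {
    "Department A": "Project 1",
    "Department B": "Project 1",
    "Department C": "Project 1",
}

# Network model: a member may have many owners.
# These ownership links are assumptions made for this sketch.
network_owners = {
    "Department A": ["Project 1"],
    "Department B": ["Project 1", "Project 2"],
    "Department C": ["Project 2"],
}

# Members that could not be represented in a strict hierarchy.
shared = [member for member, owners in network_owners.items() if len(owners) > 1]
print(shared)  # ['Department B']
```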

Page 26: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Relational Data Model

Relational data model
• All data elements are placed in two-dimensional tables, called relations, that are the logical equivalent of files

Schematic

Page 27: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Data Table 1: Project Table

Project Number   Description     Dept. Number
155              Payroll         257
498              Widgets         632
226              Sales manager   598

Data Table 2: Department Table

Dept. Number   Dept. Name      Manager SSN
257            Accounting      421-55-9993
632            Manufacturing   765-00-3192
598            Marketing       098-40-1370

Data Table 3: Manager Table

SSN           Last Name   First Name   Hire Date   Dept. Number
005-10-6321   Johns       Francine     10-7-65     257
549-77-1001   Buckley     Bill         2-17-79     650
098-40-1370   Fiske       Steven       1-5-85      598

Page 28: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Relational Database Terminology

Selecting
• Data manipulation that eliminates rows according to certain criteria

Projecting
• Data manipulation that eliminates columns in a table

Joining
• Data manipulation that combines two or more tables

Linking
• Relating tables in a relational database together
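The three manipulations map directly onto SQL. The following sketch uses Python's sqlite3 module with the project and department data from the relational tables in these slides; the table and column names (project, department, dept_no, and so on) are illustrative choices, and the Manager SSN column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project (project_no INTEGER, description TEXT, dept_no INTEGER)")
conn.execute("CREATE TABLE department (dept_no INTEGER, dept_name TEXT)")
conn.executemany("INSERT INTO project VALUES (?, ?, ?)", [
    (155, "Payroll", 257), (498, "Widgets", 632), (226, "Sales manager", 598),
])
conn.executemany("INSERT INTO department VALUES (?, ?)", [
    (257, "Accounting"), (632, "Manufacturing"), (598, "Marketing"),
])

# Selecting: eliminate rows that do not meet a criterion.
selected = conn.execute("SELECT * FROM project WHERE dept_no = 598").fetchall()

# Projecting: eliminate columns, keeping only the ones named.
projected = conn.execute("SELECT dept_name FROM department ORDER BY dept_no").fetchall()

# Joining: combine two tables on a column they share.
joined = conn.execute(
    "SELECT p.description, d.dept_name FROM project p "
    "JOIN department d ON p.dept_no = d.dept_no ORDER BY p.project_no"
).fetchall()

print(selected)   # [(226, 'Sales manager', 598)]
print(projected)  # [('Accounting',), ('Marketing',), ('Manufacturing',)]
print(joined)
```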

Page 29: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Linking Data Tables to Answer an Inquiry

Schematic

Page 30: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Project Number Description Dept. Number

155 Payroll 257

498 Widgets 632

226 Sales manager 598

Dept. Number Dept. Name Manager SSN

257 Accounting 421-55-9993

632 Manufacturing 765-00-3192

598 Marketing 098-40-1370

SSN Last Name First Name Hire Date Dept. Number

005-10-6321 Johns Francine 10-7-65 257

549-77-1001 Buckley Bill 2-17-79 650

098-40-1370 Fiske Steven 1-5-85 598
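As a sketch of how linked tables answer an inquiry, suppose the question is "Who manages the department responsible for the project described as Sales manager?" (this inquiry is an assumption; the slide does not state one). The project row leads to a department through Dept. Number, and the department row leads to a manager through Manager SSN. Only the rows needed for this inquiry are loaded, and the table and column names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (project_no INTEGER, description TEXT, dept_no INTEGER);
CREATE TABLE department (dept_no INTEGER, dept_name TEXT, manager_ssn TEXT);
CREATE TABLE manager (ssn TEXT, last_name TEXT, first_name TEXT,
                      hire_date TEXT, dept_no INTEGER);
INSERT INTO project VALUES (226, 'Sales manager', 598);
INSERT INTO department VALUES (598, 'Marketing', '098-40-1370');
INSERT INTO manager VALUES ('098-40-1370', 'Fiske', 'Steven', '1-5-85', 598);
""")

answer = conn.execute("""
    SELECT m.last_name, m.first_name
    FROM project p
    JOIN department d ON d.dept_no = p.dept_no   -- link 1: Dept. Number
    JOIN manager m ON m.ssn = d.manager_ssn      -- link 2: Manager SSN
    WHERE p.description = 'Sales manager'
""").fetchone()

print(answer)  # ('Fiske', 'Steven')
```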

Page 31: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Building and Modifying a Relational Database

Using Microsoft Access

Screen snap

Page 32: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Page 33: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Schemas and Subschemas

Schema
• A description of the entire database

Subschema
• A file that contains a description of a subset of the database and identifies which users can perform modifications on the data items in that subset

Schematic

Page 34: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

[Schematic: the DBMS holds the schema; Subschemas A, B, and C are drawn from it and serve Users 1 to 5.]

Page 35: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Schema ‘Discussion’

Pronounced skee-ma, the structure of a database system, described in a formal language supported by the database management system (DBMS). In a relational database, the schema defines the tables, the fields in each table, and the relationships between fields and tables.

Schemas are generally stored in a data dictionary. Although a schema is defined in text database language, the term is often used to refer to a graphical depiction of the database structure.

Page 36: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Data Definition Language

Data Definition Language (DDL)
• A collection of instructions and commands used to define and describe data and data relationships in a specific database

Schematic

Page 37: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

SCHEMA DESCRIPTION
  SCHEMA NAME IS XXXX
  AUTHOR XXXX
  DATE XXXX

FILE DESCRIPTION
  FILE NAME IS XXXX
  ASSIGN XXXX
  FILE NAME IS XXXX
  ASSIGN XXXX

AREA DESCRIPTION
  AREA NAME IS XXXX

RECORD DESCRIPTION
  RECORD NAME IS XXXX
  RECORD ID IS XXXX
  LOCATION MODE IS XXXX
  WITHIN XXX AREA FROM XXXX THRU XXXX

SET DESCRIPTION
  SET NAME IS XXXX
  ORDER IS XXXX
  MODE IS XXXX
  MEMBER IS XXXX
  ...
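The schema description above uses an older, network-style DDL. For comparison only, here is a minimal sketch of the same idea in SQL's DDL, run through Python's sqlite3 module. It is a different DDL from the one on the slide, and the table and column names are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statements define structure and relationships; no data is stored yet.
conn.executescript("""
CREATE TABLE department (
    dept_no   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE project (
    project_no  INTEGER PRIMARY KEY,
    description TEXT,
    dept_no     INTEGER REFERENCES department (dept_no)
);
""")

# SQLite keeps the resulting definitions in its own catalog, sqlite_master.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]
print(tables)  # ['department', 'project']
```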

Page 38: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Data Dictionary

Data Dictionary
• A detailed description of all data used in the database

Schematic

Page 39: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

NORTHWESTERN MANUFACTURING

PREPARED BY: D. BORDWELL
DATE: 04 AUGUST
APPROVED BY: J. EDWARDS
DATE: 13 OCTOBER
VERSION: 3.1
PAGE: 1 OF 1

DATA ELEMENT NAME: PARTNO
DESCRIPTION: INVENTORY PART NUMBER
OTHER NAMES: PTNO
VALUE RANGE: 100 TO 5000
DATA TYPE: NUMERIC
POSITIONS: 4 POSITIONS OR COLUMNS
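A data dictionary entry like the PARTNO example above can also drive validation. The sketch below transcribes that entry into a Python dictionary and checks values against it; the dictionary layout and the is_valid helper are hypothetical, written only to illustrate the idea.

```python
# Data dictionary entry transcribed from the PARTNO example; the layout is illustrative.
PARTNO = {
    "name": "PARTNO",
    "other_names": ["PTNO"],
    "data_type": int,            # NUMERIC
    "value_range": (100, 5000),  # 100 TO 5000
    "positions": 4,              # 4 positions or columns
}

def is_valid(value, entry):
    """Check a value against the type, range, and width in a dictionary entry."""
    low, high = entry["value_range"]
    return (
        isinstance(value, entry["data_type"])
        and low <= value <= high
        and len(str(value)) <= entry["positions"]
    )

print(is_valid(1234, PARTNO), is_valid(99, PARTNO))  # True False
```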

Page 40: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Data Dictionary Features

• Provide a standard definition of terms and data elements
• Assist programmers in designing and writing programs
• Simplify database modification
• Reduce data redundancy
• Increase data reliability
• Faster program development
• Easier modification of data and information

Page 41: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Logical and Physical Access Paths

Logical access path (LAP)
• Application requires information from the DBMS

Physical access path (PAP)
• DBMS accesses a storage device to retrieve data

Schematic

Page 42: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

[Schematic: management inquiries, other software, and application programs reach the DBMS over the logical access path (LAP); the DBMS reaches the data on storage devices over the physical access path (PAP).]

Page 43: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Manipulating Data

Concurrency control
• A method of dealing with a situation in which two or more people need to access the same record in a database at the same time

Data manipulation language (DML)
• The commands that are used to manipulate the data in a database

Structured query language (SQL)
• A standardized data manipulation language
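The slides do not name a particular concurrency control method. Locking is one common approach, and the following Python sketch illustrates it with threads standing in for users and a dictionary standing in for a shared record (both are assumptions made for this example).

```python
import threading

record = {"balance": 0}         # hypothetical shared record
record_lock = threading.Lock()  # one lock guarding that record

def deposit(times):
    for _ in range(times):
        # Only one thread at a time may read, modify, and write the record.
        with record_lock:
            record["balance"] += 1

threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(record["balance"])  # 40000
```

Because every update happens while the lock is held, no update is lost even though four threads work on the same record.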

Page 44: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Structured Query Language (SQL)

“Invented” at IBM’s Almaden Research Centre (San Jose, CA) in the 1970s

E.g.,

Select all (“*”) columns from the EMPLOYEE table in which the JOB_CLASSIFICATION field is equal to “C2”

SELECT * FROM EMPLOYEE WHERE JOB_CLASSIFICATION = 'C2'
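To try the statement, it can be run against a small in-memory table with Python's sqlite3 module. The NAME column and the three rows are hypothetical; only the EMPLOYEE table name and the JOB_CLASSIFICATION field come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-column EMPLOYEE table with made-up job classifications.
conn.execute("CREATE TABLE EMPLOYEE (NAME TEXT, JOB_CLASSIFICATION TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [("Johns", "C2"), ("Buckley", "B1"), ("Fiske", "C2")],
)

# SQL string literals take straight single quotes, not typographic quotes.
rows = conn.execute(
    "SELECT * FROM EMPLOYEE WHERE JOB_CLASSIFICATION = 'C2'"
).fetchall()
print(rows)
```

The query returns every column of the two rows whose classification is C2.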

Page 45: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

SQL ‘Discussion’ (1)

Abbreviation of structured query language, and pronounced either see-kwell or as separate letters. SQL is a standardized query language for requesting information from a database. The original version called SEQUEL (structured English query language) was designed by an IBM research center in 1974 and 1975. SQL was first introduced as a commercial database system in 1979 by Oracle Corporation.

Historically, SQL has been the favorite query language for database management systems running on minicomputers and mainframes.

Page 46: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

SQL ‘Discussion’ (2)

Increasingly, however, SQL is being supported by PC database systems because it supports distributed databases (databases that are spread out over several computer systems). This enables several users on a local-area network to access the same database simultaneously.

Although there are different dialects of SQL, it is nevertheless the closest thing to a standard query language that currently exists. In 1986, ANSI approved a rudimentary version of SQL as the official standard, but most versions of SQL since then have included many

Page 47: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

SQL ‘Discussion’ (3)

extensions to the ANSI standard. ANSI updated the standard in 1992; that revision is commonly known as SQL-92.

Page 48: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Database Output

Screen snap

Page 49: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Page 50: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Popular Database Management Systems for End Users

• Microsoft Access 98
• Lotus Approach 98
• Inprise (formerly Borland) dBASE

DBMS Selection Criteria
• Database size
• Number of concurrent users
• Performance
• Integration
• Features
• The vendor
• Cost

Page 51: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Distributed Databases

Distributed database
• A database in which the actual data may be spread across several smaller databases connected via telecommunications devices

‘Pretty’ picture

Page 52: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Page 53: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Data Warehouse

Data warehouse
• A relational database management system designed specifically to support management decision making
• Current evolution of Decision Support Systems (DSSs)

Data mart
• A subset of a data warehouse for small and medium-size businesses or departments within larger companies

Schematic

Page 54: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

[Schematic: relational databases, hierarchical databases, network databases, flat files, and spreadsheets feed a data extraction process and a data cleanup process, which load the data warehouse; query and analysis tools then provide end user access.]

Page 55: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Designing a Customer Data Warehouse

• Sharply define your goals and objectives before you build the warehouse
• Choose the software that best fits your goals
• Determine who/what should be in the database
• Develop a plan
• Measure results

Page 56: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Data Mining Applications

Data mining
• The automated discovery of patterns and relationships in a data warehouse

Data mining applications

• Market segmentation

• Customer queries

• Fraud detection

• Direct marketing

• Market basket analysis

• Trend analysis
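Market basket analysis is the easiest of these applications to sketch in a few lines. The example below counts how often pairs of items are bought together; the baskets are hypothetical, and real data mining tools use far more elaborate techniques than this simple pair count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "eggs", "milk"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

print(pair_counts.most_common(1))  # [(('bread', 'milk'), 3)]
```

The most frequent pair is the kind of pattern a retailer might use for shelf placement or direct marketing.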

Page 57: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

On-Line Analytical Processing (OLAP)

On-line analytical processing (OLAP)
• Access to multidimensional databases providing managerially useful display techniques
• Now used to store and deliver data warehouse information

Data warehouse and OLAP
• Provides top-down, query-driven analysis

Data mining
• Provides bottom-up, discovery-driven analysis

Page 58: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Open Database Connectivity (ODBC)

Open database connectivity (ODBC)
• A set of standards that ensures software written to comply with these standards can be used with any ODBC-compliant database

Schematic

Page 59: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

[Schematic: ODBC Import, ODBC Link, and ODBC Export connect dBASE, an Access database, a Lotus 1-2-3 spreadsheet, and a Paradox database.]

Page 60: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

Object-Relational Database Management Systems (ORDBMS)

Object-relational database management system (ORDBMS)
• A DBMS capable of manipulating audio, video, and graphical data

Hypertext
• Users can search and manipulate alphanumeric data in an unstructured way

Hypermedia
• Allows businesses to search and manipulate multimedia forms of data

Spatial data technology
• Use of an object-relational database to store and access data according to the location it describes and to permit spatial queries and analysis

Page 61: ITEC 1010 Information and Organizations Chapter 5 Organizing Data and Information

End of Chapter 5

Chapter 6