INFORMATION TECHNOLOGY IN BUSINESS AND SOCIETY SESSION 15 – RELATIONAL DATABASES SEAN J. TAYLOR

ADMINISTRATIVIA

• Assignment 3: Due tonight at midnight (AdSense in a week)

• Midterm back on Thursday

• Database tutorial led by Varun

• Assignment 4: Posted Thursday, due Friday 3/30

ADMINISTRATIVIA II

• Groups: I will email a form for you to list up to five classmates, and then I will choose the groups. You can list 0-5.

• 2-way feedback:

1. I will send you an anonymous survey.

2. I will send you a brief summary of your current grade.

LEARNING OBJECTIVES

1. Understand what relational databases are (or, why text files and Excel are not enough)

2. Identify and distinguish between the following parts of a relational database: tables, records, fields, field values

3. Understand three types of anomalies that arise from un-normalized data

4. Understand how primary keys and foreign keys are used to link tables.

WHY ARE DATA VALUABLE?

RELATIONAL DATABASES

• Store data (insert)

• Retrieve data (query)

• Software applications

• Operations

• Analyze data (reporting capabilities)
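
The three uses listed above (store, retrieve, analyze) can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module rather than the MS Access setup used in class; the table name `orders` and its columns are assumptions made up for the example.

```python
import sqlite3

# In-memory database; the table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_no INTEGER PRIMARY KEY, customer TEXT, price REAL)"
)

# Store data (insert)
conn.executemany(
    "INSERT INTO orders (order_no, customer, price) VALUES (?, ?, ?)",
    [(1, "Bezos", 25.00), (5, "Gates", 29.95), (6, "Bezos", 29.95)],
)

# Retrieve data (query)
bezos_orders = conn.execute(
    "SELECT order_no, price FROM orders WHERE customer = ? ORDER BY order_no",
    ("Bezos",),
).fetchall()

# Analyze data (report): number of orders and total spend per customer
report = conn.execute(
    "SELECT customer, COUNT(*), ROUND(SUM(price), 2) "
    "FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

print(bezos_orders)
print(report)
```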

WHY NOT STORE DATA LIKE THIS?

Order# | Date | Customer ID | Last Name | First Name | Address | ISBN | Book Name | Author | Price
1 | 9/1/03 | C1001 | Bezos | Jeff | 1 Amazon Plaza | #0465039138 | Code and other laws of cyberspace | Lessig, Lawrence | $25.00
2 | 9/2/03 | C1004 | Sproull | Lee | Dean's Office, Stern School, New York | #1573928895 | Digital Copyright: Protecting Intellectual Property on the Internet | Litman, Jessica | $55.00
3 | 9/3/03 | C1002 | Student | Pat | Tisch LC-12, New York | #0072952849 | MIS in the Information Age | Haag, Stephen | $98.75
4 | 9/4/03 | C1003 | Gates | Bill | Microsoft Corporation, Redmond | #0738206679 | Linked: The New Science of Networks | Barabasi, Albert-Laszlo | $34.95
5 | 9/5/03 | C1003 | Gates | Bill | Microsoft Corporation, Redmond | #0738206083 | Smart Mobs: The Next Social Revolution | Rheingold, Howard | $29.95
6 | 9/6/03 | C1001 | Bezos | Jeff | 1 Amazon Plaza | #0738206083 | Smart Mobs: The Next Social Revolution | Rheingold, Howard | $29.95
7 | 9/7/03 | C1002 | Student | Pat | Tisch LC-12, New York | #1573928895 | Digital Copyright: Protecting Intellectual Property on the Internet | Litman, Jessica | $55.00
8 | 9/8/03 | C1001 | Bezos | Jeff | 1 Amazon Plaza | #0738206083 | Smart Mobs: The Next Social Revolution | Rheingold, Howard | $29.95

RELATIONAL DATABASES

• Databases that use a series of logically related two-dimensional tables to store their information

• Tables are made up of fields and records, which in turn contain field values

Student Table

Last Name | SS# | DOB | Major
Smith | 100201122 | 06/11/84 | IS
Kim | 200202222 | 1/1/85 | FIN
Davis | 300201232 | 12/31/81 | MKT
Pat | 999132212 | 3/3/88 | ACC

Callouts in the figure: Field (a column), Record (a row), Field value (a single cell)

RELATIONAL DATABASES

Relational Database -> Tables -> Records -> Fields -> Field values -> Bytes, bits

Student Table

Last Name | SS# | DOB | Major
Smith | 100201122 | 06/11/84 | IS
Kim | 200202222 | 1/1/85 | FIN
Davis | 300201232 | 12/31/81 | MKT
Pat | 999132212 | 3/3/88 | ACC

Callouts in the figure: Field (a column), Record (a row), Field value (a single cell)

ADVANTAGES

1. Consistency

• We can restrict the values of certain fields (e.g. dates, integers)

• We can impose other kinds of constraints (all costs must be positive, last names must be included, orders must have addresses)

• Data look the same to all users at the same time.

2. Centralization

• Many different users can edit and view the data simultaneously. Efficient sharing of information.

3. Efficient Querying

• SQL and other query languages can be used to create complex reports quickly
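
The consistency constraints named above (positive costs, required last names, required addresses) could be declared roughly as follows. This is a sketch using Python's built-in sqlite3 module; the `orders` table, its columns, and the helper `try_insert` are hypothetical names chosen for the example, and the rejected rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_no  INTEGER PRIMARY KEY,
        last_name TEXT NOT NULL,                   -- last names must be included
        address   TEXT NOT NULL,                   -- orders must have addresses
        price     REAL NOT NULL CHECK (price > 0)  -- all costs must be positive
    )
""")

def try_insert(row):
    """Return True if the database accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on failure
            conn.execute("INSERT INTO orders VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

valid_row = try_insert((1, "Bezos", "1 Amazon Plaza", 25.00))
negative_price = try_insert((2, "Gates", "Microsoft Corporation, Redmond", -5.00))
missing_name = try_insert((3, None, "Tisch LC-12, New York", 98.75))
stored_rows = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

print(valid_row, negative_price, missing_name, stored_rows)
```

Because the rules live in the table definition, they apply to every user and every application that writes to the table, which is what a spreadsheet cannot easily guarantee.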

PROBLEMS WITH EXCEL?

When should you use a database instead of Excel?

Data quality problems:

– Insertion anomalies

– Deletion anomalies

– Update anomalies

Should we just create multiple workbooks in Excel?

– The real power of a database: Querying

– How would you answer the following question in Excel?

– Find customers who spend on average $50 per book order, who live on the West Coast or the East Coast (but not in the Midwest), and whose annual income is at least $150K
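
For comparison, here is roughly how that question might look as a single SQL query, run through Python's built-in sqlite3 module. Everything about the schema is an assumption: the slides do not define `region` or `annual_income` columns, the sample customers are invented, and "on average $50" is read here as "an average of at least $50".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id   INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        region        TEXT NOT NULL,   -- hypothetical column
        annual_income REAL NOT NULL    -- hypothetical column
    );
    CREATE TABLE orders (
        order_no    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        price       REAL NOT NULL
    );
    INSERT INTO customers VALUES
        (1, 'Customer A', 'West Coast', 200000),
        (2, 'Customer B', 'Midwest',    300000),
        (3, 'Customer C', 'East Coast', 160000),
        (4, 'Customer D', 'East Coast',  90000);
    INSERT INTO orders VALUES
        (1, 1, 60.0), (2, 1, 50.0),   -- Customer A averages 55
        (3, 2, 100.0),                -- Customer B lives in the Midwest
        (4, 3, 30.0), (5, 3, 40.0),   -- Customer C averages 35
        (6, 4, 80.0);                 -- Customer D earns under 150K
""")

matches = conn.execute("""
    SELECT c.name
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    WHERE c.region IN ('West Coast', 'East Coast')
      AND c.annual_income >= 150000
    GROUP BY c.customer_id, c.name
    HAVING AVG(o.price) >= 50
    ORDER BY c.name
""").fetchall()

print(matches)
```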

INSERTION ANOMALIES

• Inability to insert a piece of information about an object without having to insert a (bogus) piece of information about something else

• Example: Adding a new customer/book before it is ordered

How can you add the book “Harry Potter” to the file below?

Order# | Date | Customer ID | Last Name | First Name | Address | ISBN | Book Name | Author | Price
1 | 9/1/03 | C1001 | Bezos | Jeff | 1 Amazon Plaza | #0465039138 | Code and other laws of cyberspace | Lessig, Lawrence | $25.00
2 | 9/2/03 | C1004 | Sproull | Lee | Dean's Office, Stern School, New York | #1573928895 | Digital Copyright: Protecting Intellectual Property on the Internet | Litman, Jessica | $55.00
3 | 9/3/03 | C1002 | Student | Pat | Tisch LC-12, New York | #0072952849 | MIS in the Information Age | Haag, Stephen | $98.75
4 | 9/4/03 | C1003 | Gates | Bill | Microsoft Corporation, Redmond | #0738206679 | Linked: The New Science of Networks | Barabasi, Albert-Laszlo | $34.95
5 | 9/5/03 | C1003 | Gates | Bill | Microsoft Corporation, Redmond | #0738206083 | Smart Mobs: The Next Social Revolution | Rheingold, Howard | $29.95
6 | 9/6/03 | C1001 | Bezos | Jeff | 1 Amazon Plaza | #0738206083 | Smart Mobs: The Next Social Revolution | Rheingold, Howard | $29.95
7 | 9/7/03 | C1002 | Student | Pat | Tisch LC-12, New York | #1573928895 | Digital Copyright: Protecting Intellectual Property on the Internet | Litman, Jessica | $55.00
8 | 9/8/03 | C1001 | Bezos | Jeff | 1 Amazon Plaza | #0738206083 | Smart Mobs: The Next Social Revolution | Rheingold, Howard | $29.95
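
One way to see the insertion anomaly in practice is to try it. The sketch below uses Python's built-in sqlite3 module; the table names `flat_orders` and `books` are assumptions, only a few of the file's columns are modeled, and the ISBN is a placeholder rather than a real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Flat layout: every row is an order, so the customer column is required.
conn.execute("""
    CREATE TABLE flat_orders (
        order_no    INTEGER PRIMARY KEY,
        customer_id TEXT NOT NULL,
        isbn        TEXT NOT NULL,
        book_name   TEXT NOT NULL
    )
""")

# Separate layout: a book can exist without any order.
conn.execute("CREATE TABLE books (isbn TEXT PRIMARY KEY, book_name TEXT NOT NULL)")

new_book = ("#0000000000", "Harry Potter")  # placeholder ISBN, not a real one

try:
    # Nobody has ordered the book yet, so there is no customer to record.
    conn.execute(
        "INSERT INTO flat_orders (customer_id, isbn, book_name) VALUES (?, ?, ?)",
        (None, *new_book),
    )
    flat_insert_ok = True
except sqlite3.IntegrityError:
    flat_insert_ok = False

# With a table of its own, the book can be added without a bogus order.
conn.execute("INSERT INTO books (isbn, book_name) VALUES (?, ?)", new_book)
book_count = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]

print(flat_insert_ok, book_count)
```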

DELETION ANOMALIES

• The loss of a piece of information about one object when a piece of information about a different object is deleted

• Example: Deleting order 2 => deleting customer Lee Sproull

• Example: Deleting order 1 => deleting book “Code…”

Order# | Date | Customer ID | Last Name | First Name | Address | ISBN | Book Name | Author | Price
1 | 9/1/03 | C1001 | Bezos | Jeff | 1 Amazon Plaza | #0465039138 | Code and other laws of cyberspace | Lessig, Lawrence | $25.00
2 | 9/2/03 | C1004 | Sproull | Lee | Dean's Office, Stern School, New York | #1573928895 | Digital Copyright: Protecting Intellectual Property on the Internet | Litman, Jessica | $55.00
3 | 9/3/03 | C1002 | Student | Pat | Tisch LC-12, New York | #0072952849 | MIS in the Information Age | Haag, Stephen | $98.75
4 | 9/4/03 | C1003 | Gates | Bill | Microsoft Corporation, Redmond | #0738206679 | Linked: The New Science of Networks | Barabasi, Albert-Laszlo | $34.95
5 | 9/5/03 | C1003 | Gates | Bill | Microsoft Corporation, Redmond | #0738206083 | Smart Mobs: The Next Social Revolution | Rheingold, Howard | $29.95
6 | 9/6/03 | C1001 | Bezos | Jeff | 1 Amazon Plaza | #0738206083 | Smart Mobs: The Next Social Revolution | Rheingold, Howard | $29.95
7 | 9/7/03 | C1002 | Student | Pat | Tisch LC-12, New York | #1573928895 | Digital Copyright: Protecting Intellectual Property on the Internet | Litman, Jessica | $55.00
8 | 9/8/03 | C1001 | Bezos | Jeff | 1 Amazon Plaza | #0738206083 | Smart Mobs: The Next Social Revolution | Rheingold, Howard | $29.95
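
The two deletion examples can be reproduced on a cut-down copy of the file. This is a sketch with Python's built-in sqlite3 module; `flat_orders` and the helper `distinct` are hypothetical names, and only four of the eight orders and two of the columns are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flat_orders (order_no INTEGER PRIMARY KEY, last_name TEXT, book_name TEXT)"
)
conn.executemany(
    "INSERT INTO flat_orders VALUES (?, ?, ?)",
    [
        (1, "Bezos", "Code and other laws of cyberspace"),
        (2, "Sproull", "Digital Copyright: Protecting Intellectual Property on the Internet"),
        (6, "Bezos", "Smart Mobs: The Next Social Revolution"),
        (7, "Student", "Digital Copyright: Protecting Intellectual Property on the Internet"),
    ],
)

def distinct(column):
    """Return the sorted distinct values of one column of flat_orders."""
    # The column name comes only from the fixed calls below, never from user input.
    rows = conn.execute(f"SELECT DISTINCT {column} FROM flat_orders ORDER BY {column}")
    return [value for (value,) in rows]

# Deleting two orders also erases everything else stored only in those rows.
conn.execute("DELETE FROM flat_orders WHERE order_no IN (1, 2)")
customers_left = distinct("last_name")
books_left = distinct("book_name")

print(customers_left)  # Sproull is gone: order 2 was the only row naming that customer
print(books_left)      # the "Code..." book is gone: order 1 was its only row
```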

UPDATE ANOMALIES

• A need to change the same piece of information about an object multiple times

• Example: Changing Jeff Bezos's address in order 1 leaves orders 6 and 8 unchanged…

Order# | Date | Customer ID | Last Name | First Name | Address | ISBN | Book Name | Author | Price
1 | 9/1/03 | C1001 | Bezos | Jeff | 1 Amazon Plaza | #0465039138 | Code and other laws of cyberspace | Lessig, Lawrence | $25.00
2 | 9/2/03 | C1004 | Sproull | Lee | Dean's Office, Stern School, New York | #1573928895 | Digital Copyright: Protecting Intellectual Property on the Internet | Litman, Jessica | $55.00
3 | 9/3/03 | C1002 | Student | Pat | Tisch LC-12, New York | #0072952849 | MIS in the Information Age | Haag, Stephen | $98.75
4 | 9/4/03 | C1003 | Gates | Bill | Microsoft Corporation, Redmond | #0738206679 | Linked: The New Science of Networks | Barabasi, Albert-Laszlo | $34.95
5 | 9/5/03 | C1003 | Gates | Bill | Microsoft Corporation, Redmond | #0738206083 | Smart Mobs: The Next Social Revolution | Rheingold, Howard | $29.95
6 | 9/6/03 | C1001 | Bezos | Jeff | 1 Amazon Plaza | #0738206083 | Smart Mobs: The Next Social Revolution | Rheingold, Howard | $29.95
7 | 9/7/03 | C1002 | Student | Pat | Tisch LC-12, New York | #1573928895 | Digital Copyright: Protecting Intellectual Property on the Internet | Litman, Jessica | $55.00
8 | 9/8/03 | C1001 | Bezos | Jeff | 1 Amazon Plaza | #0738206083 | Smart Mobs: The Next Social Revolution | Rheingold, Howard | $29.95
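
The address example can be tried on the three Bezos orders. The sketch below uses Python's built-in sqlite3 module; `flat_orders`, `addresses_for`, and the new address are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flat_orders (order_no INTEGER PRIMARY KEY, customer_id TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO flat_orders VALUES (?, ?, ?)",
    [
        (1, "C1001", "1 Amazon Plaza"),
        (6, "C1001", "1 Amazon Plaza"),
        (8, "C1001", "1 Amazon Plaza"),
    ],
)

def addresses_for(customer_id):
    """Return the sorted distinct addresses recorded for one customer."""
    rows = conn.execute(
        "SELECT DISTINCT address FROM flat_orders WHERE customer_id = ? ORDER BY address",
        (customer_id,),
    )
    return [address for (address,) in rows]

NEW_ADDRESS = "2 Example Street"  # made-up value for illustration

# Editing a single order row leaves the other copies stale.
conn.execute("UPDATE flat_orders SET address = ? WHERE order_no = 1", (NEW_ADDRESS,))
after_partial_update = addresses_for("C1001")

# Every copy has to be changed before the data agree again.
conn.execute(
    "UPDATE flat_orders SET address = ? WHERE customer_id = ?", (NEW_ADDRESS, "C1001")
)
after_full_update = addresses_for("C1001")

print(after_partial_update)
print(after_full_update)
```

After the first update the same customer has two different addresses on file, and nothing in the flat layout says which one is current.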

MODELING DATA WITH ENTITY-RELATIONSHIP DIAGRAMS

ENTITY RELATIONSHIP DIAGRAM

• The aim of an ERD is to model the data within the Information System.

• Provides a CONCEPTUAL DATA MODEL: a concept of the system, independent of implementation

1. What data should be stored?

2. What relationships exist between items of data?

ENTITIES

An entity can be uniquely identified.

Organizations collect and store data about entities:

• If a bank stores data about you, you are an entity.

• If a business stores a piece of paper called an invoice, the invoice is an entity.

• If a library stores data about a particular book, the book is an entity.

An actual, real thing or person about which data might be stored is referred to as an entity.

RELATIONSHIPS

• Entities are associated with each other via relationships.

• A relationship is a named association between two or more entity types:

Player -- Plays for --> Team

DEFINING RELATIONSHIPS

Entity-relationship (E-R) diagram: a graphic method of representing entity classes and their relationships.

• Rectangle – entity class

• Dotted line – relationship

• | – single relationship

• O – zero or optional relationship

• Crow’s foot – multiple relationship

The types of relationships reflect the business rules applicable to the entities

SIMPLE HOSPITAL EXAMPLE

In a hospital system, each ward has many patients who are cared for by nurses assigned to the specific ward. Patients may require treatment by more than one specialist doctor. A patient belongs to only one ward.

SIMPLE HOSPITAL SYSTEM

• Ward has many patients (1:N)

• Patients are cared for by nurses (N:M)

• Ward has many nurses assigned (1:N)

• Patients require treatment by one or more doctors (N:M)

SIMPLE HOSPITAL SYSTEM ERD

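[ERD figure. Entities: WARD, NURSE, PATIENT, DOCTOR. Relationships: WARD accommodates PATIENT; WARD has assigned NURSE; NURSE cares for PATIENT; DOCTOR treats PATIENT.]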
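
As a rough sketch of where an ERD like this one leads, the hospital relationships could be turned into tables as follows, using Python's built-in sqlite3 module. The table and column names are assumptions, and the sample ward, patient, and doctors are invented. The usual convention is that a 1:N relationship becomes a foreign key on the "many" side, while an N:M relationship gets a linking table of its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.executescript("""
    CREATE TABLE ward   (ward_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE doctor (doctor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- 1:N: the "many" side carries a foreign key to the "one" side.
    CREATE TABLE nurse (
        nurse_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        ward_id  INTEGER NOT NULL REFERENCES ward(ward_id)
    );
    CREATE TABLE patient (
        patient_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        ward_id    INTEGER NOT NULL REFERENCES ward(ward_id)
    );

    -- N:M: a linking table holds one row per related pair.
    CREATE TABLE treats (
        doctor_id  INTEGER NOT NULL REFERENCES doctor(doctor_id),
        patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
        PRIMARY KEY (doctor_id, patient_id)
    );
    CREATE TABLE cares_for (
        nurse_id   INTEGER NOT NULL REFERENCES nurse(nurse_id),
        patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
        PRIMARY KEY (nurse_id, patient_id)
    );

    INSERT INTO ward VALUES (1, 'Ward 1');
    INSERT INTO doctor VALUES (1, 'Doctor A'), (2, 'Doctor B');
    INSERT INTO patient VALUES (1, 'Patient X', 1);
    INSERT INTO treats VALUES (1, 1), (2, 1);
""")

# One patient treated by more than one doctor (N:M).
doctors_for_patient = [
    name
    for (name,) in conn.execute(
        "SELECT d.name FROM treats AS t JOIN doctor AS d ON d.doctor_id = t.doctor_id "
        "WHERE t.patient_id = 1 ORDER BY d.name"
    )
]

# A patient must belong to an existing ward (1:N).
try:
    conn.execute("INSERT INTO patient VALUES (2, 'Patient Y', 99)")
    orphan_patient_ok = True
except sqlite3.IntegrityError:
    orphan_patient_ok = False

print(doctors_for_patient, orphan_patient_ok)
```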

UNIVERSITY DATABASE EXAMPLE

A department has many professors. A professor belongs to only one department. The department offers many different courses, and many professors can teach a single course. A professor can also teach more than one course. Students enroll in many courses, and courses have many students. A course belongs to only one department.

NORMALIZATION

FITTING DATA INTO THE RELATIONAL MODEL

NORMALIZING AMAZON’S DATA

• The process of ensuring that a database can be implemented effectively as a set of two-dimensional tables

• Unlike in Excel, though, the tables are connected

• Prevents insertion, deletion and update anomalies
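
To make the idea concrete, the sketch below splits a few rows of the flat order file into three connected tables, held here as plain Python dictionaries and lists. The three-way split (customers, books, orders) and the choice to treat price as a property of the book are assumptions for illustration, and only some of the file's columns are carried over.

```python
# A few rows of the flat order file (only some of its columns are kept).
flat_rows = [
    # order_no, date, customer_id, last_name, first_name, isbn, book_name, price
    (1, "9/1/03", "C1001", "Bezos", "Jeff", "#0465039138",
     "Code and other laws of cyberspace", 25.00),
    (5, "9/5/03", "C1003", "Gates", "Bill", "#0738206083",
     "Smart Mobs: The Next Social Revolution", 29.95),
    (6, "9/6/03", "C1001", "Bezos", "Jeff", "#0738206083",
     "Smart Mobs: The Next Social Revolution", 29.95),
    (8, "9/8/03", "C1001", "Bezos", "Jeff", "#0738206083",
     "Smart Mobs: The Next Social Revolution", 29.95),
]

customers = {}  # customer_id -> (last_name, first_name), stored once per customer
books = {}      # isbn -> (book_name, price), stored once per book
orders = []     # (order_no, date, customer_id, isbn), one row per order

for order_no, date, customer_id, last_name, first_name, isbn, book_name, price in flat_rows:
    customers[customer_id] = (last_name, first_name)
    books[isbn] = (book_name, price)
    # An order keeps only the IDs that point at the customer and the book.
    orders.append((order_no, date, customer_id, isbn))

print(len(customers), len(books), len(orders))
```

Each customer and each book now appears exactly once, so a change to either is made in one place, and the orders reach them through their IDs.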

CONNECTING TABLES TOGETHER

Primary keys

– A field (or, in some cases, a group of fields) that uniquely identifies each record in a table

– Examples: Customer ID, ISBN, Order#

Foreign keys

– A field that is a primary key in one table and appears in a different table (though not as the primary key)

– Example: Customer ID in Orders

Integrity constraints

– Rules that help ensure data quality

– Example: Each record should have a unique primary key
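
A short sketch of how these three ideas behave in practice, using Python's built-in sqlite3 module. The table names `customers` and `orders` and the helper `rejected` are assumptions for the example; the customer ID C9999 is deliberately one that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.executescript("""
    CREATE TABLE customers (
        customer_id TEXT PRIMARY KEY,   -- primary key
        last_name   TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_no    INTEGER PRIMARY KEY,                              -- primary key
        customer_id TEXT NOT NULL REFERENCES customers(customer_id)   -- foreign key
    );
    INSERT INTO customers VALUES ('C1001', 'Bezos'), ('C1003', 'Gates');
    INSERT INTO orders VALUES (1, 'C1001'), (4, 'C1003'), (6, 'C1001');
""")

def rejected(sql, params):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True

# Integrity constraint: each record needs a unique primary key.
duplicate_key_rejected = rejected(
    "INSERT INTO customers VALUES (?, ?)", ("C1001", "Someone")
)
# Integrity constraint: an order must point at a customer that exists.
unknown_customer_rejected = rejected("INSERT INTO orders VALUES (?, ?)", (9, "C9999"))

# The foreign key lets a query follow each order back to its customer.
orders_with_names = conn.execute(
    "SELECT o.order_no, c.last_name FROM orders AS o "
    "JOIN customers AS c ON c.customer_id = o.customer_id ORDER BY o.order_no"
).fetchall()

print(duplicate_key_rejected, unknown_customer_rejected, orders_with_names)
```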

DATABASE SCHEMA

Summary of the logical structure of your database

– The tables in your database, along with their fields and keys

– The relationships between the tables

[Figure: schema diagram labeling three primary keys and two foreign keys]

NEXT CLASS: SQL

• Do the SQL tutorial at http://sqlzoo.net/ (at least sections 0-3)

• Download the “Facebook” database from Blackboard and make sure you can open it in MS Access

• Interest in OSS databases?

• Bring a laptop if you want.