IST 210: ORGANIZATION OF DATA
Chapter 1. Getting Started



Purpose of a Database
• The purpose of a database is to keep track of things
• Unlike a list or spreadsheet, a database may store information that is more complicated than a simple list


Mini Case
• You are designing our course selection system
• What aspects do you need to store in a record?
  • Student ID, Student Name, Student's Department, Email
  • CourseID, Instructor, CourseName, Location
• What questions (i.e., queries) will users ask?
  • Student: What classes have I registered for this semester?
  • Instructor: How many students are registered, and what are their backgrounds?
• What tool would you use to manage the data?
  • Excel?


Problems with a Simple List

Redundancy

Multiple Themes


Problems with a Simple List: Redundancy

• In a list, each row is intended to stand on its own. As a result, the same information may be entered several times
  • A list of class enrollment may include Student ID, Student Name, Class, Instructor Name, Location, Lecture Time, …
  • If there are 40 students taking IST210, the class information will be entered 40 times.
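The redundancy described above can be sketched with a tiny flat list. This is only an illustration: the three rows, the room, and the lecture time are made-up values, not data from the course.

```python
# Hypothetical flat enrollment list: one row per (student, class) pair.
# Columns: StudentID, StudentName, Class, Instructor, Location, LectureTime
# The room and lecture time are invented for illustration.
enrollment_list = [
    (1, "Bob",   "IST210", "Jessie", "Room A", "MWF 10:10"),
    (2, "Sarah", "IST210", "Jessie", "Room A", "MWF 10:10"),
    (3, "Jim",   "IST210", "Jessie", "Room A", "MWF 10:10"),
]

# Keep only the class-related columns of every row.
class_info = [row[2:] for row in enrollment_list]

# The same class information is stored once per enrolled student.
copies = class_info.count(class_info[0])
distinct = len(set(class_info))
print(f"{copies} stored copies, {distinct} distinct class record")
```

With 40 students instead of 3, the same class record would be stored 40 times.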


Problems with a Simple List: Multiple Themes
• In a list, each row may contain information on more than one theme. As a result, needed information may appear in the list only if information on other themes is also present
  • For example: a list of class registrations may include student information (ID, Name, Department) and course information (ID, Instructor, Location).

Example course row (CourseID | Instructor | CourseName | Location): 210 | Dashun | Organization of Data | 208 IST


List Modification Issues
• Redundancy and multiple themes create modification problems
  • Deletion problems
  • Update problems
  • Insertion problems


List Modification Issues: Insert

Insert: A new student not taking any class

Problem: blank cells for course information
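A minimal sketch of the insertion problem, assuming a flat list with one row per registration. The student "Amy", her ID, and the room are hypothetical.

```python
# Hypothetical flat list. Columns: StudentID, StudentName, CourseID, Instructor, Location
flat_list = [
    (1, "Bob", 210, "Jessie", "Room A"),
]

# A new student who has not registered for anything yet (made-up student).
# There is no course to describe, so the course columns must be left blank.
new_student = (6, "Amy", None, None, None)
flat_list.append(new_student)

blank_cells = sum(value is None for value in new_student)
print(f"Row for {new_student[1]} has {blank_cells} blank cells")
```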


List Modification Issues: Update

Update: IST210 location changed

Problem: Need to update multiple rows
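The update problem can be sketched the same way. The rows and room names below are assumptions chosen only to show that one real-world change touches several rows.

```python
# Hypothetical flat list: the location of course 210 is repeated in every row.
flat_list = [
    {"StudentID": 1, "StudentName": "Bob",   "CourseID": 210, "Location": "Room A"},
    {"StudentID": 2, "StudentName": "Sarah", "CourseID": 210, "Location": "Room A"},
    {"StudentID": 4, "StudentName": "Kate",  "CourseID": 230, "Location": "Room B"},
]

# One change in the real world (210 moves to a new room) ...
rows_changed = 0
for row in flat_list:
    if row["CourseID"] == 210:
        row["Location"] = "Room C"
        rows_changed += 1

# ... requires touching every row that repeats the old location.
print(rows_changed)
```

If even one of those rows is missed, the list holds two different locations for the same course.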


List Modification Issues: Delete

Delete: Kate drops 230

Problem: Information about Kate and about course 230 will be lost!
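A sketch of the deletion problem, assuming Kate is registered only for 230 and is its only student. The instructor name "Mary" is hypothetical.

```python
# Hypothetical flat list. Columns: StudentID, StudentName, CourseID, Instructor
flat_list = [
    (1, "Bob",  210, "Jessie"),
    (4, "Kate", 230, "Mary"),  # assumed: Kate's only class, and the only row for 230
]

# Kate drops 230: the only way to record that is to delete her row.
remaining = [row for row in flat_list if not (row[1] == "Kate" and row[2] == 230)]

# Both themes stored in that row disappear together.
kate_known = any(row[1] == "Kate" for row in remaining)
course_230_known = any(row[2] == 230 for row in remaining)
print(kate_known, course_230_known)
```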


A Long List to Several Small Lists

Two themes: Student, Course

Problem: information loss! Registration information is not in the Student or Course tables


A Long List to Several Small Lists

Two themes: Student, Course

Problem: one cell does NOT allow multiple values. (IMPORTANT! This rule is strictly enforced in a database.)


A Long List to Several Small Lists

Student Entity

Course Entity

Student-Course Relationship

Three themes: two entities and one relationship


A Long List to Several Small Lists

Student

Course

Registration

Key points in splitting:
1. A table must be connected with other table(s) through shared column(s)
   • Student (StudentID) ↔ Registration
   • Course (CourseID) ↔ Registration
2. One cell can only have one value

Revisit previous issues:
• Insert: A new student not taking any class
• Update: IST210 location changed
• Delete: Kate drops 230

Use the above criteria to check whether you split the tables correctly!
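One way to check the split against the three issues is to build the tables and try each operation. The sketch below uses Python's built-in sqlite3 module. The column names follow the Student, Course, and Registration themes, while the sample rows (including "Amy", "Mary", and the room names) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, StudentName TEXT);
CREATE TABLE Course  (CourseID INTEGER PRIMARY KEY, Instructor TEXT, Location TEXT);
-- Registration is connected to the other two tables through shared columns.
CREATE TABLE Registration (
    StudentID INTEGER REFERENCES Student(StudentID),
    CourseID  INTEGER REFERENCES Course(CourseID),
    PRIMARY KEY (StudentID, CourseID)
);
INSERT INTO Student VALUES (1, 'Bob'), (4, 'Kate');
INSERT INTO Course VALUES (210, 'Jessie', 'Room A'), (230, 'Mary', 'Room B');
INSERT INTO Registration VALUES (1, 210), (4, 230);
""")

# Insert: a new student not taking any class needs one Student row, no blank cells.
conn.execute("INSERT INTO Student VALUES (6, 'Amy')")

# Update: the location of 210 is stored once, so exactly one row changes.
updated = conn.execute(
    "UPDATE Course SET Location = 'Room C' WHERE CourseID = 210"
).rowcount

# Delete: Kate drops 230 by removing only the Registration row.
conn.execute("DELETE FROM Registration WHERE StudentID = 4 AND CourseID = 230")
kate_rows = conn.execute(
    "SELECT COUNT(*) FROM Student WHERE StudentName = 'Kate'"
).fetchone()[0]
course_rows = conn.execute(
    "SELECT COUNT(*) FROM Course WHERE CourseID = 230"
).fetchone()[0]
print(updated, kate_rows, course_rows)
```

After the delete, Kate and course 230 are both still on record, which is the behavior the flat list could not provide.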


Relational Databases
• A relational database stores information in tables. Each informational topic is stored in its own table.
• In essence, a relational database will break up a list into several parts, one part for each theme in the list
• A well-formed relational database: criteria to determine whether a database is good enough (no redundancy, no modification issues). We will learn these in Chapter 2


Answer Query: Putting the Pieces Back Together

• In our relational database, we broke apart our list into several tables. Somehow the tables must be joined back together

• In a relational database, to answer a query, tables are joined together using the value of the data


Query Relational Database: Using One Table

Student Table

Course Table

Registration Table

Query 1: How many students take class 210?

Answer: Check the Registration Table to see how many rows have CourseID 210. count = 4
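Query 1 can be sketched as a single-table count using Python's built-in sqlite3 module. The four rows for course 210 use the StudentIDs named in this chapter (1, 5, 2, 3). The remaining rows are assumptions added so that the filter has something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Registration (StudentID INTEGER, CourseID INTEGER)")
conn.executemany(
    "INSERT INTO Registration VALUES (?, ?)",
    [
        (1, 210), (5, 210), (2, 210), (3, 210),  # course 210 registrations
        (2, 220), (4, 220), (4, 230),            # assumed rows for other courses
    ],
)

# Only the Registration table is needed: count the rows whose CourseID is 210.
count_210 = conn.execute(
    "SELECT COUNT(*) FROM Registration WHERE CourseID = ?", (210,)
).fetchone()[0]
print(count_210)  # 4
```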


Query Relational Database: Using Two Tables

Query 2: How many students take the class taught by John?

Student Table

Course Table

Registration Table

Answer:
Step 1. Check the CourseID taught by John in the Course Table. CourseID = 220
Step 2. See how many students are taking the class with CourseID 220 in the Registration Table. count = 2
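The two steps of this answer can be sketched as two separate lookups, one per table. The sample rows are illustrative: John teaching 220 and Jessie teaching 210 follow the chapter's examples, while the instructor "Mary" and the specific StudentIDs registered for 220 and 230 are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Course (CourseID INTEGER, Instructor TEXT)")
conn.execute("CREATE TABLE Registration (StudentID INTEGER, CourseID INTEGER)")
conn.executemany(
    "INSERT INTO Course VALUES (?, ?)",
    [(210, "Jessie"), (220, "John"), (230, "Mary")],  # "Mary" is hypothetical
)
conn.executemany(
    "INSERT INTO Registration VALUES (?, ?)",
    [(1, 210), (5, 210), (2, 210), (3, 210), (2, 220), (4, 220), (4, 230)],
)

# Step 1: find the CourseID taught by John in the Course table.
course_id = conn.execute(
    "SELECT CourseID FROM Course WHERE Instructor = ?", ("John",)
).fetchone()[0]

# Step 2: count the Registration rows that carry that CourseID.
count = conn.execute(
    "SELECT COUNT(*) FROM Registration WHERE CourseID = ?", (course_id,)
).fetchone()[0]
print(course_id, count)  # 220 2
```

The value found in step 1 is what links the two tables: CourseID is the shared column.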


Query Relational Database: Using Three Tables

Query 3: Who are the students taking the class taught by Jessie?

Student Table

Course Table

Registration Table

Answer:
Step 1. Check the CourseID taught by Jessie in the Course Table. CourseID = 210
Step 2. Get the StudentIDs taking the class with CourseID 210 in the Registration Table. StudentID 1, 5, 2, 3
Step 3. Get the student names in the Student Table with StudentID 1, 5, 2, 3. Bob, Lisa, Sarah, Jim
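The three steps can also be written as one query that joins all three tables on their shared columns. This is a sketch under assumptions: the mapping of StudentIDs to names (1 Bob, 2 Sarah, 3 Jim, 5 Lisa) is inferred from the order in which the answer lists them, and Kate's ID and the instructor "Mary" are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (StudentID INTEGER, StudentName TEXT)")
conn.execute("CREATE TABLE Course (CourseID INTEGER, Instructor TEXT)")
conn.execute("CREATE TABLE Registration (StudentID INTEGER, CourseID INTEGER)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [(1, "Bob"), (2, "Sarah"), (3, "Jim"), (4, "Kate"), (5, "Lisa")],  # assumed IDs
)
conn.executemany(
    "INSERT INTO Course VALUES (?, ?)",
    [(210, "Jessie"), (220, "John"), (230, "Mary")],  # "Mary" is hypothetical
)
conn.executemany(
    "INSERT INTO Registration VALUES (?, ?)",
    [(1, 210), (5, 210), (2, 210), (3, 210), (2, 220), (4, 220), (4, 230)],
)

# Course -> Registration via CourseID, Registration -> Student via StudentID.
rows = conn.execute(
    """
    SELECT Student.StudentName
    FROM Student
    JOIN Registration ON Student.StudentID = Registration.StudentID
    JOIN Course ON Course.CourseID = Registration.CourseID
    WHERE Course.Instructor = 'Jessie'
    ORDER BY Student.StudentID
    """
).fetchall()
names = [name for (name,) in rows]
print(names)  # ['Bob', 'Sarah', 'Jim', 'Lisa']
```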


Query Relational Database

Student Table

Course Table

Registration Table

In a relational database, to answer a query, tables are joined together using the value of the data in the shared columns


Query Relational Database: Structured Query Language (SQL)
• Structured Query Language (SQL) is an international standard for creating, processing, and querying databases and their tables

SELECT COUNT(StudentID)
FROM Course, Registration
WHERE Course.CourseID = Registration.CourseID
  AND Course.Instructor = 'John'

Query 2: How many students take the class taught by John?

Answer:
Step 1. Check the CourseID taught by John in the Course Table. CourseID = 220
Step 2. See how many students are taking the class with CourseID 220 in the Registration Table. count = 2
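To try the SQL statement above, it can be run against a small in-memory database with Python's built-in sqlite3 module. The rows below are illustrative assumptions: they give course 220 two registrations, matching the earlier two-table walkthrough of Query 2, and the instructor "Mary" is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Course (CourseID INTEGER, Instructor TEXT)")
conn.execute("CREATE TABLE Registration (StudentID INTEGER, CourseID INTEGER)")
conn.executemany(
    "INSERT INTO Course VALUES (?, ?)",
    [(210, "Jessie"), (220, "John"), (230, "Mary")],  # "Mary" is hypothetical
)
conn.executemany(
    "INSERT INTO Registration VALUES (?, ?)",
    [(1, 210), (5, 210), (2, 210), (3, 210), (2, 220), (4, 220), (4, 230)],
)

# The statement from the slide, with plain single quotes around the string.
# The WHERE clause pairs each Course row with its matching Registration rows.
query = """
SELECT COUNT(StudentID)
FROM Course, Registration
WHERE Course.CourseID = Registration.CourseID
  AND Course.Instructor = 'John'
"""
john_count = conn.execute(query).fetchone()[0]
print(john_count)  # 2
```

Note that the string literal needs straight quotes ('John'). Curly quotes pasted from a slide are not valid SQL.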


Sounds like More Work, Not Less
• A relational database is more complicated than a list
• However, a relational database minimizes data redundancy, preserves complex relationships among topics, and allows for partial data
• Furthermore, a relational database provides a solid foundation for user forms and reports


Key Points in This Chapter
• What is the problem with using a simple list to store the information?
  • Redundancy, modification issues
• What is the solution to replace a simple list?
  • Relational database
  • Break a simple long list into several tables; each table has its own theme
• How to query a relational database?
  • Join the tables back together by the value of the data in shared columns


Next
• I know splitting a simple list into multiple tables will reduce redundancy and avoid modification issues, but
  • How should we split a simple list?
  • Is there any rule we could follow to split the list?
  • Are there any criteria to know the tables are good enough?
• We will answer these questions in Chapter 2.


Questions?

Reminder: No labs this week!

Fill out the programming skill survey on Angel!