Relational Data Analysis II
Plan
• Introduction
• Structured Methods
  – Data Flow Modelling
  – Data Modelling
  – Relational Data Analysis
• Feasibility
• Maintenance
Definitions
• A relation corresponds to a table
• A tuple is a row in a table
• An attribute is a column in a table
• A Primary Key is the attribute (or combination of attributes) by which we uniquely identify each row
• The number of rows in a table is called the cardinality
• The number of attributes in a table is called the degree
Example Relation (Table)
Student

Student ID  Student Name        Course              Module Code  Module Name       Grade
1000001     Peter Stringfellow  BSc Basket Weaving  W1001        Flower Arranging  A
1001234     Terrence Halfwit    BA Surfing Studies  S2003        Hazardous Fishes  B
1234567     Big John            BSc Business        B3333        Selling Stuff     E
1234567     Big John            BSc Business        B3334        Buying Stuff      A
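
As a rough illustration of the definitions, the Student relation can be held as a tuple of attribute names plus a list of row tuples. The variable names are assumptions made for this sketch; the data is copied from the slide.

```python
# The Student relation from the slide: a header plus one tuple per row.
attributes = ("Student ID", "Student Name", "Course",
              "Module Code", "Module Name", "Grade")
tuples = [
    ("1000001", "Peter Stringfellow", "BSc Basket Weaving", "W1001", "Flower Arranging", "A"),
    ("1001234", "Terrence Halfwit", "BA Surfing Studies", "S2003", "Hazardous Fishes", "B"),
    ("1234567", "Big John", "BSc Business", "B3333", "Selling Stuff", "E"),
    ("1234567", "Big John", "BSc Business", "B3334", "Buying Stuff", "A"),
]

cardinality = len(tuples)    # number of rows
degree = len(attributes)     # number of attributes (columns)
print(cardinality, degree)   # prints: 4 6
```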
Example Relation (Table)
• The table can also be described without its data as follows:
  Student (Student ID, Student Name, Course, Module Code, Module Name, Grade)
• Or as a list of attributes:
  Student ID
  Student Name
  Course
  Module Code
  Module Name
  Grade
Rules
• No two rows in a table are identical
  – i.e. there are no duplicate tuples/rows
• Every relation has a Primary Key attribute
• Each tuple has a primary key value
• The sequence of the rows should not be significant
• The sequence of the columns should not be significant
• Each attribute must have a unique name
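
Some of these rules can be checked mechanically. The sketch below uses the Student example data from the earlier slide; the function names are hypothetical helpers, not part of any library. It suggests that Student ID alone does not identify a row in that table, whereas Student ID combined with Module Code does.

```python
ATTRIBUTES = ("Student ID", "Student Name", "Course",
              "Module Code", "Module Name", "Grade")
ROWS = [
    ("1000001", "Peter Stringfellow", "BSc Basket Weaving", "W1001", "Flower Arranging", "A"),
    ("1001234", "Terrence Halfwit", "BA Surfing Studies", "S2003", "Hazardous Fishes", "B"),
    ("1234567", "Big John", "BSc Business", "B3333", "Selling Stuff", "E"),
    ("1234567", "Big John", "BSc Business", "B3334", "Buying Stuff", "A"),
]

def has_duplicate_rows(rows):
    """Rule: no two rows in a table are identical."""
    return len(set(rows)) != len(rows)

def has_unique_attribute_names(attributes):
    """Rule: each attribute must have a unique name."""
    return len(set(attributes)) == len(attributes)

def is_unique_key(attributes, rows, key):
    """True if the chosen key attributes take a different value in every row."""
    positions = [attributes.index(name) for name in key]
    key_values = [tuple(row[i] for i in positions) for row in rows]
    return len(set(key_values)) == len(key_values)

no_duplicates = not has_duplicate_rows(ROWS)
names_ok = has_unique_attribute_names(ATTRIBUTES)
single_key_ok = is_unique_key(ATTRIBUTES, ROWS, ("Student ID",))
compound_key_ok = is_unique_key(ATTRIBUTES, ROWS, ("Student ID", "Module Code"))
```

A check like this only shows that a key is unique in the rows we happen to have; choosing a primary key is still a decision about what the data means.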
Problems with Tables
• Problems with tables can be classified into three groups:
  – Insert Anomalies: problems caused when inserting new information
  – Update Anomalies: problems caused when updating existing data
  – Delete Anomalies: problems caused when deleting data
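
To make two of these anomalies concrete, the following sketch works on the single Student table from the earlier example. The replacement course name "BSc Commerce" is invented for illustration.

```python
rows = [
    {"Student ID": "1000001", "Student Name": "Peter Stringfellow",
     "Course": "BSc Basket Weaving", "Module Code": "W1001",
     "Module Name": "Flower Arranging", "Grade": "A"},
    {"Student ID": "1001234", "Student Name": "Terrence Halfwit",
     "Course": "BA Surfing Studies", "Module Code": "S2003",
     "Module Name": "Hazardous Fishes", "Grade": "B"},
    {"Student ID": "1234567", "Student Name": "Big John",
     "Course": "BSc Business", "Module Code": "B3333",
     "Module Name": "Selling Stuff", "Grade": "E"},
    {"Student ID": "1234567", "Student Name": "Big John",
     "Course": "BSc Business", "Module Code": "B3334",
     "Module Name": "Buying Stuff", "Grade": "A"},
]

# Delete anomaly: removing a student also removes the only record of a module.
remaining = [r for r in rows if r["Student ID"] != "1000001"]
modules_before = {r["Module Code"] for r in rows}
modules_after = {r["Module Code"] for r in remaining}
lost_modules = modules_before - modules_after

# Update anomaly: changing the course in only one of Big John's rows
# leaves the table contradicting itself.
updated = [dict(r) for r in rows]
updated[2]["Course"] = "BSc Commerce"  # hypothetical new course name
courses_for_big_john = {r["Course"] for r in updated
                        if r["Student ID"] == "1234567"}
```

After the delete, module W1001 no longer appears anywhere; after the partial update, the same student is recorded against two different courses.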
The Solution?
• To remove these anomalies we must re-arrange the data and create new tables
• The process for doing this is called Normalisation
First Normal Form
• All data in a table must be dependent on the key
• In order to do this we must remove “repeating groups”
• This is done by analysing the relationship between the primary key and the rest of the data
Example 1 - Students
• Student ID
• Student Name
• Course
• Course ID
• Module Code
• Module Name
• Grade

• Attributes are moved if there is more than one value for each instance of the primary key
• For each Student ID, how many Student Names are there: 1 or many?
  – One, so Student Name stays where it is
• For each Student ID, how many Module Codes are there?
  – Many, so Module Code is indented, along with Module Name and Grade
Example 1 - Students
• Student ID
• Student Name
• Course
• Course ID
  – Module Code
  – Module Name
  – Grade

• Indented data is a repeating group
• We need to put it into a new table
• This table will describe the module a student is taking
• We will call it Student Module
Example 1 - Students
• Student ID
• Student Name
• Course
• Course ID

• Student ID
• Module Code
• Module Name
• Grade

• We now have two tables
• Student details
  – Primary Key = Student ID
• Student’s module details
  – PK = Student ID, Module Code
  – Called a compound key
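
The split into two tables can be sketched as two projections of the original rows. This is a minimal illustration using the Student example data from the earlier slide (which has no Course ID column); `project` is a hypothetical helper written for this sketch.

```python
ATTRIBUTES = ("Student ID", "Student Name", "Course",
              "Module Code", "Module Name", "Grade")
ROWS = [
    ("1000001", "Peter Stringfellow", "BSc Basket Weaving", "W1001", "Flower Arranging", "A"),
    ("1001234", "Terrence Halfwit", "BA Surfing Studies", "S2003", "Hazardous Fishes", "B"),
    ("1234567", "Big John", "BSc Business", "B3333", "Selling Stuff", "E"),
    ("1234567", "Big John", "BSc Business", "B3334", "Buying Stuff", "A"),
]

def project(attributes, rows, wanted):
    """Keep only the wanted columns and drop duplicate rows, preserving order."""
    positions = [attributes.index(name) for name in wanted]
    result = []
    for row in rows:
        picked = tuple(row[i] for i in positions)
        if picked not in result:
            result.append(picked)
    return result

# Student details: one row per student, keyed on Student ID.
student = project(ATTRIBUTES, ROWS, ("Student ID", "Student Name", "Course"))

# Student Module: the repeating group, keyed on Student ID + Module Code.
student_module = project(
    ATTRIBUTES, ROWS, ("Student ID", "Module Code", "Module Name", "Grade"))
```

Big John's details now appear once in `student`, while `student_module` keeps one row for each module he takes.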
Yes… But… No… But…
• There are still Anomalies…
• Update
  – Cannot change a module name without finding all students on it
• Insert
  – Cannot add a new module unless we have a student enrolled
• Delete
  – When a student leaves we could lose course information
• Further Normalisation is therefore required…
Example 2 - Library
• Student ID
• Name
• Faculty
• Book ID
• Title
• Author
• Return Date

• Put this data into First Normal Form
Example 2 - Library
• Student ID
• Name
• Faculty
  – Book ID
  – Title
  – Author
  – Return Date

• Identify the repeating group
Example 2 - Library
• Student ID
• Name
• Faculty

• Student ID
• Book ID
• Title
• Author
• Return Date

• Create a new table
• Remember to include the original primary key in the key of the new table
• This maintains the relationship between the two tables
Example 3
• Customer ID
• Customer Name
• Address
• Branch No
• Branch Manager
• Stock ID
• Title
• Format

• Put this data into First Normal Form
Example 3
• Customer ID
• Customer Name
• Address
• Branch No
• Branch Manager
  – Stock ID
  – Title
  – Format

• Identify the repeating group
Example 3 – Borrowing Videos
• Customer ID
• Customer Name
• Address
• Branch No
• Branch Manager

• Customer ID
• Stock ID
• Title
• Format

• Create a new table
Remember
• 1NF can be considered as Normalised
• But it doesn’t solve all of our problems
Second Normal Form
• Only Applies to tables with compound keys
• Data in a table must depend on the whole key
• We must remove any partial dependencies
Example 1 – Students (1NF)
• Student ID
• Student Name
• Course
• Course ID

• Student ID
• Module Code
• Module Name
• Grade

• The first table is already in 2NF as it does not have a compound key
• The second table may not be in 2NF
• We need to analyse the relationship between its attributes and the key
Example 1 – 2NF
• Student ID
• Module Code
• Module Name
• Grade

• Examine the attribute ‘Module Name’
• If we removed the Student ID would we expect the module name to remain in our system? Yes
  – This tells us that Module Name IS NOT dependent on Student ID
• If we removed the Module Code would we expect the module name to remain in our system? No
  – This tells us that Module Name IS dependent on Module Code
• Module Name is therefore dependent on only PART of the primary key
• This is called a partial dependency and must be removed
Example 1 – 2NF
• Student ID
• Module Code
  – Module Name
• Grade

• Examine the attribute ‘Grade’
• Is it dependent on Student ID? Yes
• Is it dependent on Module Code? Yes
• There is no partial dependency so it stays in this table
Example 1 – 2NF
• Student ID
• Module Code
• Grade

• Module Code
• Module Name

• Module Name must be removed to a new table
• We need to give it a primary key
• This will be the part of the key on which it is dependent
• The data is now in 2NF
Example 1 – 2NF
• Student ID
• Student Name
• Course
• Course ID

• Student ID
• Module Code
• Grade

• Module Code
• Module Name
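
The questions asked about Module Name and Grade can be approximated by a check over sample rows. The sketch below is only that: `determines` is a hypothetical helper, and the last row is invented so that more than one student takes the same module. A check on data can rule a dependency out, but it cannot prove one; that still depends on what the attributes mean.

```python
def determines(rows, lhs, rhs):
    """True if, in these rows, equal lhs values always come with equal rhs values."""
    seen = {}
    for row in rows:
        left = tuple(row[name] for name in lhs)
        right = tuple(row[name] for name in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True

student_module = [
    {"Student ID": "1000001", "Module Code": "W1001", "Module Name": "Flower Arranging", "Grade": "A"},
    {"Student ID": "1001234", "Module Code": "S2003", "Module Name": "Hazardous Fishes", "Grade": "B"},
    {"Student ID": "1234567", "Module Code": "B3333", "Module Name": "Selling Stuff", "Grade": "E"},
    {"Student ID": "1234567", "Module Code": "B3334", "Module Name": "Buying Stuff", "Grade": "A"},
    # Invented row: a second student taking B3333 with a different grade.
    {"Student ID": "1000001", "Module Code": "B3333", "Module Name": "Selling Stuff", "Grade": "C"},
]

key = ["Student ID", "Module Code"]
non_key = ["Module Name", "Grade"]

# For each non-key attribute, list the single key parts that determine it alone.
partial = {
    attr: [part for part in key if determines(student_module, [part], [attr])]
    for attr in non_key
}
print(partial)  # prints: {'Module Name': ['Module Code'], 'Grade': []}
```

Module Name is determined by Module Code alone, so it is a partial dependency; Grade needs both parts of the key, so it stays.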
Example 2
• Student ID
• Name
• Faculty

• Student ID
• Book ID
• Title
• Author
• Return Date

• Take this to 2NF
Example 2 - 2NF
• Student ID
• Name
• Faculty

• Student ID
• Book ID
• Return Date

• Book ID
• Title
• Author
Example 3
• Customer ID
• Customer Name
• Address
• Branch No
• Branch Manager

• Customer ID
• Stock ID
• Title
• Format

• Take this to 2NF
Example 3 - 2NF
• Customer ID
• Customer Name
• Address
• Branch No
• Branch Manager

• Customer ID
• Stock ID

• Stock ID
• Title
• Format
Example 1 - Anomalies
• There are still problems with Example 1
• Insert
  – Still cannot add a course unless there are students taking it
• Update
  – Cannot update course name without finding all students on the course
• Delete
  – If we delete a student then we could also lose course information
Third Normal Form
• Applies to all tables
• Data in a table must depend on Nothing but the Key
• We must remove any non-key dependencies
Example 1 – 3NF
• Student ID
• Student Name
• Course
• Course ID

• Student ID
• Module Code
• Grade

• Module Code
• Module Name

• Problems all seem to affect Course data
Example 1 – 3NF
• Student ID
• Student Name
• Course
• Course ID

• Examine the attribute “Student Name”
• If we removed the Student ID would we expect the Student Name to remain in our system? No
  – Therefore “Student Name” is dependent on Student ID and is in the correct table
• Examine the attribute “Course”, which is the name of a course
• If we removed the Student ID would we expect the Course to remain in our system? Yes
  – Therefore “Course” IS NOT dependent on Student ID and must be moved
Example 1 – 3NF
• Student ID
• Student Name
• Course ID

• Course ID
• Course

• The new table needs an appropriate Primary Key
• Course ID is the logical option
• Examine the attribute “Course ID”
• If we removed the Student ID would we expect the student’s Course ID to remain in our system? No
  – Course ID is dependent on Student ID so must remain in the existing table
  – It acts as a link between student and course
Example 1 – 3NF
• Check the remaining tables to ensure they are in 3NF
• Student ID
• Student Name
• Course ID

• Course ID
• Course

• Student ID
• Module Code
• Grade

• Module Code
• Module Name
Third Normal Form
• The Data is now in 3NF
• Data in a table must depend on
  – The Key
  – The Whole Key
  – And Nothing but the Key
• The anomalies identified in Example 1 have now been removed
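
One way to see the effect is to build the four Example 1 tables in an in-memory SQLite database and retry the operations that caused trouble earlier. This is a sketch under assumptions: the table and column names, and the Course ID value 'C3', are invented here and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
CREATE TABLE course (
    course_id TEXT PRIMARY KEY,
    course    TEXT NOT NULL
);
CREATE TABLE student (
    student_id   TEXT PRIMARY KEY,
    student_name TEXT NOT NULL,
    course_id    TEXT NOT NULL REFERENCES course(course_id)
);
CREATE TABLE module (
    module_code TEXT PRIMARY KEY,
    module_name TEXT NOT NULL
);
CREATE TABLE student_module (
    student_id  TEXT NOT NULL REFERENCES student(student_id),
    module_code TEXT NOT NULL REFERENCES module(module_code),
    grade       TEXT,
    PRIMARY KEY (student_id, module_code)   -- compound key
);
""")

# Insert: a course and a module can exist before any student takes them.
conn.execute("INSERT INTO course VALUES ('C3', 'BSc Business')")  # 'C3' is made up
conn.execute("INSERT INTO module VALUES ('B3333', 'Selling Stuff')")
conn.execute("INSERT INTO student VALUES ('1234567', 'Big John', 'C3')")
conn.execute("INSERT INTO student_module VALUES ('1234567', 'B3333', 'E')")

# Update: the module name is stored once, so a single UPDATE is enough.
conn.execute(
    "UPDATE module SET module_name = 'Selling Things' WHERE module_code = 'B3333'")

# Delete: removing the student keeps the course and module information.
conn.execute("DELETE FROM student_module WHERE student_id = '1234567'")
conn.execute("DELETE FROM student WHERE student_id = '1234567'")

course_count = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
module_names = [row[0] for row in conn.execute("SELECT module_name FROM module")]
student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

After the student is deleted, the course and the renamed module are both still present.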
Example 2 - 2NF
• Student ID
• Name
• Faculty

• Student ID
• Book ID
• Return Date

• Book ID
• Title
• Author

• Take this to 3NF
Example 2 - 3NF
• Student ID
• Name
• Faculty

• Student ID
• Book ID
• Return Date

• Book ID
• Title
• Author

• Already in 3NF
Example 3 - 2NF
• Customer ID
• Customer Name
• Address
• Branch No
• Branch Manager

• Customer ID
• Stock ID

• Stock ID
• Title
• Format

• Take this to 3NF
Example 3 - 3NF
• Customer ID
• Customer Name
• Address
• Branch No

• Branch No
• Branch Manager

• Customer ID
• Stock ID

• Stock ID
• Title
• Format
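
As a rough check that moving Branch Manager into its own table loses nothing, the sketch below splits some customer rows and joins them back on Branch No. All the data values are invented, since the slides give only attribute names, and `project` is a hypothetical helper.

```python
# Invented customer rows in the 2NF layout (Branch Manager still in the table).
customers = [
    {"Customer ID": "C1", "Customer Name": "Ann", "Address": "1 High St",
     "Branch No": "B1", "Branch Manager": "Mo"},
    {"Customer ID": "C2", "Customer Name": "Bob", "Address": "2 Low Rd",
     "Branch No": "B1", "Branch Manager": "Mo"},
    {"Customer ID": "C3", "Customer Name": "Cy", "Address": "3 Mid Ln",
     "Branch No": "B2", "Branch Manager": "Di"},
]

def project(rows, names):
    """Keep only the named attributes and drop duplicate rows, preserving order."""
    result = []
    for row in rows:
        picked = {name: row[name] for name in names}
        if picked not in result:
            result.append(picked)
    return result

# 3NF split: Branch Manager depends on Branch No, not on Customer ID.
customer = project(customers, ["Customer ID", "Customer Name", "Address", "Branch No"])
branch = project(customers, ["Branch No", "Branch Manager"])

# Join the two tables back together on Branch No.
rejoined = [
    {**c, **b}
    for c in customer
    for b in branch
    if c["Branch No"] == b["Branch No"]
]
```

Each branch manager is now stored once in `branch`, and `rejoined` contains the same rows as the original `customers` list.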