Database Systems – Set Theory
RELATIONS
A relational database consists of tables, each of which is assigned a unique name.
A row in a table represents a relationship among a set of values.
A table is a collection of such relationships.
Column headers are commonly referred to as attributes.
Websites-Schema = (website, organization, first-year, category)
websites relation:
website                      organization        first-year  category
www.zojjed.com               Walking Promotions  2006        Fiction
www.racewalk.com             Walking Promotions  1995        Health
www.greattreks.com           Walking Promotions  2006        Travel
www.twofeetgallery.com       Walking Promotions  2004        Photographs
www.walkinghealthy.com       Walking Promotions  2002        Health
www.cs.drexel.edu/~jsalvage  Drexel University   2005        Education
DOMAINS
A domain is the set of permitted values for a column (attribute).
The domain can be any positive number, as is the case with first-year.
The domain can be a series of letters up to a maximum length, as is the case with organization.
The domain can be the set of valid web addresses, whose rules might be slightly more complicated.
If
D1 denotes the set of all websites
D2 denotes the set of all organizations
D3 denotes the set of all first years
D4 denotes the set of all categories
then any row of websites must be a 4-tuple (v1, v2, v3, v4) where
v1 is a website in the domain D1
v2 is an organization in the domain D2
v3 is a year in the domain D3
v4 is a category in the domain D4
Therefore, websites is a subset of D1 × D2 × D3 × D4.
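The subset claim above can be checked on a small scale. The following is a minimal sketch, assuming tiny made-up domains (they are not the real domains of the websites relation) and using Python sets of tuples to stand in for relations.

```python
from itertools import product

# Tiny, made-up domains (assumptions for illustration only).
D1 = {"www.zojjed.com", "www.racewalk.com"}   # websites
D2 = {"Walking Promotions"}                    # organizations
D3 = {1995, 2006}                              # first years
D4 = {"Fiction", "Health"}                     # categories

# The Cartesian product D1 x D2 x D3 x D4: every possible 4-tuple.
all_tuples = set(product(D1, D2, D3, D4))

# A relation is any subset of that product.
websites = {
    ("www.zojjed.com", "Walking Promotions", 2006, "Fiction"),
    ("www.racewalk.com", "Walking Promotions", 1995, "Health"),
}

print(len(all_tuples))         # 2 * 1 * 2 * 2 = 8 possible tuples
print(websites <= all_tuples)  # True: websites is a subset of the product
```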
DOMAINS
In general, a table must be a subset of D1 × D2 × … × Dn-1 × Dn.
Tables vs. Relations
There is a close correspondence between this mathematical language and the terminology used in databases.
Instead of position numbers, databases use names to identify attributes.
relation -> table
tuple -> row
The websites table has 6 tuples.
TUPLE NOTATION
If t is a variable denoting the first tuple of the websites relation, then t[website] denotes the value of t on the website attribute.
t[website] = "www.zojjed.com"
t[organization] = "Walking Promotions"
t[first-year] = 2006
t[category] = "Fiction"
Alternatively, by position:
t[1] = "www.zojjed.com"
t[2] = "Walking Promotions"
t[3] = 2006
t[4] = "Fiction"
t ∈ r indicates that the tuple t is in the relation r.
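The two ways of addressing a tuple's values can be sketched in Python. This is only an illustration: the helper attr and the constant ATTRIBUTES are hypothetical names introduced here, and Python counts positions from 0 where the notation above counts from 1.

```python
# Attribute names in schema order (taken from Websites-Schema).
ATTRIBUTES = ("website", "organization", "first-year", "category")

t = ("www.zojjed.com", "Walking Promotions", 2006, "Fiction")
r = {t, ("www.racewalk.com", "Walking Promotions", 1995, "Health")}

def attr(tup, name):
    """Return the value of tup on the named attribute, like t[website]."""
    return tup[ATTRIBUTES.index(name)]

print(attr(t, "website"))     # www.zojjed.com
print(attr(t, "first-year"))  # 2006
print(t[0])                   # positional access; t[1] in the notation above
print(t in r)                 # True, i.e. t ∈ r
```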
DOMAINS
It is possible for several attributes to have the same domain.
Later we will introduce a customers relation with a customer name attribute. If we also had an employee table with an employee name field, technically both attributes would have the same domain.
It depends on how you look at it: if the domain is the set of all possible names, this is true.
What about the domains of website and first-year? They are incompatible.
What about website and category? While they both may allow the "same" values, I would consider them distinct domains.
An attribute may also contain the value Null.
For now we will assume that none do.
DATABASE SCHEMA
Logical design of the database
Analogous to the type definition of a variable
DATABASE INSTANCE
Snapshot of the database at a given time
Analogous to an instance of a variable
A relation schema is written with a capitalized name for the schema and a lowercase name for each attribute. An instance of a relation is written with a lowercase name.
Websites-schema = (website, organization, first-year, category)
A relation on Websites-schema is declared as follows:
websites(Websites-schema)
Side notes, very important:
A relation has no order.
A relation cannot contain duplicate tuples.
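Both side notes follow from treating a relation as a set. A minimal sketch, assuming Python sets of tuples as the stand-in for relations (the two sample tuples are shortened to two attributes for brevity):

```python
a = ("www.zojjed.com", "Fiction")
b = ("www.racewalk.com", "Health")

# The same tuples listed in a different order are still the same relation.
r1 = {a, b}
r2 = {b, a}
print(r1 == r2)  # True

# Adding a tuple that is already present changes nothing.
r3 = r1 | {a}
print(len(r3))   # 2
```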
Customers-Schema = (website, first-name, last-name)
customers relation:
website                      first-name  last-name
www.zojjed.com               Derek       Jeter
www.zojjed.com               Chase       Utley
www.cs.drexel.edu/~jsalvage  Jeremy      Johnson
www.racewalk.com             Ryan        Howard
www.zojjed.com               Ryan        Howard
Notice that the website attribute appears in both the customers relation and the websites relation.
This is not a coincidence; attributes are often repeated across relations.
This allows distinct relations to be related.
If we wanted to gather website information for every website that has customers, we would need information from both relations.
Combined information from the websites and customers relations:
website                      category   first-name  last-name
www.zojjed.com               Fiction    Derek       Jeter
www.zojjed.com               Fiction    Chase       Utley
www.cs.drexel.edu/~jsalvage  Education  Jeremy      Johnson
www.racewalk.com             Health     Ryan        Howard
www.zojjed.com               Fiction    Ryan        Howard
In real databases, unique id fields would be used to identify the customer and the website, so the website name would not be repeated.
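One way to picture how the combined table is produced is to pair every websites tuple with every customers tuple and keep the pairs whose website values match. The following is a rough sketch under that assumption, using Python sets of tuples; it is not how a database engine would actually evaluate the query.

```python
websites = {
    ("www.zojjed.com", "Walking Promotions", 2006, "Fiction"),
    ("www.racewalk.com", "Walking Promotions", 1995, "Health"),
    ("www.greattreks.com", "Walking Promotions", 2006, "Travel"),
    ("www.twofeetgallery.com", "Walking Promotions", 2004, "Photographs"),
    ("www.walkinghealthy.com", "Walking Promotions", 2002, "Health"),
    ("www.cs.drexel.edu/~jsalvage", "Drexel University", 2005, "Education"),
}
customers = {
    ("www.zojjed.com", "Derek", "Jeter"),
    ("www.zojjed.com", "Chase", "Utley"),
    ("www.cs.drexel.edu/~jsalvage", "Jeremy", "Johnson"),
    ("www.racewalk.com", "Ryan", "Howard"),
    ("www.zojjed.com", "Ryan", "Howard"),
}

# Match tuples on the shared website attribute and keep four attributes.
combined = {
    (w_site, category, first, last)
    for (w_site, _org, _year, category) in websites
    for (c_site, first, last) in customers
    if w_site == c_site
}

for row in sorted(combined):
    print(row)
print(len(combined))  # 5
```

Websites without customers, such as www.greattreks.com, do not appear in the result because no customers tuple matches them.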
Instead of having two schemas, it is possible to have one schema as follows:
WebsiteCustomers(website, organization, first-year, category, first-name, last-name)
What is wrong with this?
Redundant data as well as null fields.
website                      organization        first-year  category     first-name  last-name
www.zojjed.com               Walking Promotions  2006        Fiction      Derek       Jeter
www.racewalk.com             Walking Promotions  1995        Health       Ryan        Howard
www.greattreks.com           Walking Promotions  2006        Travel       Null        Null
www.twofeetgallery.com       Walking Promotions  2004        Photographs  Null        Null
www.walkinghealthy.com       Walking Promotions  2002        Health       Null        Null
www.zojjed.com               Walking Promotions  2006        Fiction      Ryan        Howard
www.zojjed.com               Walking Promotions  2006        Fiction      Chase       Utley
www.cs.drexel.edu/~jsalvage  Drexel University   2005        Education    Jeremy      Johnson
hit-counts-Schema = (website, date, hit-count)
hit-counts relation:
website                      date       hit-count
www.zojjed.com               5/20/2007  5
www.racewalk.com             5/20/2007  2019
www.greattreks.com           5/20/2007  1050
www.twofeetgallery.com       5/20/2007  32
www.walkinghealthy.com       5/20/2007  159
www.zojjed.com               5/21/2007  6
www.zojjed.com               5/22/2007  5
www.cs.drexel.edu/~jsalvage  5/20/2007  376
www.racewalk.com             5/21/2007  2099
Is there anything wrong with the above relation?
No. There is no reason why a website cannot be listed more than once, as long as each tuple as a whole is distinct.
If we did not care about the date and only cared about the hit count, could we define the hit-counts schema as follows?
hit-counts-Schema = (website, hit-count)
hit-counts relation:
website                      hit-count
www.zojjed.com               5
www.racewalk.com             2019
www.greattreks.com           1050
www.twofeetgallery.com       32
www.walkinghealthy.com       159
www.zojjed.com               6
www.zojjed.com               5
www.cs.drexel.edu/~jsalvage  376
www.racewalk.com             2099
In a real database system there would be no problem, but we said that a relation cannot contain duplicate tuples, and here the tuple (www.zojjed.com, 5) would appear twice. So the answer is no.
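The effect of dropping the date can be seen by modelling the relation as a Python set of tuples. This is a small sketch under that assumption: once the date is removed, the two www.zojjed.com tuples with a hit count of 5 become identical and only one survives.

```python
hit_counts = {
    ("www.zojjed.com", "5/20/2007", 5),
    ("www.racewalk.com", "5/20/2007", 2019),
    ("www.greattreks.com", "5/20/2007", 1050),
    ("www.twofeetgallery.com", "5/20/2007", 32),
    ("www.walkinghealthy.com", "5/20/2007", 159),
    ("www.zojjed.com", "5/21/2007", 6),
    ("www.zojjed.com", "5/22/2007", 5),
    ("www.cs.drexel.edu/~jsalvage", "5/20/2007", 376),
    ("www.racewalk.com", "5/21/2007", 2099),
}

# Keep only website and hit-count; the set drops the repeated tuple.
without_date = {(site, count) for (site, _date, count) in hit_counts}

print(len(hit_counts))    # 9
print(len(without_date))  # 8
```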
QUERY LANGUAGES
A query language is a language in which the user requests information from the database.
It can be procedural or non-procedural.
We will study Relational Algebra.
It is a procedural language consisting of a set of operations that take one or two relations as input and output a relation. Operations include:
• select
• project
• union
• set difference
• Cartesian product
• rename
• intersection
• aggregate functions
We will also study various forms of joining relations.
Unary: operates on one relation
Binary: operates on a pair of relations
The Select Operation
Unary operation
Selects tuples that satisfy a given predicate
σ (sigma) represents the select operation
σ_{<selection condition>}(R)
<selection condition> = <attribute name> <comparison op> <constant value> or
<selection condition> = <attribute name> <comparison op> <attribute name>
Comparison operators are: =, <>, <, <=, >, >=
(equal, not equal, less than, less than or equal to, greater than, greater than or equal to)
To select those tuples of the hit-counts relation where the website is "www.zojjed.com", we write:
σ_{website = "www.zojjed.com"}(hit-counts)
This returns the relation:
website         date       hit-count
www.zojjed.com  5/20/2007  5
www.zojjed.com  5/21/2007  6
www.zojjed.com  5/22/2007  5
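The select operation can be sketched as a filter over a set of tuples. The function name select and the use of a Python lambda as the predicate are choices made for this illustration, not part of relational algebra itself.

```python
hit_counts = {
    ("www.zojjed.com", "5/20/2007", 5),
    ("www.racewalk.com", "5/20/2007", 2019),
    ("www.greattreks.com", "5/20/2007", 1050),
    ("www.twofeetgallery.com", "5/20/2007", 32),
    ("www.walkinghealthy.com", "5/20/2007", 159),
    ("www.zojjed.com", "5/21/2007", 6),
    ("www.zojjed.com", "5/22/2007", 5),
    ("www.cs.drexel.edu/~jsalvage", "5/20/2007", 376),
    ("www.racewalk.com", "5/21/2007", 2099),
}

def select(relation, predicate):
    """Return the tuples of relation that satisfy predicate (sigma)."""
    return {t for t in relation if predicate(t)}

# sigma_{website = "www.zojjed.com"}(hit-counts); position 0 is website.
zojjed = select(hit_counts, lambda t: t[0] == "www.zojjed.com")

for row in sorted(zojjed, key=lambda t: t[1]):
    print(row)
print(len(zojjed))  # 3
```

The result has the same attributes as the input; only the number of tuples changes.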
To select those tuples of the hit-counts relation where the hit-count is greater than 1000, we write:
σ_{hit-count > 1000}(hit-counts)
This returns the relation:
website             date       hit-count
www.racewalk.com    5/20/2007  2019
www.greattreks.com  5/20/2007  1050
www.racewalk.com    5/21/2007  2099
Predicates can be combined with and, or, and not.
To select those tuples of the hit-counts relation where the hit-count is greater than 5 and the website is www.zojjed.com, we write:
σ_{hit-count > 5 and website = "www.zojjed.com"}(hit-counts)
This returns the relation:
website         date       hit-count
www.zojjed.com  5/21/2007  6
Tuples satisfying website = "www.zojjed.com":
website         date       hit-count
www.zojjed.com  5/20/2007  5
www.zojjed.com  5/21/2007  6
www.zojjed.com  5/22/2007  5
Tuples satisfying hit-count > 5:
website                      date       hit-count
www.racewalk.com             5/20/2007  2019
www.greattreks.com           5/20/2007  1050
www.twofeetgallery.com       5/20/2007  32
www.walkinghealthy.com       5/20/2007  159
www.zojjed.com               5/21/2007  6
www.cs.drexel.edu/~jsalvage  5/20/2007  376
www.racewalk.com             5/21/2007  2099
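The two intermediate tables suggest a way to read "and": a tuple is kept only if it appears in both. A minimal sketch, assuming Python sets of tuples for relations, which checks that selecting with the combined predicate gives the same tuples as intersecting the two separate selections:

```python
hit_counts = {
    ("www.zojjed.com", "5/20/2007", 5),
    ("www.racewalk.com", "5/20/2007", 2019),
    ("www.greattreks.com", "5/20/2007", 1050),
    ("www.twofeetgallery.com", "5/20/2007", 32),
    ("www.walkinghealthy.com", "5/20/2007", 159),
    ("www.zojjed.com", "5/21/2007", 6),
    ("www.zojjed.com", "5/22/2007", 5),
    ("www.cs.drexel.edu/~jsalvage", "5/20/2007", 376),
    ("www.racewalk.com", "5/21/2007", 2099),
}

# Each predicate on its own (position 0 is website, position 2 is hit-count).
zojjed = {t for t in hit_counts if t[0] == "www.zojjed.com"}
over_5 = {t for t in hit_counts if t[2] > 5}

# The combined predicate.
both = {t for t in hit_counts if t[2] > 5 and t[0] == "www.zojjed.com"}

print(len(zojjed), len(over_5), len(both))  # 3 7 1
print(both == zojjed & over_5)              # True
```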
The Project Operation
Unary operation
Returns its argument relation with only the listed attributes; all other attributes are left out
Duplicates are removed
π (pi) represents the project operation
π_{<attribute list>}(R)
π_{website, category}(websites)
website                      category
www.zojjed.com               Fiction
www.racewalk.com             Health
www.greattreks.com           Travel
www.twofeetgallery.com       Photographs
www.walkinghealthy.com       Health
www.cs.drexel.edu/~jsalvage  Education
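Projection can be sketched as rebuilding each tuple from a chosen list of positions. The function project and the constant ATTRIBUTES are hypothetical names for this illustration. The second call, which projects on category alone, is an added example showing duplicate removal, since two websites share the category Health.

```python
ATTRIBUTES = ("website", "organization", "first-year", "category")

websites = {
    ("www.zojjed.com", "Walking Promotions", 2006, "Fiction"),
    ("www.racewalk.com", "Walking Promotions", 1995, "Health"),
    ("www.greattreks.com", "Walking Promotions", 2006, "Travel"),
    ("www.twofeetgallery.com", "Walking Promotions", 2004, "Photographs"),
    ("www.walkinghealthy.com", "Walking Promotions", 2002, "Health"),
    ("www.cs.drexel.edu/~jsalvage", "Drexel University", 2005, "Education"),
}

def project(relation, names):
    """Keep only the named attributes of every tuple (pi)."""
    positions = [ATTRIBUTES.index(name) for name in names]
    return {tuple(t[i] for i in positions) for t in relation}

site_category = project(websites, ["website", "category"])
categories = project(websites, ["category"])

print(len(site_category))  # 6: every website is distinct
print(len(categories))     # 5: the two Health tuples collapse into one
```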
Composition of Relational Operations
Often we need to combine operations. Often we wish to select a set of tuples and limit the relation returned to a few attributes.
What if we want to find only the websites that have had more than 1000 hits in a given day?
First we must find which tuples have hit counts greater than 1000.
We can accomplish this with the following relational query:
σ_{hit-count > 1000}(hit-counts)
By using the project operation we can remove the extra attributes, hit-count and date, and return only the values in the website column.
π_{website}(σ_{hit-count > 1000}(hit-counts))
What is the relation that is returned?
The relation returned is:
website
www.racewalk.com
www.greattreks.com
www.racewalk.com appears only once, even though it exceeded 1000 hits on two days, because projection removes duplicates.
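The composed query can be traced step by step in Python. This is a sketch assuming sets of tuples for relations; the inner comprehension plays the role of the select and the outer one the role of the project.

```python
hit_counts = {
    ("www.zojjed.com", "5/20/2007", 5),
    ("www.racewalk.com", "5/20/2007", 2019),
    ("www.greattreks.com", "5/20/2007", 1050),
    ("www.twofeetgallery.com", "5/20/2007", 32),
    ("www.walkinghealthy.com", "5/20/2007", 159),
    ("www.zojjed.com", "5/21/2007", 6),
    ("www.zojjed.com", "5/22/2007", 5),
    ("www.cs.drexel.edu/~jsalvage", "5/20/2007", 376),
    ("www.racewalk.com", "5/21/2007", 2099),
}

# Step 1, select: tuples with hit-count > 1000.
over_1000 = {t for t in hit_counts if t[2] > 1000}

# Step 2, project: keep only the website attribute (as 1-tuples).
busy_sites = {(t[0],) for t in over_1000}

print(len(over_1000))      # 3 tuples pass the selection
print(sorted(busy_sites))  # [('www.greattreks.com',), ('www.racewalk.com',)]
```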
Union Operator
Binary operation
∪ represents the union operator
It is often useful to combine the results of queries.
Again, remember that a set contains no duplicates, so a tuple that appears in both relations appears only once in the result.
Relation 1 ∪ Relation 2 = Result Set
What is a query that returns all websites that have customers OR a hit count greater than 1000?
We need information from both the customers relation and the hit-counts relation.
First we need the names of all websites that have customers:
π_{website}(customers)
website
www.cs.drexel.edu/~jsalvage
www.racewalk.com
www.zojjed.com
Then we need the names of the websites that have a hit count greater than 1000:
π_{website}(σ_{hit-count > 1000}(hit-counts))
website
www.racewalk.com
www.greattreks.com
Combine the results using a union operation:
π_{website}(customers) ∪ π_{website}(σ_{hit-count > 1000}(hit-counts))
Remember, order is not important!
• The relations in a union MUST be of similar types
• They MUST have the same number of attributes
• The domains of the corresponding attributes MUST be the same
Result:
website
www.cs.drexel.edu/~jsalvage
www.racewalk.com
www.greattreks.com
www.zojjed.com
Operands:
π_{website}(customers)
website
www.cs.drexel.edu/~jsalvage
www.racewalk.com
www.zojjed.com
π_{website}(σ_{hit-count > 1000}(hit-counts))
website
www.racewalk.com
www.greattreks.com
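The union and its compatibility rule can be sketched as follows. The function union is a hypothetical helper written for this illustration; it checks only that both relations have the same number of attributes, and leaves the domain check aside.

```python
customer_sites = {
    ("www.cs.drexel.edu/~jsalvage",),
    ("www.racewalk.com",),
    ("www.zojjed.com",),
}
busy_sites = {
    ("www.racewalk.com",),
    ("www.greattreks.com",),
}

def union(r, s):
    """Return r ∪ s, refusing relations with different numbers of attributes."""
    arities = {len(t) for t in r | s}
    if len(arities) > 1:
        raise ValueError("union needs the same number of attributes on both sides")
    return r | s

result = union(customer_sites, busy_sites)
print(len(result))  # 4: www.racewalk.com is in both operands but listed once

# A relation with two attributes is not compatible with a one-attribute relation.
try:
    union(customer_sites, {("www.zojjed.com", 5)})
except ValueError as error:
    print("rejected:", error)
```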
Intersection Operator
Binary operation
∩ represents the intersection operator
Returns all tuples contained in both relations
Relation 1 ∩ Relation 2 = Result Set
What is a query that returns all websites that have customers AND a hit count greater than 1000?
First we need the names of all websites that have customers:
π_{website}(customers)
website
www.cs.drexel.edu/~jsalvage
www.racewalk.com
www.zojjed.com
Then we need the names of the websites that have a hit count greater than 1000:
π_{website}(σ_{hit-count > 1000}(hit-counts))
website
www.racewalk.com
www.greattreks.com
Combine the results using an intersection operation:
π_{website}(customers) ∩ π_{website}(σ_{hit-count > 1000}(hit-counts))
Result:
website
www.racewalk.com
Operands:
π_{website}(customers)
website
www.cs.drexel.edu/~jsalvage
www.racewalk.com
www.zojjed.com
π_{website}(σ_{hit-count > 1000}(hit-counts))
website
www.racewalk.com
www.greattreks.com
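The intersection can be checked with Python's set operator &. The sketch below also illustrates a standard set identity that is not stated above: r ∩ s can be rewritten using set difference alone, as r - (r - s).

```python
customer_sites = {
    ("www.cs.drexel.edu/~jsalvage",),
    ("www.racewalk.com",),
    ("www.zojjed.com",),
}
busy_sites = {
    ("www.racewalk.com",),
    ("www.greattreks.com",),
}

# Tuples that appear in both relations.
both = customer_sites & busy_sites
print(both)  # {('www.racewalk.com',)}

# The same result written with set difference only.
via_difference = customer_sites - (customer_sites - busy_sites)
print(both == via_difference)  # True
```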
The Set Difference Operation (MINUS)
Binary operation
- denotes set difference
Relation 1 - Relation 2 = Result Set
Finds tuples in one set but not in another
r - s produces a set containing those tuples in r but not in s.
Produce a list of websites that have a hit count > 1000 and do not have a customer.
We need the names of all websites that have customers:
π_{website}(customers)
website
www.cs.drexel.edu/~jsalvage
www.racewalk.com
www.zojjed.com
Then we need the names of the websites that have a hit count greater than 1000:
π_{website}(σ_{hit-count > 1000}(hit-counts))
website
www.racewalk.com
www.greattreks.com
Subtract the first result from the second:
π_{website}(σ_{hit-count > 1000}(hit-counts)) - π_{website}(customers)

Result of Π_{website}(σ_{hit-count > 1000}(hit-counts)) - Π_{website}(customers):

website
www.greattreks.com

The two inputs were:

Π_{website}(customers)
website
www.drexel.edu/~jsalvage
www.racewalk.com
www.zojjed.com

Π_{website}(σ_{hit-count > 1000}(hit-counts))
website
www.racewalk.com
www.greattreks.com

Notice that the tuples in R1 that are not in R2 are included in the result set, but the tuples in R2 that are not in R1 are not included in the result set.
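
As a rough illustration, the set difference above can be mimicked with Python's built-in set type. This is only a sketch: the variable names are invented for the example, and the two sets simply hold the website values shown in the tables above.

```python
# Websites with a hit count > 1000 (values copied from the tables above).
high_traffic = {"www.racewalk.com", "www.greattreks.com"}

# Websites that have at least one customer (values copied from the tables above).
with_customers = {
    "www.drexel.edu/~jsalvage",
    "www.racewalk.com",
    "www.zojjed.com",
}

# r - s keeps the tuples in r that are not in s.
result = high_traffic - with_customers
print(result)

# Set difference is not symmetric: s - r gives a different answer.
reverse = with_customers - high_traffic
print(sorted(reverse))
```

Swapping the operands changes the answer, which is the point of the note about R1 and R2.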

Given the relation websites below:

website                      organization        first-year  category
www.zojjed.com               Walking Promotions  2006        Fiction
www.racewalk.com             Walking Promotions  1995        Health
www.greattreks.com           Walking Promotions  2006        Travel
www.twofeetgallery.com       Walking Promotions  2004        Photographs
www.walkinghealthy.com       Walking Promotions  2002        Health
www.cs.drexel.edu/~jsalvage  Drexel University   2005        Education

What is the result of the following?
Π_{website}(empty set) - Π_{website}(websites)

The result is the empty set, because no records are contained in R1 and only the records in R1 that are not in R2 are returned from the MINUS operator.

What is the result of the following, given the same websites relation?
Π_{website}(websites) - Π_{website}(empty set)

website
www.zojjed.com
www.racewalk.com
www.greattreks.com
www.twofeetgallery.com
www.walkinghealthy.com
www.cs.drexel.edu/~jsalvage

The result is the website attribute of every tuple in websites. Since no records are contained in the empty set, nothing is removed, and all records from Π_{website}(websites) are included in the result set.

The Cartesian-Product Operation
binary
x, combines information from two relations
Relation 1 x Relation 2 = Result Set
Because the same attribute name can appear in different relations, we need a notation to tell them apart: relation.attribute will be used.
Therefore, the resulting schema of r = websites x customers is:
(websites.website, websites.organization, websites.first-year, websites.category, customers.website, customers.first-name, customers.last-name)
Note that issues exist if you wish to use the same relation twice; we will address this with the rename operation shortly.
What tuples exist in r if r = websites x customers?
Every combination of a tuple in websites with a tuple in customers.
Given r1 with n1 tuples and r2 with n2 tuples, r1 x r2 has n1*n2 tuples.

Let’s look at a simplified example first.

If relation R1 contains the following:

Value1
1
2
3

and if relation R2 contains the following:

Value2
A
B
C

Then R1 x R2 contains the following:

R1.Value1  R2.Value2
1          A
1          B
1          C
2          A
2          B
2          C
3          A
3          B
3          C
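
The same pairing can be sketched in Python with `itertools.product` from the standard library. The lists `r1` and `r2` are just stand-ins for the single-column relations R1 and R2 above.

```python
from itertools import product

r1 = [1, 2, 3]          # the Value1 column of R1
r2 = ["A", "B", "C"]    # the Value2 column of R2

# Pair every tuple of R1 with every tuple of R2.
cross = list(product(r1, r2))

for value1, value2 in cross:
    print(value1, value2)

# The size is n1 * n2, here 3 * 3.
print(len(cross))
```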

Similarly, websites x customers appears as follows:

websites.website    organization        first-year  category  customers.website         first-name  last-name
www.zojjed.com      Walking Promotions  2006        Fiction   www.zojjed.com            Derek       Jeter
www.zojjed.com      Walking Promotions  2006        Fiction   www.zojjed.com            Chase       Utley
www.zojjed.com      Walking Promotions  2006        Fiction   www.drexel.edu/~jsalvage  Jeremy      Johnson
www.zojjed.com      Walking Promotions  2006        Fiction   www.racewalk.com          Ryan        Howard
www.zojjed.com      Walking Promotions  2006        Fiction   www.zojjed.com            Ryan        Howard
www.racewalk.com    Walking Promotions  1995        Health    www.zojjed.com            Derek       Jeter
www.racewalk.com    Walking Promotions  1995        Health    www.zojjed.com            Chase       Utley
www.racewalk.com    Walking Promotions  1995        Health    www.drexel.edu/~jsalvage  Jeremy      Johnson
www.racewalk.com    Walking Promotions  1995        Health    www.racewalk.com          Ryan        Howard
www.racewalk.com    Walking Promotions  1995        Health    www.zojjed.com            Ryan        Howard
www.greattreks.com  Walking Promotions  2006        Travel    www.zojjed.com            Derek       Jeter
www.greattreks.com  Walking Promotions  2006        Travel    www.zojjed.com            Chase       Utley
www.greattreks.com  Walking Promotions  2006        Travel    www.drexel.edu/~jsalvage  Jeremy      Johnson
www.greattreks.com  Walking Promotions  2006        Travel    www.racewalk.com          Ryan        Howard
www.greattreks.com  Walking Promotions  2006        Travel    www.zojjed.com            Ryan        Howard
...

What if we want to find all the customers who bought from a website created before the year 2000?

We could try the following:
σ_{first-year < 2000}(websites x customers)

Note that we are not yet using a projection to reduce the number of attributes, so that we can show what is really happening. In the end, you would use a projection to show only the fields requested by the question.

websites.website    organization        first-year  category  customers.website         first-name  last-name
www.racewalk.com    Walking Promotions  1995        Health    www.zojjed.com            Derek       Jeter
www.racewalk.com    Walking Promotions  1995        Health    www.zojjed.com            Chase       Utley
www.racewalk.com    Walking Promotions  1995        Health    www.drexel.edu/~jsalvage  Jeremy      Johnson
www.racewalk.com    Walking Promotions  1995        Health    www.racewalk.com          Ryan        Howard
www.racewalk.com    Walking Promotions  1995        Health    www.zojjed.com            Ryan        Howard

Oops, too many tuples!

The Cartesian product pairs every tuple from websites with every tuple from customers. Although only the tuples with first-year < 2000 are selected, the query still returns 5 tuples.
Of those tuples, we only want the ones where the websites relation’s website attribute equals the customers relation’s website attribute.

websites.website    organization        first-year  category  customers.website         first-name  last-name
www.racewalk.com    Walking Promotions  1995        Health    www.zojjed.com            Derek       Jeter
www.racewalk.com    Walking Promotions  1995        Health    www.zojjed.com            Chase       Utley
www.racewalk.com    Walking Promotions  1995        Health    www.drexel.edu/~jsalvage  Jeremy      Johnson
www.racewalk.com    Walking Promotions  1995        Health    www.racewalk.com          Ryan        Howard     <- the tuple we want
www.racewalk.com    Walking Promotions  1995        Health    www.zojjed.com            Ryan        Howard

The only tuple we truly want is the marked tuple, in which both website values are www.racewalk.com.
We can write this as follows:
σ_{websites.website = customers.website}(σ_{first-year < 2000}(websites x customers))

Since
σ_{websites.website = customers.website}(σ_{first-year < 2000}(websites x customers))
returns the following tuple with too many attributes, we must also use a projection to remove the excess attributes.

websites.website    organization        first-year  category  customers.website         first-name  last-name
www.racewalk.com    Walking Promotions  1995        Health    www.racewalk.com          Ryan        Howard

Applying the projection of first-name, last-name to the previous query gives us the following query:
Π_{first-name, last-name}(σ_{websites.website = customers.website}(σ_{first-year < 2000}(websites x customers)))
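
The whole chain (product, two selections, then a projection) can be traced in a few lines of Python. This is a sketch, not a query engine: relations are plain lists of tuples, attributes are addressed by position, and the data is copied from the websites and customers tables shown earlier.

```python
from itertools import product

# (website, organization, first-year, category)
websites = [
    ("www.zojjed.com", "Walking Promotions", 2006, "Fiction"),
    ("www.racewalk.com", "Walking Promotions", 1995, "Health"),
    ("www.greattreks.com", "Walking Promotions", 2006, "Travel"),
    ("www.twofeetgallery.com", "Walking Promotions", 2004, "Photographs"),
    ("www.walkinghealthy.com", "Walking Promotions", 2002, "Health"),
    ("www.cs.drexel.edu/~jsalvage", "Drexel University", 2005, "Education"),
]

# (website, first-name, last-name)
customers = [
    ("www.zojjed.com", "Derek", "Jeter"),
    ("www.zojjed.com", "Chase", "Utley"),
    ("www.drexel.edu/~jsalvage", "Jeremy", "Johnson"),
    ("www.racewalk.com", "Ryan", "Howard"),
    ("www.zojjed.com", "Ryan", "Howard"),
]

# websites x customers: each combined tuple has 4 + 3 = 7 attributes.
cross = [w + c for w, c in product(websites, customers)]

# Selection: first-year < 2000 (position 2).
old_sites = [t for t in cross if t[2] < 2000]

# Selection: websites.website (position 0) = customers.website (position 4).
matched = [t for t in old_sites if t[0] == t[4]]

# Projection: first-name, last-name (positions 5 and 6); a set drops duplicates.
names = {(t[5], t[6]) for t in matched}
print(len(cross), len(old_sites), names)
```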

The Assignment Operator
unary
allows an expression to be assigned to a variable
NewRelation ← OldRelation

For example:
1200loans ← σ_{amount > 1200}(loan)
or
result ← Π_{loan-number}(1200loans)

The Rename Operation
unary
ρ_{x}(E) renames the expression E to x.
Relational-algebra expressions do not have a name that we can use to refer to them. The rename operator, ρ (rho), gives them one.

Example: without using an aggregation function (not yet shown), find the largest hit count of any website for a single day. If the same maximum hit count exists more than once, you are only allowed to return a single tuple containing the answer.

We accomplish this in two steps:

First, compute a temporary relation consisting of the hit counts that are not the largest hit count.
Second, take the set difference between the relation Π_{hit-count}(hit-counts) and the temporary relation.

To build the temporary relation, compare every website's hit count with every website's hit count; in other words, compute the Cartesian product of the relation hit-counts with itself:
hit-counts x hit-counts

However, we must rename one of the hit-counts relations so that we can refer to the two hit-count attributes distinctly.

Given the projection of only the hit-count field from the relation hit-counts via
Π_{hit-count}(hit-counts)

We have:

hit-count
5
2019
1050
32
159
6
376
2099

If we rename the result of this projection
ρ_{d}(Π_{hit-count}(hit-counts))

we can create a cross product of the two relations as
Π_{hit-count}(hit-counts) x ρ_{d}(Π_{hit-count}(hit-counts))

Π_{hit-count}(hit-counts) x ρ_{d}(Π_{hit-count}(hit-counts)) contains 8 * 8 = 64 tuples:

hit-count  d.hit-count   hit-count  d.hit-count   hit-count  d.hit-count   hit-count  d.hit-count
5          5             5          1050          5          159           5          376
2019       5             2019       1050          2019       159           2019       376
1050       5             1050       1050          1050       159           1050       376
32         5             32         1050          32         159           32         376
159        5             159        1050          159        159           159        376
6          5             6          1050          6          159           6          376
376        5             376        1050          376        159           376        376
2099       5             2099       1050          2099       159           2099       376
5          2019          5          32            5          6             5          2099
2019       2019          2019       32            2019       6             2019       2099
1050       2019          1050       32            1050       6             1050       2099
32         2019          32         32            32         6             32         2099
159        2019          159        32            159        6             159        2099
6          2019          6          32            6          6             6          2099
376        2019          376        32            376        6             376        2099
2099       2019          2099       32            2099       6             2099       2099

Now we select only those tuples in which the first attribute contains a value less than the second attribute. We do so with the following query:
σ_{hit-counts.hit-count < d.hit-count}(Π_{hit-count}(hit-counts) x ρ_{d}(Π_{hit-count}(hit-counts)))

Of the 64 tuples in the cross product, 28 satisfy the condition. Grouped by the first attribute, they are:

hit-count  d.hit-count (each larger value forms one tuple)
5          6, 32, 159, 376, 1050, 2019, 2099
6          32, 159, 376, 1050, 2019, 2099
32         159, 376, 1050, 2019, 2099
159        376, 1050, 2019, 2099
376        1050, 2019, 2099
1050       2019, 2099
2019       2099

The value 2099 never appears as the first attribute, because no hit count is larger than it.

This certainly gives us a lot of tuples, but if we then project just the hit-count from the first column and remove the duplicates, we are left with the following:

hit-count
5
2019
1050
32
159
6
376

This is the set containing every hit count except the largest.

To get just the largest hit count, we now simply subtract our result set from the projection of the original hit-counts relation as follows:

Π_{hit-count}(hit-counts) - Π_{hit-counts.hit-count}(σ_{hit-counts.hit-count < d.hit-count}(Π_{hit-count}(hit-counts) x ρ_{d}(Π_{hit-count}(hit-counts))))

hit-count
2099
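
A short Python sketch of the same trick, using the eight hit counts listed earlier. A Python set plays the role of the projected relation, and iterating over the set twice stands in for the renamed copy d; the variable names are invented for the example.

```python
from itertools import product

hit_counts = {5, 2019, 1050, 32, 159, 6, 376, 2099}

# hit-counts x d, where d is a renamed copy of the same projection.
pairs = product(hit_counts, hit_counts)

# Keep the left value of every pair where left < right.
# These are the hit counts that are NOT the largest.
not_largest = {left for left, right in pairs if left < right}

# The set difference leaves only the largest hit count.
largest = hit_counts - not_largest
print(largest)
```

Because sets hold no duplicates, a maximum that occurred several times in the original relation would still come back as a single tuple, as the exercise requires.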

We need a better way to represent certain queries, because the notation for joining two relations and selecting only the records where the attributes match is too cumbersome. Therefore we have:

The Natural Join Operation
binary
Result Set = R1 |x| R2
The natural join operation forms the Cartesian product of two relations, but only returns tuples where the attributes whose names are the same in both relations contain the same values.

Let’s look at a simplified example first.

If relation R1 contains the following:

Value1  Value2
1       X
2       Y
3       Z

and if relation R2 contains the following:

Value2  Value3
X       A
Z       B
A       C

Then R1 x R2 contains the following:

R1.Value1  R1.Value2  R2.Value2  R2.Value3
1          X          X          A
1          X          Z          B
1          X          A          C
2          Y          X          A
2          Y          Z          B
2          Y          A          C
3          Z          X          A
3          Z          Z          B
3          Z          A          C

Then R1 |x| R2 contains the following:

R1.Value1  R1.Value2  R2.Value2  R2.Value3
1          X          X          A
3          Z          Z          B
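
The natural join of R1 and R2 can be sketched in Python as a product followed by an equality test on the shared attribute. The sketch keeps both copies of Value2, as the table above does; many textbook definitions keep only one copy of the common attribute.

```python
r1 = [(1, "X"), (2, "Y"), (3, "Z")]        # (Value1, Value2)
r2 = [("X", "A"), ("Z", "B"), ("A", "C")]  # (Value2, Value3)

# Cartesian product, filtered on the attribute the two relations share.
joined = [
    (v1, v2_left, v2_right, v3)
    for (v1, v2_left) in r1
    for (v2_right, v3) in r2
    if v2_left == v2_right
]
print(joined)
```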

Example:

Find the names of all customers who have made a purchase from a health or travel website. Return the name of the customer, the website, and the category of the website.

The old way:

Form a Cartesian product of the websites and customers relations.
Select the tuples that refer to the same website and have a category equal to “Health” or “Travel.”
Project the first-name, last-name, website, and category.

Π_{first-name, last-name, websites.website, category}(σ_{websites.website = customers.website and (category = “Health” or category = “Travel”)}(websites x customers))

Another example:

Find all the names of websites and the dates they have a hit count, for websites that are in the health or travel category.

Π_{website, date}(σ_{category = “Health” or category = “Travel”}(websites |x| hit-counts))

Generalized Projections

Allows basic arithmetic operations within fields of a tuple.
Observe the Sales relation:

product                    first-name  last-name  tax   total-cost
Zojjed!                    Derek       Jeter      1.00  17.95
Zojjed!                    Chase       Utley      1.00  17.95
VB .NET Coach              Jeremy      Johnson    0     54.95
Race Walk Like A Champion  Ryan        Howard     1.25  25.95
Zojjed!                    Ryan        Howard     0     16.95

What was the cost of each product sold minus the tax paid?
Π_{product, first-name, last-name, (total-cost - tax) as net-pay}(Sales)
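
As a small sketch of what the generalized projection computes, the Sales rows above can be run through a Python list comprehension. `Decimal` is used instead of float only to keep the currency arithmetic exact; the variable names are invented for the example.

```python
from decimal import Decimal

# (product, first-name, last-name, tax, total-cost)
sales = [
    ("Zojjed!", "Derek", "Jeter", Decimal("1.00"), Decimal("17.95")),
    ("Zojjed!", "Chase", "Utley", Decimal("1.00"), Decimal("17.95")),
    ("VB .NET Coach", "Jeremy", "Johnson", Decimal("0"), Decimal("54.95")),
    ("Race Walk Like A Champion", "Ryan", "Howard", Decimal("1.25"), Decimal("25.95")),
    ("Zojjed!", "Ryan", "Howard", Decimal("0"), Decimal("16.95")),
]

# Generalized projection: keep three attributes and add a computed one (net-pay).
net_pay = [
    (product, first, last, total_cost - tax)
    for (product, first, last, tax, total_cost) in sales
]

for row in net_pay:
    print(row)
```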

Aggregate Functions

An aggregate function takes a collection of values and returns a single value as a result, e.g.:
sum {1, 1, 3, 4, 4, 11} returns the value 24.
avg {1, 1, 3, 4, 4, 11} returns the value 4.
count {1, 1, 3, 4, 4, 11} returns the value 6.
min {1, 1, 3, 4, 4, 11} returns the value 1.
max {1, 1, 3, 4, 4, 11} returns the value 11.
count-distinct {1, 1, 3, 4, 4, 11} returns the value 4.

For aggregate functions, ignore the fact that we said sets can’t contain duplicate values: the collection here keeps its duplicates.
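
The six results can be checked with Python built-ins. A list is used rather than a set so that the duplicate values are kept, which is the behaviour the aggregate examples rely on.

```python
values = [1, 1, 3, 4, 4, 11]   # a list, so duplicates are kept

total = sum(values)              # sum
average = total / len(values)    # avg
count = len(values)              # count
smallest = min(values)           # min
largest = max(values)            # max
distinct = len(set(values))      # count-distinct: duplicates removed first

print(total, average, count, smallest, largest, distinct)
```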

Operations to Modify the Contents of Relations

Deletion
r ← r - E

Delete all of the sales of “Zojjed!” from the sales relation:
sales ← sales - σ_{product = “Zojjed!”}(sales)

Delete all sales with no tax collected:
sales ← sales - σ_{tax = 0}(sales)

Insertion
r ← r ∪ E

Add a record to the sales relation:
sales ← sales ∪ {(“I Walk to Eat”, “Chase”, “Utley”, 0, 15.95)}

Add a record to the websites relation:
websites ← websites ∪ {(“www.mediatitan.net”, “Walking Promotions”, 2006, “Fiction”)}

You can also insert tuples based on the result of another query.

Updating
Remove all the tax from the sales relation:
sales ← Π_{product, first-name, last-name, 0, total-cost}(sales)
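
Deletion, insertion, and updating can all be sketched with Python set operations. In this sketch each operation is applied to the original sales data separately, so the three results are independent of one another; the variable names are invented for the example.

```python
# (product, first-name, last-name, tax, total-cost)
sales = {
    ("Zojjed!", "Derek", "Jeter", 1.00, 17.95),
    ("Zojjed!", "Chase", "Utley", 1.00, 17.95),
    ("VB .NET Coach", "Jeremy", "Johnson", 0, 54.95),
    ("Race Walk Like A Champion", "Ryan", "Howard", 1.25, 25.95),
    ("Zojjed!", "Ryan", "Howard", 0, 16.95),
}

# Deletion: r <- r - E, where E selects the tuples to remove.
zojjed_sales = {t for t in sales if t[0] == "Zojjed!"}
after_delete = sales - zojjed_sales

# Insertion: r <- r U E, where E is a set of new tuples.
after_insert = sales | {("I Walk to Eat", "Chase", "Utley", 0, 15.95)}

# Update: r <- a generalized projection that rewrites the tax attribute to 0.
after_update = {(p, first, last, 0, total) for (p, first, last, tax, total) in sales}

print(len(after_delete), len(after_insert), len(after_update))
```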

Joins
There are other forms of joins. Let’s look at the following two simple relations:

cities relation
employee-name  city
Jeter          New York City
Howard         Philadelphia
Utley          Philadelphia
Schilling      Boston

teams relation
employee-name  team
Glavine        Mets
Howard         Phillies
Bonds          Giants
Schilling      Choke Sox

Natural Join -> cities |x| teams
employee-name  city           employee-name  team
Howard         Philadelphia   Howard         Phillies
Schilling      Boston         Schilling      Choke Sox

The natural join omits records that do not match, so we do not have records for Jeter, Utley, Glavine, or Bonds.

Left Outer Join -> cities LOJ teams
Includes all records from the left and only those records on the right that match. For the same cities and teams relations:

employee-name  city           employee-name  team
Jeter          New York City  Null           Null
Howard         Philadelphia   Howard         Phillies
Utley          Philadelphia   Null           Null
Schilling      Boston         Schilling      Choke Sox

Right Outer Join -> cities ROJ teams
Includes all records from the right and only those records on the left that match. For the same cities and teams relations:

employee-name  city           employee-name  team
Null           Null           Glavine        Mets
Howard         Philadelphia   Howard         Phillies
Null           Null           Bonds          Giants
Schilling      Boston         Schilling      Choke Sox

Full Outer Join -> cities FOJ teams
Includes all records from the right and left; tuples that do not match have nulls in place of the missing values. For the same cities and teams relations:

employee-name  city           employee-name  team
Null           Null           Glavine        Mets
Howard         Philadelphia   Howard         Phillies
Null           Null           Bonds          Giants
Schilling      Boston         Schilling      Choke Sox
Jeter          New York City  Null           Null
Utley          Philadelphia   Null           Null
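
The four joins can be compared side by side in a short Python sketch. `None` stands in for Null, and the helper names (`NULL_PAIR`, `team_names`, `city_names`) are invented for the example. The rows come out in a different order than in the tables above, which does not matter because a relation is a set of tuples.

```python
cities = [("Jeter", "New York City"), ("Howard", "Philadelphia"),
          ("Utley", "Philadelphia"), ("Schilling", "Boston")]
teams = [("Glavine", "Mets"), ("Howard", "Phillies"),
         ("Bonds", "Giants"), ("Schilling", "Choke Sox")]

NULL_PAIR = (None, None)  # padding for the side that has no matching tuple

# Natural join: only the tuples whose employee-name appears on both sides.
natural = [c + t for c in cities for t in teams if c[0] == t[0]]

# Left outer join: every cities tuple, padded with nulls when nothing matches.
team_names = {t[0] for t in teams}
left_outer = natural + [c + NULL_PAIR for c in cities if c[0] not in team_names]

# Right outer join: every teams tuple, padded with nulls when nothing matches.
city_names = {c[0] for c in cities}
right_outer = natural + [NULL_PAIR + t for t in teams if t[0] not in city_names]

# Full outer join: the matches plus the unmatched tuples from both sides.
full_outer = left_outer + [row for row in right_outer if row not in natural]

for row in full_outer:
    print(row)
```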

NULLS
And: true and unknown = unknown, false and unknown = false, unknown and unknown = unknown
Or: true or unknown = true, false or unknown = unknown, unknown or unknown = unknown
Not: not unknown = unknown
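
These rules can be written out as three small Python functions. This is a sketch in which `None` stands in for unknown; the function names `and3`, `or3`, and `not3` are invented for the example.

```python
# None stands in for "unknown" in this sketch.

def and3(a, b):
    """Three-valued AND: false wins, then unknown, otherwise true."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    """Three-valued OR: true wins, then unknown, otherwise false."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def not3(a):
    """Three-valued NOT: unknown stays unknown."""
    return None if a is None else (not a)

print(and3(True, None), and3(False, None), and3(None, None))
print(or3(True, None), or3(False, None), or3(None, None))
print(not3(None))
```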

REFERENTIAL INTEGRITY

Superkey of R
A unique identifier for a tuple.
A set of attributes SK of R such that no two tuples in any valid relation instance r(R) will have the same value for SK. That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠ t2[SK].

Key of R
A "minimal" superkey; that is, a superkey K such that removal of any attribute from K results in a set of attributes that is not a superkey.

Example: The CAR relation schema
CAR(State, Reg#, SerialNo, Make, Model, Year) has two keys:
Key1 = {State, Reg#}
Key2 = {SerialNo}
Both are superkeys. {SerialNo, Make} is a superkey but not a key.

If a relation has several candidate keys, one is chosen arbitrarily to be the primary key. The primary key attributes are underlined.
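
The difference between a superkey and a key can be illustrated with a small Python check. Two caveats apply: the three CAR rows are invented for the example, and a key is really a property of every valid instance of the schema, so testing one instance can only show that a set of attributes is not a superkey, never prove that it is one. The function names are also invented.

```python
from itertools import combinations

ATTRS = ("State", "Reg#", "SerialNo", "Make", "Model", "Year")

# A made-up instance of CAR; the rows are invented for this example.
cars = [
    ("PA", "ABC123", "S1", "Ford", "Focus", 2004),
    ("NJ", "ABC123", "S2", "Ford", "Focus", 2004),
    ("PA", "XYZ789", "S3", "Honda", "Civic", 2005),
]

def is_superkey(attrs, rows):
    """True if no two rows agree on all of the given attributes."""
    idx = [ATTRS.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(rows)

def is_key(attrs, rows):
    """True if attrs is a superkey and no proper subset of it is one."""
    if not is_superkey(attrs, rows):
        return False
    return not any(
        is_superkey(subset, rows)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

print(is_key(("State", "Reg#"), cars))           # minimal for this instance
print(is_superkey(("SerialNo", "Make"), cars))   # unique, but ...
print(is_key(("SerialNo", "Make"), cars))        # ... Make can be dropped
```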

Relational Database Schema
A set S of relation schemas that belong to the same database. S is the name of the database.
S = {R1, R2, ..., Rn}

Entity Integrity
The primary key attributes PK of each relation schema R in S cannot have null values in any tuple of r(R). This is because primary key values are used to identify the individual tuples.
t[PK] ≠ null for any tuple t in r(R)
Note: other attributes of R may be similarly constrained to disallow null values, even though they are not members of the primary key.
Referential Integrity
A constraint involving two relations (the previous constraints involve a single relation).
Used to specify a relationship among tuples in two relations: the referencing relation and the referenced relation.
Tuples in the referencing relation R1 have attributes FK (called foreign key attributes) that reference the primary key attributes PK of the referenced relation R2.
A tuple t1 in R1 is said to reference a tuple t2 in R2 if t1[FK] = t2[PK].
A referential integrity constraint can be displayed in a relational database schema as a directed arc from R1.FK to R2.
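
A referential integrity check can be sketched in Python as a search for foreign key values that have no matching primary key. The sketch makes two assumptions that the slides do not state: that website is the primary key of websites, and that customers.website is a foreign key referencing it. The function name `dangling_references` is invented, and skipping null foreign keys follows the common convention that a null references nothing.

```python
# Referenced relation R2: primary key values of websites (assumed PK: website).
website_keys = {
    "www.zojjed.com",
    "www.racewalk.com",
    "www.greattreks.com",
    "www.twofeetgallery.com",
    "www.walkinghealthy.com",
    "www.cs.drexel.edu/~jsalvage",
}

# Referencing relation R1: customers (assumed FK: website, position 0).
customers = [
    ("www.zojjed.com", "Derek", "Jeter"),
    ("www.zojjed.com", "Chase", "Utley"),
    ("www.drexel.edu/~jsalvage", "Jeremy", "Johnson"),
    ("www.racewalk.com", "Ryan", "Howard"),
    ("www.zojjed.com", "Ryan", "Howard"),
]

def dangling_references(referencing, fk_index, referenced_keys):
    """Return the tuples t1 for which no t2 satisfies t1[FK] = t2[PK]."""
    return [
        t for t in referencing
        if t[fk_index] is not None and t[fk_index] not in referenced_keys
    ]

dangling = dangling_references(customers, 0, website_keys)
print(dangling)
```

With the values exactly as printed in the earlier tables, the customers tuple for www.drexel.edu/~jsalvage has no exact match among the website values, which list www.cs.drexel.edu/~jsalvage; under the assumptions above, that is the kind of violation the constraint is meant to rule out.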