Announcements!
1. You should be Jupyter notebook ninjas!
2. Welcome Ting!
   • New TA. Office hours on website (room to be announced)
3. Project groups finalized!
   • If you do not have a group, talk with us ASAP!
4. Problem Set #1 released
Lecture 2
Today's Lecture
1. Recap from Lecture 2 & Multi-table queries
   • ACTIVITY: Multi-table queries
2. Set operators & nested queries
   • ACTIVITY: Set operator subtleties
What you will learn about in this section
1. Primary keys and foreign keys recap
2. Joins: SQL semantics
3. ACTIVITY: Multi-table queries
Keys and Foreign Keys

Product
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi

Company
CName       StockPrice  Country
GizmoWorks  25          USA
Canon       65          Japan
Hitachi     15          Japan

What is a foreign key vs. a key here?
A key is a minimal subset of attributes that acts as a unique identifier for tuples in a relation.
If two tuples agree on the values of the key, then they must be the same tuple!
A foreign key is an attribute (or collection of attributes) in one table that uniquely identifies a row of another table.
The foreign key is defined in a second table, but it refers to the primary key in the first table.
Declaring Foreign Keys
Company(CName: string, StockPrice: float, Country: string)
Product(PName: string, Price: float, Category: string, Manufacturer: string)

CREATE TABLE Product(
    pname VARCHAR(100),
    price FLOAT,
    category VARCHAR(100),
    manufacturer VARCHAR(100),
    PRIMARY KEY (pname, manufacturer),
    FOREIGN KEY (manufacturer) REFERENCES Company(cname)
)
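The declaration above can be tried out with Python's built-in sqlite3 module. This is only a sketch under some assumptions: SQLite is used as the DBMS (it enforces foreign keys only after `PRAGMA foreign_keys = ON`), Company is declared with cname as its primary key, and the 'Widget' / 'NoSuchCo' row is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: SQLite as the DBMS. It leaves foreign-key enforcement off by default.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE Company(
        cname VARCHAR(100) PRIMARY KEY,
        stockprice FLOAT,
        country VARCHAR(100))""")
conn.execute("""
    CREATE TABLE Product(
        pname VARCHAR(100),
        price FLOAT,
        category VARCHAR(100),
        manufacturer VARCHAR(100),
        PRIMARY KEY (pname, manufacturer),
        FOREIGN KEY (manufacturer) REFERENCES Company(cname))""")
conn.execute("INSERT INTO Company VALUES ('GizmoWorks', 25, 'USA')")
conn.execute("INSERT INTO Product VALUES ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks')")


def try_insert_unregistered():
    """Return True if a product with an unknown manufacturer is rejected."""
    try:
        # 'NoSuchCo' is a hypothetical manufacturer that is not in Company.
        conn.execute("INSERT INTO Product VALUES ('Widget', 5.0, 'Gadgets', 'NoSuchCo')")
    except sqlite3.IntegrityError:
        return True
    return False


rejected = try_insert_unregistered()
product_count = conn.execute("SELECT COUNT(*) FROM Product").fetchone()[0]
```

With the foreign key in place, the second insert should fail, so only products of registered companies end up in Product.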
Can we do this? What would be the problem?

CREATE TABLE Company(
    cname VARCHAR(100),
    stockprice FLOAT,
    country VARCHAR(100),
    PRIMARY KEY (cname),
    FOREIGN KEY (cname) REFERENCES Product(pname, manufacturer)
)
CREATE TABLE Product(
    pname VARCHAR(100),
    price FLOAT,
    category VARCHAR(100),
    manufacturer VARCHAR(100),
    PRIMARY KEY (pname, manufacturer)
)
We can have products without a registered company! Bad design! We'll see more next week.
If the primary key is a set of columns (a composite key), then the foreign key also must be a set of columns that corresponds to the composite key.
Joins

Product
PName        Price  Category     Manuf
Gizmo        $19    Gadgets      GWorks
Powergizmo   $29    Gadgets      GWorks
SingleTouch  $149   Photography  Canon
MultiTouch   $203   Household    Hitachi

Company
Cname    Stock  Country
GWorks   25     USA
Canon    65     Japan
Hitachi  15     Japan

SELECT PName, Price
FROM Product, Company
WHERE Manufacturer = CName
  AND Country = 'Japan'
  AND Price <= 200

Result
PName        Price
SingleTouch  $149.99
An example of SQL semantics

SELECT R.A
FROM R, S
WHERE R.A = S.B

R      S
A      B  C
1      2  3
3      3  4
       3  5

Cross Product
A  B  C
1  2  3
1  3  4
1  3  5
3  2  3
3  3  4
3  3  5

Apply Selections / Conditions
A  B  C
3  3  4
3  3  5

Apply Projection (Output)
A
3
3
Note the semantics of a join

SELECT R.A
FROM R, S
WHERE R.A = S.B

1. Take cross product: $X = R \times S$
   Recall: the cross product $A \times B$ is the set of all pairs $(a, b)$ with $a \in A$ and $b \in B$
   Ex: $\{a, b, c\} \times \{1, 2\} = \{(a,1), (a,2), (b,1), (b,2), (c,1), (c,2)\}$
2. Apply selections / conditions: $Y = \{(r, s) \in X \mid r.A = s.B\}$ (= filtering!)
3. Apply projections to get final output: $Z = (y.A)$ for $y \in Y$ (= returning only some attributes)
Remembering this order is critical to understanding the output of certain queries (see later on…)
Note: we say "semantics", not "execution order"
• The preceding slides show what a join means
• Not actually how the DBMS executes it under the covers
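The three-step semantics can be written out literally in Python. This is a minimal sketch of the meaning only, not of how a DBMS would execute the query; representing rows as dicts and the names X, Y, Z are choices made for illustration, using the R and S tables from the example.

```python
from itertools import product

# R and S from the worked example, one dict per row.
R = [{"A": 1}, {"A": 3}]
S = [{"B": 2, "C": 3}, {"B": 3, "C": 4}, {"B": 3, "C": 5}]

# 1. Cross product: every pairing of a row of R with a row of S.
X = list(product(R, S))

# 2. Selection: keep the pairs that satisfy the WHERE condition R.A = S.B.
Y = [(r, s) for r, s in X if r["A"] == s["B"]]

# 3. Projection: keep only the SELECT attribute; duplicates survive.
Z = [r["A"] for r, s in Y]
```

Here X has 6 pairs, Y keeps 2 of them, and Z is [3, 3], matching the tables on the example slide.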
A Subtlety about Joins
Product(PName, Price, Category, Manufacturer)
Company(CName, StockPrice, Country)
Find all countries that manufacture some product in the 'Gadgets' category.

SELECT Country
FROM Product, Company
WHERE Manufacturer = CName AND Category = 'Gadgets'
Product
PName        Price  Category     Manuf
Gizmo        $19    Gadgets      GWorks
Powergizmo   $29    Gadgets      GWorks
SingleTouch  $149   Photography  Canon
MultiTouch   $203   Household    Hitachi

Company
Cname    Stock  Country
GWorks   25     USA
Canon    65     Japan
Hitachi  15     Japan

Result
Country
??

What is the problem? What's the solution?
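One way to explore the question is to run the query on the tables above with Python's built-in sqlite3 module. The sketch below assumes SQLite as the DBMS and stores prices as plain numbers. Each 'Gadgets' product contributes its own row to the join, so the same country can come back more than once unless DISTINCT is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company(CName TEXT PRIMARY KEY, StockPrice REAL, Country TEXT);
    CREATE TABLE Product(PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT);
    INSERT INTO Company VALUES
        ('GWorks', 25, 'USA'), ('Canon', 65, 'Japan'), ('Hitachi', 15, 'Japan');
    INSERT INTO Product VALUES
        ('Gizmo', 19, 'Gadgets', 'GWorks'),
        ('Powergizmo', 29, 'Gadgets', 'GWorks'),
        ('SingleTouch', 149, 'Photography', 'Canon'),
        ('MultiTouch', 203, 'Household', 'Hitachi');
""")

# The same query, with and without DISTINCT.
query = """
    SELECT {distinct} Country
    FROM Product, Company
    WHERE Manufacturer = CName AND Category = 'Gadgets'
"""
with_duplicates = conn.execute(query.format(distinct="")).fetchall()
deduplicated = conn.execute(query.format(distinct="DISTINCT")).fetchall()
```

On this data the plain query returns USA twice (once per gadget made by GWorks), while the DISTINCT version returns it once.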
An Unintuitive Query

SELECT DISTINCT R.A
FROM R, S, T
WHERE R.A = S.A OR R.A = T.A

What does it compute?
Computes R ∩ (S ∪ T)
[Figure: Venn diagram of R, S, and T]
But what if S = ∅?
Go back to the semantics!
• Recall the semantics!
  1. Take cross-product
  2. Apply selections / conditions
  3. Apply projection
• If S = {}, then the cross product of R, S, T = {}, and the query result = {}!
Must consider semantics here. Is there a more explicit way to do set operations like this?
What you will learn about in this section
1. Multiset operators in SQL
2. Nested queries
3. ACTIVITY: Set operator subtleties
What does this look like in Python?

SELECT DISTINCT R.A
FROM R, S, T
WHERE R.A = S.A OR R.A = T.A

• Semantics:
  1. Take cross-product: joins / cross-products are just nested for loops (in simplest implementation)!
  2. Apply selections / conditions: if-then statements!
  3. Apply projection
def query(R, S, T):
    output = set()  # a set (not {}, which is a dict) so duplicates are dropped, as with DISTINCT
    for r in R:
        for s in S:
            for t in T:
                if r['A'] == s['A'] or r['A'] == t['A']:
                    output.add(r['A'])
    return list(output)

Can you see now what happens if S = []?
Recall Multisets
Equivalent representations of a multiset X:

Listing every tuple
Tuple
(1, a)
(1, a)
(1, b)
(2, c)
(2, c)
(2, c)
(1, d)
(1, d)

Listing each tuple with its count
Tuple   $\lambda(X)$
(1, a)  2
(1, b)  1
(2, c)  3
(1, d)  2

$\lambda(X)$ = "count of tuple in X" (items not listed have implicit count 0)
Note: in a set all counts are in {0, 1}.
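The two representations can be mirrored in Python, where `collections.Counter` plays the role of $\lambda$. This is an illustrative sketch only; the variable names and the tuple (9, "z") are made up.

```python
from collections import Counter

# Representation 1: list every tuple, repeats included.
tuples = [(1, "a"), (1, "a"), (1, "b"),
          (2, "c"), (2, "c"), (2, "c"),
          (1, "d"), (1, "d")]

# Representation 2: map each tuple to its count.
X = Counter(tuples)

count_1a = X[(1, "a")]        # count of a listed tuple
count_missing = X[(9, "z")]   # tuples not listed have implicit count 0

# Going back from counts to the full listing.
expanded = sorted(X.elements())
```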
Generalizing Set Operations to Multiset Operations

Z = X ∩ Y
Tuple   $\lambda(X)$  $\lambda(Y)$  $\lambda(Z)$
(1, a)  2             5             2
(1, b)  0             1             0
(2, c)  3             2             2
(1, d)  0             2             0

$\lambda(Z) = \min(\lambda(X), \lambda(Y))$
For sets, this is intersection.
Generalizing Set Operations to Multiset Operations

Z = X ∪ Y
Tuple   $\lambda(X)$  $\lambda(Y)$  $\lambda(Z)$
(1, a)  2             5             5
(1, b)  0             1             1
(2, c)  3             2             3
(1, d)  0             2             2

$\lambda(Z) = \max(\lambda(X), \lambda(Y))$
For sets, this is union.
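The min and max rules can be checked with Python's `collections.Counter`, whose `&` and `|` operators take per-element minimum and maximum counts. A small sketch using the X and Y counts from the tables (tuples with count 0 are simply left out):

```python
from collections import Counter

X = Counter({(1, "a"): 2, (2, "c"): 3})
Y = Counter({(1, "a"): 5, (1, "b"): 1, (2, "c"): 2, (1, "d"): 2})

intersection = X & Y  # per-tuple min of the two counts
union = X | Y         # per-tuple max of the two counts
```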
Explicit Set Operators: INTERSECT

SELECT R.A
FROM R, S
WHERE R.A = S.A
INTERSECT
SELECT R.A
FROM R, T
WHERE R.A = T.A

$Q_1 \cap Q_2 = \{r.A \mid r.A = s.A\} \cap \{r.A \mid r.A = t.A\}$
UNION

SELECT R.A
FROM R, S
WHERE R.A = S.A
UNION
SELECT R.A
FROM R, T
WHERE R.A = T.A

$Q_1 \cup Q_2 = \{r.A \mid r.A = s.A\} \cup \{r.A \mid r.A = t.A\}$
Why aren't there duplicates?
What if we want duplicates?
UNION ALL

SELECT R.A
FROM R, S
WHERE R.A = S.A
UNION ALL
SELECT R.A
FROM R, T
WHERE R.A = T.A

$Q_1 \cup Q_2 = \{r.A \mid r.A = s.A\} \cup \{r.A \mid r.A = t.A\}$
ALL indicates the multiset disjoint union operation
Generalizing Set Operations to Multiset Operations

Z = disjoint union of X and Y
Tuple   $\lambda(X)$  $\lambda(Y)$  $\lambda(Z)$
(1, a)  2             5             7
(1, b)  0             1             1
(2, c)  3             2             5
(1, d)  0             2             2

$\lambda(Z) = \lambda(X) + \lambda(Y)$
For sets, this is disjoint union.
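Adding counts corresponds to `+` on Python's `collections.Counter`, and the difference between UNION and UNION ALL can be seen on a one-row example. A sketch, assuming SQLite (via the stdlib sqlite3 module) as the DBMS; the one-row queries are made up for illustration.

```python
import sqlite3
from collections import Counter

X = Counter({(1, "a"): 2, (2, "c"): 3})
Y = Counter({(1, "a"): 5, (1, "b"): 1, (2, "c"): 2, (1, "d"): 2})
Z = X + Y  # counts are added tuple by tuple

conn = sqlite3.connect(":memory:")
# UNION removes duplicates; UNION ALL keeps every copy.
union_rows = conn.execute("SELECT 1 AS A UNION SELECT 1").fetchall()
union_all_rows = conn.execute("SELECT 1 AS A UNION ALL SELECT 1").fetchall()
```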
EXCEPT

SELECT R.A
FROM R, S
WHERE R.A = S.A
EXCEPT
SELECT R.A
FROM R, T
WHERE R.A = T.A

$Q_1 \setminus Q_2 = \{r.A \mid r.A = s.A\} \setminus \{r.A \mid r.A = t.A\}$
What is the multiset version?
$\lambda(Z) = \lambda(X) - \lambda(Y)$, for elements that are in X
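The multiset difference can be sketched with `-` on Python's `collections.Counter`, which subtracts counts and keeps only tuples whose result is positive. The X and Y counts are reused from the earlier multiset tables as an illustration.

```python
from collections import Counter

X = Counter({(1, "a"): 2, (2, "c"): 3})
Y = Counter({(1, "a"): 5, (1, "b"): 1, (2, "c"): 2, (1, "d"): 2})

# (1, a): 2 - 5 is not positive, so the tuple disappears.
# (2, c): 3 - 2 = 1 copy remains.
# (1, b) and (1, d) are not in X, so they never appear in the result.
Z = X - Y
```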
INTERSECT: Still some subtle problems…
Company(name, hq_city)
Product(pname, maker, factory_loc)
"Headquarters of companies which make gizmos in US AND China"

SELECT hq_city
FROM Company, Product
WHERE maker = name
  AND factory_loc = 'US'
INTERSECT
SELECT hq_city
FROM Company, Product
WHERE maker = name
  AND factory_loc = 'China'

What if two companies have their HQ in the same city, BUT one has a factory in China (but not the US) and the other has a factory in the US (but not China)? What goes wrong?
INTERSECT: Remember the semantics!
Company(name, hq_city) AS C
Product(pname, maker, factory_loc) AS P

Example: C JOIN P on maker = name
C.name  C.hq_city  P.pname  P.maker  P.factory_loc
X Co.   Seattle    X        X Co.    US
Y Inc.  Seattle    X        Y Inc.   China

X Co. has a factory in the US (but not China)
Y Inc. has a factory in China (but not the US)
But Seattle is returned by the query!
We did the INTERSECT on the wrong attributes!
One Solution: Nested Queries
Company(name, hq_city)
Product(pname, maker, factory_loc)
"Headquarters of companies which make gizmos in US AND China"

SELECT DISTINCT hq_city
FROM Company, Product
WHERE maker = name
  AND name IN (SELECT maker
               FROM Product
               WHERE factory_loc = 'US')
  AND name IN (SELECT maker
               FROM Product
               WHERE factory_loc = 'China')

Note: If we hadn't used DISTINCT here, how many copies of each hq_city would have been returned?
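The INTERSECT version and the nested version can be compared side by side with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite as the DBMS, the X Co. / Y Inc. rows from the example, plus a hypothetical 'Z Corp.' (HQ in Austin, factories in both countries) added so that the correct answer is not empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company(name TEXT, hq_city TEXT);
    CREATE TABLE Product(pname TEXT, maker TEXT, factory_loc TEXT);
    INSERT INTO Company VALUES
        ('X Co.', 'Seattle'), ('Y Inc.', 'Seattle'),
        ('Z Corp.', 'Austin');            -- hypothetical extra company
    INSERT INTO Product VALUES
        ('X', 'X Co.', 'US'),
        ('X', 'Y Inc.', 'China'),
        ('Z1', 'Z Corp.', 'US'),          -- hypothetical products
        ('Z2', 'Z Corp.', 'China');
""")

intersect_sql = """
    SELECT hq_city FROM Company, Product
    WHERE maker = name AND factory_loc = 'US'
    INTERSECT
    SELECT hq_city FROM Company, Product
    WHERE maker = name AND factory_loc = 'China'
"""
nested_sql = """
    SELECT DISTINCT hq_city FROM Company, Product
    WHERE maker = name
      AND name IN (SELECT maker FROM Product WHERE factory_loc = 'US')
      AND name IN (SELECT maker FROM Product WHERE factory_loc = 'China')
"""
# sorted() because SQL does not promise a row order without ORDER BY.
intersect_result = sorted(conn.execute(intersect_sql).fetchall())
nested_result = sorted(conn.execute(nested_sql).fetchall())
```

On this data the INTERSECT query reports both Austin and Seattle, because it intersects cities rather than companies, while the nested query reports only Austin.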
High-level note on nested queries
• We can do nested queries because SQL is compositional:
• Everything (inputs / outputs) is represented as multisets: the output of one query can thus be used as the input to another (nesting)!
• This is extremely powerful!
Nested queries: Sub-queries Returning Relations
Company(name, city)
Product(name, maker)
Purchase(id, product, buyer)

Another example:
"Cities where one can find companies that manufacture products bought by Joe Blow"

SELECT c.city
FROM Company c
WHERE c.name IN (
    SELECT pr.maker
    FROM Purchase p, Product pr
    WHERE p.product = pr.name
      AND p.buyer = 'Joe Blow')
Nested Queries
Is this query equivalent?

SELECT c.city
FROM Company c, Product pr, Purchase p
WHERE c.name = pr.maker
  AND pr.name = p.product
  AND p.buyer = 'Joe Blow'

Beware of duplicates!
Nested Queries

SELECT DISTINCT c.city
FROM Company c, Product pr, Purchase p
WHERE c.name = pr.maker
  AND pr.name = p.product
  AND p.buyer = 'Joe Blow'

SELECT DISTINCT c.city
FROM Company c
WHERE c.name IN (
    SELECT pr.maker
    FROM Purchase p, Product pr
    WHERE p.product = pr.name
      AND p.buyer = 'Joe Blow')

Now they are equivalent
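The duplicate issue can be reproduced with Python's built-in sqlite3 module. The sketch below assumes SQLite and uses made-up rows: a hypothetical company 'Acme' in 'Palo Alto' from which Joe Blow bought two different products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company(name TEXT, city TEXT);
    CREATE TABLE Product(name TEXT, maker TEXT);
    CREATE TABLE Purchase(id INTEGER, product TEXT, buyer TEXT);
    -- Hypothetical data: Joe Blow buys two products from the same company.
    INSERT INTO Company VALUES ('Acme', 'Palo Alto');
    INSERT INTO Product VALUES ('Widget', 'Acme'), ('Gadget', 'Acme');
    INSERT INTO Purchase VALUES (1, 'Widget', 'Joe Blow'), (2, 'Gadget', 'Joe Blow');
""")

nested = conn.execute("""
    SELECT c.city FROM Company c
    WHERE c.name IN (SELECT pr.maker FROM Purchase p, Product pr
                     WHERE p.product = pr.name AND p.buyer = 'Joe Blow')
""").fetchall()

join_sql = """
    SELECT {distinct} c.city FROM Company c, Product pr, Purchase p
    WHERE c.name = pr.maker AND pr.name = p.product AND p.buyer = 'Joe Blow'
"""
flat = conn.execute(join_sql.format(distinct="")).fetchall()
flat_distinct = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()
```

The nested query tests each company once, so the city appears once; the flat join produces one row per matching purchase, so the city appears twice until DISTINCT is added.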
Subqueries Returning Relations
Product(name, price, category, maker)

You can also use operations of the form:
• s > ALL R
• s < ANY R
• EXISTS R

Ex: Find products that are more expensive than all those produced by "Gizmo-Works"

SELECT name
FROM Product
WHERE price > ALL(
    SELECT price
    FROM Product
    WHERE maker = 'Gizmo-Works')

ANY and ALL not supported by SQLite.
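Since SQLite lacks ALL, one possible workaround is to restate "more expensive than all" as "there is no Gizmo-Works product at least as expensive", using NOT EXISTS. The sketch below runs that rewrite through Python's sqlite3 module; the rewrite and the sample rows are assumptions for illustration (not from the slides), and it assumes no NULL prices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product(name TEXT, price REAL, category TEXT, maker TEXT);
    -- Hypothetical sample rows.
    INSERT INTO Product VALUES
        ('Gizmo', 19.99, 'Gadgets', 'Gizmo-Works'),
        ('Powergizmo', 29.99, 'Gadgets', 'Gizmo-Works'),
        ('SingleTouch', 149.99, 'Photography', 'Canon'),
        ('Cheapo', 9.99, 'Gadgets', 'Other');
""")

# price > ALL(subquery)  rewritten as  NOT EXISTS(a subquery row with price >= ours)
more_expensive = conn.execute("""
    SELECT p.name
    FROM Product p
    WHERE NOT EXISTS (
        SELECT 1
        FROM Product g
        WHERE g.maker = 'Gizmo-Works'
          AND g.price >= p.price)
""").fetchall()
```

Like `> ALL`, this form is true for every product when the subquery is empty.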
Subqueries Returning Relations
Product(name, price, category, maker)

Ex: Find 'copycat' products, i.e. products made by competitors with the same names as products made by "Gizmo-Works"

SELECT p1.name
FROM Product p1
WHERE p1.maker = 'Gizmo-Works'
  AND EXISTS(
    SELECT p2.name
    FROM Product p2
    WHERE p2.maker <> 'Gizmo-Works'
      AND p1.name = p2.name)

<> means !=
Nested queries as alternatives to INTERSECT and EXCEPT
INTERSECT and EXCEPT not in some DBMSs!

(SELECT R.A, R.B
 FROM R)
INTERSECT
(SELECT S.A, S.B
 FROM S)

can be written as:

SELECT R.A, R.B
FROM R
WHERE EXISTS(
    SELECT *
    FROM S
    WHERE R.A = S.A AND R.B = S.B)

If R, S have no duplicates, then can write without sub-queries (HOW?)

(SELECT R.A, R.B
 FROM R)
EXCEPT
(SELECT S.A, S.B
 FROM S)

can be written as:

SELECT R.A, R.B
FROM R
WHERE NOT EXISTS(
    SELECT *
    FROM S
    WHERE R.A = S.A AND R.B = S.B)
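The equivalences can be checked on a small duplicate-free example with Python's sqlite3 module. This is a sketch with made-up rows for R and S; the parentheses around each SELECT are dropped because SQLite does not accept them in compound queries. The plain join is one possible answer to the "HOW?" question, valid only because neither table has duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R(A INTEGER, B INTEGER);
    CREATE TABLE S(A INTEGER, B INTEGER);
    -- Hypothetical duplicate-free data.
    INSERT INTO R VALUES (1, 2), (3, 4), (5, 6);
    INSERT INTO S VALUES (3, 4), (7, 8);
""")


def rows(sql):
    """Run a query and return its rows sorted, since SQL promises no order."""
    return sorted(conn.execute(sql).fetchall())


match = "R.A = S.A AND R.B = S.B"
intersect = rows("SELECT R.A, R.B FROM R INTERSECT SELECT S.A, S.B FROM S")
via_exists = rows(f"SELECT R.A, R.B FROM R WHERE EXISTS (SELECT * FROM S WHERE {match})")
via_join = rows(f"SELECT R.A, R.B FROM R, S WHERE {match}")

except_rows = rows("SELECT R.A, R.B FROM R EXCEPT SELECT S.A, S.B FROM S")
via_not_exists = rows(f"SELECT R.A, R.B FROM R WHERE NOT EXISTS (SELECT * FROM S WHERE {match})")
```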
Correlated Queries
Movie(title, year, director, length)
Find movies whose title appears more than once.

SELECT DISTINCT title
FROM Movie AS m
WHERE year <> ANY(
    SELECT year
    FROM Movie
    WHERE title = m.title)

Note the scoping of the variables!
Note also: this can still be expressed as single SFW query…
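One way to express this as a single SELECT-FROM-WHERE query is a self-join of Movie on title with different years. The sketch below runs it through Python's sqlite3 module (which also sidesteps SQLite's lack of ANY); the self-join form and the placeholder rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie(title TEXT, year INTEGER, director TEXT, length INTEGER);
    -- Placeholder rows: one title released twice, one released once.
    INSERT INTO Movie VALUES
        ('King Kong', 1933, 'Director A', 100),
        ('King Kong', 2005, 'Director B', 187),
        ('Unique Film', 1999, 'Director C', 90);
""")

# Pair every movie with every other movie of the same title but a different year.
repeated_titles = conn.execute("""
    SELECT DISTINCT m1.title
    FROM Movie m1, Movie m2
    WHERE m1.title = m2.title
      AND m1.year <> m2.year
""").fetchall()
```

As in the correlated version, a title counts as repeated only if it appears with at least two different years.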
Complex Correlated Query
Product(name, price, category, maker, year)
Find products (and their manufacturers) that are more expensive than all products made by the same manufacturer before 1972

SELECT DISTINCT x.name, x.maker
FROM Product AS x
WHERE x.price > ALL(
    SELECT y.price
    FROM Product AS y
    WHERE x.maker = y.maker
      AND y.year < 1972)

Can be very powerful (also much harder to optimize)