SQL FOR BEGINNERS¹
FROM RELATIONAL ALGEBRA TO mysql
Annamari Soini, Åbo Akademi, Dept. of Information Technology
¹ To all my hardworking students of the DB-tutorial class of 2011
The course database:
Course_scheme = (code, coursename, cu, period)
Student_scheme = (studentnumber, name, UP, sex)
Staff_scheme = (name, title)
Who_does_what_scheme = (teacher, code)²
Course_registration_scheme = (studentnumber, code)
course
code   courseName          cu  period
A001   History of Art      8   1
A002   Renaissance Art     5   2
A003   Modern Art          5   3
E001   English Grammar     5   1
E002   English Literature  5   2
E003   Shakespeare 1       5   3
L001   Latin 1             5   1
L002   Latin 2             5   2
L003   Caesar              5   3
L004   Catullus            5   4
CS001  Introduction to CS  5   1
CS002  Programming 1       5   1
CS003  Programming 2       5   2
CS004  Databases           5   3
CS005  Data Structures     5   4

courseRegistration
studentNumber  code
SA01           A001
SA01           A002
SA02           A001
SA02           A002
SA02           A003
SA03           A003
SE01           E001
SE01           E002
SE01           A001
SE02           E003
SE03           A001
SE03           L003
SL01           L003
SL02           L004
SL03           L004
SL03           A001
SCS01          CS004
SCS02          CS004
SCS03          CS005
SCS04          CS005
SCS05          A003
SCS05          L004
² Allows several teachers per course, and several courses per teacher.
student
studentNumber  name              UP       sex
SA01           Mirella Conti     Arts     F
SA02           Fleur D'Amour     Arts     F
SA03           Belle Visiteur    Arts     F
SA04           Carlo Straniero   Arts     M
SE01           Will Smith        English  M
SE02           Angela Bowie      English  F
SE03           Sharon Manners    English  F
SE04           Jack Harper       English  M
SL01           Lucius Valerius   Latin    M
SL02           Antonius Primus   Latin    M
SL03           Fulvia Morbida    Latin    F
SL04           Septimus Romanus  Latin    M
SCS01          Betty Buffer      CS       F
SCS02          Bill Bitwise      CS       M
SCS03          Lily Float        CS       F
SCS04          Judy Python       CS       F
SCS05          Percy Pascal      CS       M

staff
name        title
Caravaggio  Dr.
Picasso     M. A.
Austen      Dr.
Eliot       Dr.
Nero        M. A.
Turing      Dr.
Church      M. Sc.

whoDoesWhat
teacher     code
Caravaggio  A001
Caravaggio  A002
Picasso     A003
Austen      E001
Austen      E002
Eliot       E003
Nero        L001
Nero        L002
Nero        L003
Nero        L004
Church      CS001
Church      CS002
Church      CS003
Turing      CS004
Turing      CS005
Creating and populating the database
First, download the file courseDB.sql from the tutorials web page!
Then you must log on to our database server babbage.cs.abo.fi and start mysql:
• mysql -u your_userID -p
• (now give your mysql password)
• create database your_userID_courses;
• use your_userID_courses;
• source courseDB.sql;
DROP TABLE if exists whoDoesWhat; /* if such a table exists, it is dropped */
DROP TABLE if exists courseRegistration;
DROP TABLE if exists staff;
DROP TABLE if exists student;
DROP TABLE if exists course;
CREATE TABLE course                         /* create the tables */
  (code char(6),
   courseName varchar(30) not null,         /* name the fields and give their */
   cu int not null,                         /* datatypes */
   period int,
   primary key (code))                      /* give the primary key */
  engine = innodb;                          /* chooses the storage “engine” used for the table */
                                            /* innodb checks even the foreign key constraints */
                                            /* MyISAM, another alternative, does not. */

CREATE TABLE student
  (studentNumber char(6),
   name varchar(30) not null,
   UP char(10),
   sex char(1),
   primary key (studentNumber),
   check (sex in ('F', 'f', 'M', 'm')))     /* check integrity constraints */
                                            /* (MySQL before 8.0.16 parses check but ignores it) */
  engine = innodb;

CREATE TABLE staff
  (name varchar(30),
   title varchar(10),
   primary key (name))
  engine = innodb;

CREATE TABLE whoDoesWhat
  (teacher varchar(30),
   code char(6),
   primary key (teacher, code),
   foreign key (code) references course(code),     /* foreign keys defined; */
   foreign key (teacher) references staff(name))   /* the tables named here */
  engine = innodb;                                 /* must precede these */

CREATE TABLE courseRegistration
  (studentNumber char(6),
   code char(6),
   primary key (studentNumber, code),
   foreign key (studentNumber) references student(studentNumber),
   foreign key (code) references course(code))
  engine = innodb;
INSERT INTO course VALUES('A001', 'History of Art', 8, 1);        /* fill the tables */
INSERT INTO course VALUES('A002', 'Renaissance Art', 5, 2);
INSERT INTO course VALUES('A003', 'Modern Art', 5, 3);
INSERT INTO course VALUES('E001', 'English Grammar', 5, 1);
INSERT INTO course VALUES('E002', 'English Literature', 5, 2);
INSERT INTO course VALUES('E003', 'Shakespeare 1', 5, 3);
INSERT INTO course VALUES('L001', 'Latin 1', 5, 1);
INSERT INTO course VALUES('L002', 'Latin 2', 5, 2);
INSERT INTO course VALUES('L003', 'Caesar', 5, 3);
INSERT INTO course VALUES('L004', 'Catullus', 5, 4);
INSERT INTO course VALUES('CS001', 'Introduction to CS', 5, 1);
INSERT INTO course VALUES('CS002', 'Programming 1', 5, 1);
INSERT INTO course VALUES('CS003', 'Programming 2', 5, 2);
INSERT INTO course VALUES('CS004', 'Databases', 5, 3);
INSERT INTO course VALUES('CS005', 'Data Structures', 5, 4);

INSERT INTO student VALUES('SA01', 'Mirella Conti', 'Arts', 'F');
INSERT INTO student VALUES('SA02', 'Fleur DAmour', 'Arts', 'F');   /* the apostrophe is left out of Fleur's surname; */
INSERT INTO student VALUES('SA03', 'Belle Visiteur', 'Arts', 'F'); /* an unescaped ' would end the string */
INSERT INTO student VALUES('SA04', 'Carlo Straniero', 'Arts', 'M');
INSERT INTO student VALUES('SE01', 'Will Smith', 'English', 'M');
INSERT INTO student VALUES('SE02', 'Angela Bowie', 'English', 'F');
INSERT INTO student VALUES('SE03', 'Sharon Manners', 'English', 'F');
INSERT INTO student VALUES('SE04', 'Jack Harper', 'English', 'M');
INSERT INTO student VALUES('SL01', 'Lucius Valerius', 'Latin', 'M');
INSERT INTO student VALUES('SL02', 'Antonius Primus', 'Latin', 'M');
INSERT INTO student VALUES('SL03', 'Fulvia Morbida', 'Latin', 'F');
INSERT INTO student VALUES('SL04', 'Septimus Romanus', 'Latin', 'M');
INSERT INTO student VALUES('SCS01', 'Betty Buffer', 'CS', 'F');
INSERT INTO student VALUES('SCS02', 'Bill Bitwise', 'CS', 'M');
INSERT INTO student VALUES('SCS03', 'Lily Float', 'CS', 'F');
INSERT INTO student VALUES('SCS04', 'Judy Python', 'CS', 'F');
INSERT INTO student VALUES('SCS05', 'Percy Pascal', 'CS', 'M');

INSERT INTO staff VALUES('Caravaggio', 'Dr.');
INSERT INTO staff VALUES('Picasso', 'M. A.');
INSERT INTO staff VALUES('Austen', 'Dr.');
INSERT INTO staff VALUES('Eliot', 'Dr.');
INSERT INTO staff VALUES('Nero', 'M. A.');
INSERT INTO staff VALUES('Turing', 'Dr.');
INSERT INTO staff VALUES('Church', 'M. Sc.');

INSERT INTO whoDoesWhat VALUES('Caravaggio', 'A001');
INSERT INTO whoDoesWhat VALUES('Caravaggio', 'A002');
INSERT INTO whoDoesWhat VALUES('Picasso', 'A003');
INSERT INTO whoDoesWhat VALUES('Austen', 'E001');
INSERT INTO whoDoesWhat VALUES('Austen', 'E002');
INSERT INTO whoDoesWhat VALUES('Eliot', 'E003');
INSERT INTO whoDoesWhat VALUES('Nero', 'L001');
INSERT INTO whoDoesWhat VALUES('Nero', 'L002');
INSERT INTO whoDoesWhat VALUES('Nero', 'L003');
INSERT INTO whoDoesWhat VALUES('Nero', 'L004');
INSERT INTO whoDoesWhat VALUES('Church', 'CS001');
INSERT INTO whoDoesWhat VALUES('Church', 'CS002');
INSERT INTO whoDoesWhat VALUES('Church', 'CS003');
INSERT INTO whoDoesWhat VALUES('Turing', 'CS004');
INSERT INTO whoDoesWhat VALUES('Turing', 'CS005');

INSERT INTO courseRegistration VALUES('SA01', 'A001');
INSERT INTO courseRegistration VALUES('SA01', 'A002');
INSERT INTO courseRegistration VALUES('SA02', 'A001');
INSERT INTO courseRegistration VALUES('SA02', 'A002');
INSERT INTO courseRegistration VALUES('SA02', 'A003');
INSERT INTO courseRegistration VALUES('SA03', 'A003');
INSERT INTO courseRegistration VALUES('SE01', 'E001');
INSERT INTO courseRegistration VALUES('SE01', 'E002');
INSERT INTO courseRegistration VALUES('SE01', 'A001');
INSERT INTO courseRegistration VALUES('SE02', 'E003');
INSERT INTO courseRegistration VALUES('SE03', 'A001');
INSERT INTO courseRegistration VALUES('SE03', 'L003');
INSERT INTO courseRegistration VALUES('SL01', 'L003');
INSERT INTO courseRegistration VALUES('SL02', 'L004');
INSERT INTO courseRegistration VALUES('SL03', 'L004');
INSERT INTO courseRegistration VALUES('SL03', 'A001');
INSERT INTO courseRegistration VALUES('SCS01', 'CS004');
INSERT INTO courseRegistration VALUES('SCS02', 'CS004');
INSERT INTO courseRegistration VALUES('SCS03', 'CS005');
INSERT INTO courseRegistration VALUES('SCS04', 'CS005');
INSERT INTO courseRegistration VALUES('SCS05', 'A003');
INSERT INTO courseRegistration VALUES('SCS05', 'L004');
The basic structure of SQL queries is
SELECT
FROM
WHERE
• The first of these, SELECT, chooses columns (attributes). It corresponds to projection, π, in relational algebra.
• The second of these, FROM, tells you which tables to go to for information. It corresponds to the relations in parentheses () in relational algebra.
• The third of these, WHERE, chooses rows. It corresponds to selection, σ, in relational algebra.
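The three clauses can be combined in one query. As a small sketch against the course table defined above, the relational algebra expression π_courseName(σ_{period = 3}(course)) could be written like this:

```sql
select courseName     /* SELECT: the column(s) to show (projection) */
from course           /* FROM: the table(s) to read */
where period = 3;     /* WHERE: the rows to keep (selection) */
```

With the course data above, this should list the names of the four courses given in period 3.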
π_courseName(course) would be written in SQL as follows:
mysql> select courseName from course;
+--------------------+
| courseName         |
+--------------------+
| History of Art     |
| Renaissance Art    |
| Modern Art         |
| Introduction to CS |
| Programming 1      |
| Programming 2      |
| Databases          |
| Data Structures    |
| English Grammar    |
| English Literature |
| Shakespeare 1      |
| Latin 1            |
| Latin 2            |
| Caesar             |
| Catullus           |
+--------------------+
15 rows in set (0.00 sec)
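As you can see, mysql returns the rows in the order of the primary key code (which is not shown in the result of this query). This order is a side effect of how the table is stored, not a guarantee; if the order matters, ask for it with an order by clause.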
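The order seen above is not something the query itself asks for. A minimal sketch of requesting an order explicitly with order by, using the same course table:

```sql
/* alphabetically by course name */
select courseName from course order by courseName;

/* by period first, and by code within each period */
select code, courseName, period from course order by period, code;
```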
Example: “Which courses are given in period 3?”
σ_{period = 3}(course) is translated to SQL as follows:
mysql> select * from course where period = 3;
+-------+---------------+----+--------+
| code  | courseName    | cu | period |
+-------+---------------+----+--------+
| A003  | Modern Art    |  5 |      3 |
| CS004 | Databases     |  5 |      3 |
| E003  | Shakespeare 1 |  5 |      3 |
| L003  | Caesar        |  5 |      3 |
+-------+---------------+----+--------+
4 rows in set (0.00 sec)
Sometimes we wish to create a temporary table for our results.
temp ← σ_{period = 4}(course) would be translated to SQL as follows:
mysql> create temporary table temp select * from course where period = 4;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0
Now you can see the contents of this table (temp exists only as long as your current session) by doing a select:
mysql> select * from temp;
+-------+-----------------+----+--------+
| code  | courseName      | cu | period |
+-------+-----------------+----+--------+
| CS005 | Data Structures |  5 |      4 |
| L004  | Catullus        |  5 |      4 |
+-------+-----------------+----+--------+
2 rows in set (0.00 sec)
select * chooses the whole row
Example: “What are the courses given by Turing?”
• First pick the table that has that information (whoDoesWhat). That goes into the FROM clause.
• Then choose the rows where teacher = 'Turing'. That is done in the WHERE clause.
• Finally SELECT the column asked for (code).
mysql> select code from whoDoesWhat where teacher = 'Turing';
+-------+
| code  |
+-------+
| CS004 |
| CS005 |
+-------+
2 rows in set (0.02 sec)
Next example: “What are these courses called?”
• The query above gives us only the codes.
• Which table has the information required (the names of the courses)?
• Combine course and the query above:
• The cartesian product, ×, is represented by a simple comma ',':
• (2 × 15 rows, and most of these will be nonsense)
• Select only the sensible rows (where course.code = whoDoesWhat.code):
mysql> select courseName from whoDoesWhat, course where teacher = 'Turing' and course.code = whoDoesWhat.code;
+-----------------+
| courseName      |
+-----------------+
| Databases       |
| Data Structures |
+-----------------+
2 rows in set (0.00 sec)
Instead of ',' you can write join.
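As a sketch of what that looks like: when you write join, the matching condition can also be moved out of the WHERE clause into an on clause, which keeps "how the tables fit together" apart from "which rows I want". The query below is meant to be equivalent to the comma version above:

```sql
select courseName
from whoDoesWhat join course
     on course.code = whoDoesWhat.code   /* how the two tables are paired */
where teacher = 'Turing';                /* which rows we are interested in */
```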
You can replace the cartesian product whoDoesWhat × course with a natural join (⋈): this will automatically choose only the “sensible” rows (combinations) from this cross product:
mysql> select courseName from whoDoesWhat natural join course where teacher = 'Turing';
+-----------------+
| courseName      |
+-----------------+
| Databases       |
| Data Structures |
+-----------------+
2 rows in set (0.01 sec)
Example: “Find the names of the students who take A001”
mysql> select name from student natural join courseRegistration where code = 'A001';
+----------------+
| name           |
+----------------+
| Mirella Conti  |
| Fleur DAmour   |
| Will Smith     |
| Sharon Manners |
| Fulvia Morbida |
+----------------+
5 rows in set (0.01 sec)
If you wish to use the cartesian product (join), you will have to write:
mysql> select name from student, courseRegistration where student.studentNumber = courseRegistration.studentNumber and code = 'A001';
So using natural join gives you less work.
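One thing to keep in mind: natural join pairs rows on every column name the two tables share, whether you meant it or not. In this database student and staff both have a column called name, so student natural join staff would compare student names with staff names. A middle way, sketched below, is join ... using, where you name the shared column yourself:

```sql
/* same result as the natural join above, but the join column is stated explicitly */
select name
from student join courseRegistration using (studentNumber)
where code = 'A001';
```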
Three more joins (ways to combine two tables):
• for left outer join, ⟕, use natural left join in mysql
• example: student ⟕ courseRegistration:
mysql> select * from student natural left join courseRegistration;
+---------------+------------------+---------+------+-------+
| studentNumber | name             | UP      | sex  | code  |
+---------------+------------------+---------+------+-------+
| SA01          | Mirella Conti    | Arts    | F    | A001  |
| SA01          | Mirella Conti    | Arts    | F    | A002  |
| SA02          | Fleur DAmour     | Arts    | F    | A001  |
| SA02          | Fleur DAmour     | Arts    | F    | A002  |
| SA02          | Fleur DAmour     | Arts    | F    | A003  |
| SA03          | Belle Visiteur   | Arts    | F    | A003  |
| SA04          | Carlo Straniero  | Arts    | M    | NULL  |
| SCS01         | Betty Buffer     | CS      | F    | CS004 |
| SCS02         | Bill Bitwise     | CS      | M    | CS004 |
| SCS03         | Lily Float       | CS      | F    | CS005 |
| SCS04         | Judy Python      | CS      | F    | CS005 |
| SCS05         | Percy Pascal     | CS      | M    | A003  |
| SCS05         | Percy Pascal     | CS      | M    | L004  |
| SE01          | Will Smith       | English | M    | A001  |
| SE01          | Will Smith       | English | M    | E001  |
| SE01          | Will Smith       | English | M    | E002  |
| SE02          | Angela Bowie     | English | F    | E003  |
| SE03          | Sharon Manners   | English | F    | A001  |
| SE03          | Sharon Manners   | English | F    | L003  |
| SE04          | Jack Harper      | English | M    | NULL  |
| SL01          | Lucius Valerius  | Latin   | M    | L003  |
| SL02          | Antonius Primus  | Latin   | M    | L004  |
| SL03          | Fulvia Morbida   | Latin   | F    | A001  |
| SL03          | Fulvia Morbida   | Latin   | F    | L004  |
| SL04          | Septimus Romanus | Latin   | M    | NULL  |
+---------------+------------------+---------+------+-------+
25 rows in set (0.00 sec)
Here you can see that the students who are not registered on any courses are still included:
• first the natural join is made
• then the students who did not get coupled with any row from courseRegistration are added, with NULL for code.
• Finally, mysql has sorted the result, using the primary key studentNumber.
• for right outer join, ⟖, use natural right join in mysql
• example: courseRegistration ⟖ course:
mysql> select * from courseRegistration natural right join course;
+-------+--------------------+----+--------+---------------+
| code  | courseName         | cu | period | studentNumber |
+-------+--------------------+----+--------+---------------+
| A001  | History of Art     |  8 |      1 | SA01          |
| A001  | History of Art     |  8 |      1 | SA02          |
| A001  | History of Art     |  8 |      1 | SE01          |
| A001  | History of Art     |  8 |      1 | SE03          |
| A001  | History of Art     |  8 |      1 | SL03          |
| A002  | Renaissance Art    |  5 |      2 | SA01          |
| A002  | Renaissance Art    |  5 |      2 | SA02          |
| A003  | Modern Art         |  5 |      3 | SA02          |
| A003  | Modern Art         |  5 |      3 | SA03          |
| A003  | Modern Art         |  5 |      3 | SCS05         |
| CS001 | Introduction to CS |  5 |      1 | NULL          |
| CS002 | Programming 1      |  5 |      1 | NULL          |
| CS003 | Programming 2      |  5 |      2 | NULL          |
| CS004 | Databases          |  5 |      3 | SCS01         |
| CS004 | Databases          |  5 |      3 | SCS02         |
| CS005 | Data Structures    |  5 |      4 | SCS03         |
| CS005 | Data Structures    |  5 |      4 | SCS04         |
| E001  | English Grammar    |  5 |      1 | SE01          |
| E002  | English Literature |  5 |      2 | SE01          |
| E003  | Shakespeare 1      |  5 |      3 | SE02          |
| L001  | Latin 1            |  5 |      1 | NULL          |
| L002  | Latin 2            |  5 |      2 | NULL          |
| L003  | Caesar             |  5 |      3 | SE03          |
| L003  | Caesar             |  5 |      3 | SL01          |
| L004  | Catullus           |  5 |      4 | SCS05         |
| L004  | Catullus           |  5 |      4 | SL02          |
| L004  | Catullus           |  5 |      4 | SL03          |
+-------+--------------------+----+--------+---------------+
27 rows in set (0.00 sec)
• First natural join
• Then fill in information from the right side table! (In this case, courses that have no registered students. These are marked in red in the original slides; here they are the rows with NULL for studentNumber.)
• Finally mysql has sorted the table using the primary key code.
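The NULLs that an outer join fills in are useful in their own right. As a sketch, keeping only the rows where studentNumber is NULL answers the question “Which courses have no registered students?”:

```sql
select code, courseName
from courseRegistration natural right join course
where studentNumber is null;   /* note: "is null", not "= NULL" */
```

With the data above this should pick out the five courses shown with NULL in the result: CS001, CS002, CS003, L001 and L002.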
Full outer join is not implemented in mysql, but we can implement it in parts.
To exemplify this join, we define a table scholarship, with information about which scholarships there are, and which students have received them (we assume that there may only be one student who gets the scholarship, and some scholarships have not yet been awarded to anyone):
Scholarship_scheme = (scholarshipname, studentnumber)

scholarship
scholarshipname    studentnumber
Top Latin          SL03
Logic Excellence   SCS04
Arts Achievement   NULL
Creative Writer    SE01
Top Athlete        NULL
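The tutorial does not show how scholarship was created. The following is one possible definition in the style of courseDB.sql; the datatypes, the choice of scholarshipName as primary key and the foreign key are assumptions made for this sketch, not taken from the original file:

```sql
DROP TABLE if exists scholarship;

CREATE TABLE scholarship
  (scholarshipName varchar(30),
   studentNumber char(6),                  /* NULL = not yet awarded to anyone */
   primary key (scholarshipName),          /* assumed: one row per scholarship */
   foreign key (studentNumber) references student(studentNumber))
  engine = innodb;

INSERT INTO scholarship VALUES('Top Latin', 'SL03');
INSERT INTO scholarship VALUES('Logic Excellence', 'SCS04');
INSERT INTO scholarship VALUES('Arts Achievement', NULL);
INSERT INTO scholarship VALUES('Creative Writer', 'SE01');
INSERT INTO scholarship VALUES('Top Athlete', NULL);
```

Note that NULL is written without quotes; 'NULL' in quotes would be an ordinary string.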
mysql> select * from scholarship;
+------------------+---------------+
| scholarshipName  | studentNumber |
+------------------+---------------+
| Arts Achievement | NULL          |
| Top Athlete      | NULL          |
| Logic Excellence | SCS04         |
| Creative Writer  | SE01          |
| Top Latin        | SL03          |
+------------------+---------------+
5 rows in set (0.00 sec)
To get the full outer join: first make the natural join scholarship ⋈ student:
mysql> select * from scholarship natural join student;
+---------------+------------------+----------------+---------+------+
| studentNumber | scholarshipName  | name           | UP      | sex  |
+---------------+------------------+----------------+---------+------+
| SCS04         | Logic Excellence | Judy Python    | CS      | F    |
| SE01          | Creative Writer  | Will Smith     | English | M    |
| SL03          | Top Latin        | Fulvia Morbida | Latin   | F    |
+---------------+------------------+----------------+---------+------+
3 rows in set (0.01 sec)
Mysql has sorted the table using the studentNumber, even though that is not the primary key.
As you can see, here we only get those rows from scholarship that are matched by a row from student.
Then fill in information from the left side table (scholarship), namely those rows which did not find a pair in the natural join (that is, scholarships that are not awarded to any student as yet):
mysql> select * from scholarship natural left join student;
+---------------+------------------+----------------+---------+------+
| studentNumber | scholarshipName  | name           | UP      | sex  |
+---------------+------------------+----------------+---------+------+
| NULL          | Arts Achievement | NULL           | NULL    | NULL |
| NULL          | Top Athlete      | NULL           | NULL    | NULL |
| SCS04         | Logic Excellence | Judy Python    | CS      | F    |
| SE01          | Creative Writer  | Will Smith     | English | M    |
| SL03          | Top Latin        | Fulvia Morbida | Latin   | F    |
+---------------+------------------+----------------+---------+------+
5 rows in set (0.00 sec)
As you can see, here we get all the rows from the original table scholarship back, with the information from student filled in where possible.
Finally, fill in with information from the right side table (those students who did not get a scholarship). This is the tricky part; we cannot simply write something like
mysql> select * from scholarship natural left join student as S1 natural right join student as S2;
Note: when you open two copies of the same table, they must be given new names (“aliases”) so that you can tell them apart. Unfortunately, a temporary table may not be opened twice in the same query, even with aliases. Should you need this, create two temporary tables with the same contents but with different names.
The query is syntactically correct, but the second natural join again eliminates those scholarships for which no students can be found:
+---------------+------------------+---------+------+------------------+
| studentNumber | name             | UP      | sex  | scholarshipName  |
+---------------+------------------+---------+------+------------------+
| SA01          | Mirella Conti    | Arts    | F    | NULL             |
| SA02          | Fleur DAmour     | Arts    | F    | NULL             |
| SA03          | Belle Visiteur   | Arts    | F    | NULL             |
| SA04          | Carlo Straniero  | Arts    | M    | NULL             |
| SCS01         | Betty Buffer     | CS      | F    | NULL             |
| SCS02         | Bill Bitwise     | CS      | M    | NULL             |
| SCS03         | Lily Float       | CS      | F    | NULL             |
| SCS04         | Judy Python      | CS      | F    | Logic Excellence |
| SCS05         | Percy Pascal     | CS      | M    | NULL             |
| SE01          | Will Smith       | English | M    | Creative Writer  |
| SE02          | Angela Bowie     | English | F    | NULL             |
| SE03          | Sharon Manners   | English | F    | NULL             |
| SE04          | Jack Harper      | English | M    | NULL             |
| SL01          | Lucius Valerius  | Latin   | M    | NULL             |
| SL02          | Antonius Primus  | Latin   | M    | NULL             |
| SL03          | Fulvia Morbida   | Latin   | F    | Top Latin        |
| SL04          | Septimus Romanus | Latin   | M    | NULL             |
+---------------+------------------+---------+------+------------------+
17 rows in set (0.00 sec)
Another thought is to create two different selections, one using left join, one using right join, and making a union of the two (you will learn more about union later in this tutorial). The problem here is that both the left and the right join put the “left” or the “right” table first (to the left), so the columns will be ordered differently in these two selections; compare the order in
mysql> select * from scholar;   /* a temporary table for scholarships with students who received them */
+---------------+------------------+----------------+---------+------+
| studentNumber | scholarshipName  | name           | UP      | sex  |
+---------------+------------------+----------------+---------+------+
| NULL          | Arts Achievement | NULL           | NULL    | NULL |
| NULL          | Top Athlete      | NULL           | NULL    | NULL |
| SCS04         | Logic Excellence | Judy Python    | CS      | F    |
| SE01          | Creative Writer  | Will Smith     | English | M    |
| SL03          | Top Latin        | Fulvia Morbida | Latin   | F    |
+---------------+------------------+----------------+---------+------+
5 rows in set (0.00 sec)
and
mysql> select * from shipSOS;   /* a temporary table for students with/out a scholarship */
+---------------+------------------+---------+------+------------------+
| studentNumber | name             | UP      | sex  | scholarshipName  |
+---------------+------------------+---------+------+------------------+
| SA01          | Mirella Conti    | Arts    | F    | NULL             |
| SA02          | Fleur DAmour     | Arts    | F    | NULL             |
| SA03          | Belle Visiteur   | Arts    | F    | NULL             |
| SA04          | Carlo Straniero  | Arts    | M    | NULL             |
| SCS01         | Betty Buffer     | CS      | F    | NULL             |
| SCS02         | Bill Bitwise     | CS      | M    | NULL             |
| SCS03         | Lily Float       | CS      | F    | NULL             |
| SCS04         | Judy Python      | CS      | F    | Logic Excellence |
| SCS05         | Percy Pascal     | CS      | M    | NULL             |
| SE01          | Will Smith       | English | M    | Creative Writer  |
| SE02          | Angela Bowie     | English | F    | NULL             |
| SE03          | Sharon Manners   | English | F    | NULL             |
| SE04          | Jack Harper      | English | M    | NULL             |
| SL01          | Lucius Valerius  | Latin   | M    | NULL             |
| SL02          | Antonius Primus  | Latin   | M    | NULL             |
| SL03          | Fulvia Morbida   | Latin   | F    | Top Latin        |
| SL04          | Septimus Romanus | Latin   | M    | NULL             |
+---------------+------------------+---------+------+------------------+
17 rows in set (0.00 sec)
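The statements that created scholar and shipSOS are not shown in the tutorial. Judging from their contents, they were presumably made from the left and the right join respectively; the following sketch is a reconstruction under that assumption:

```sql
/* scholarships, with the students who received them (left join) */
create temporary table scholar
  select * from scholarship natural left join student;

/* students, with or without a scholarship (right join) */
create temporary table shipSOS
  select * from scholarship natural right join student;
```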
Ideally, union should refuse to combine these tables, because their elements do not have the same structure. However, mysql union does not check this but makes a horrible mixture of them both:
mysql> select * from scholar union select * from shipSOS;
+---------------+------------------+----------------+---------+------------------+
| studentNumber | scholarshipName  | name           | UP      | sex              |
+---------------+------------------+----------------+---------+------------------+
| NULL          | Arts Achievement | NULL           | NULL    | NULL             |
| NULL          | Top Athlete      | NULL           | NULL    | NULL             |
| SCS04         | Logic Excellence | Judy Python    | CS      | F                |
| SE01          | Creative Writer  | Will Smith     | English | M                |
| SL03          | Top Latin        | Fulvia Morbida | Latin   | F                |
| SA01          | Mirella Conti    | Arts           | F       | NULL             |
| SA02          | Fleur DAmour     | Arts           | F       | NULL             |
| SA03          | Belle Visiteur   | Arts           | F       | NULL             |
| SA04          | Carlo Straniero  | Arts           | M       | NULL             |
| SCS01         | Betty Buffer     | CS             | F       | NULL             |
| SCS02         | Bill Bitwise     | CS             | M       | NULL             |
| SCS03         | Lily Float       | CS             | F       | NULL             |
| SCS04         | Judy Python      | CS             | F       | Logic Excellence |
| SCS05         | Percy Pascal     | CS             | M       | NULL             |
| SE01          | Will Smith       | English        | M       | Creative Writer  |
| SE02          | Angela Bowie     | English        | F       | NULL             |
| SE03          | Sharon Manners   | English        | F       | NULL             |
| SE04          | Jack Harper      | English        | M       | NULL             |
| SL01          | Lucius Valerius  | Latin          | M       | NULL             |
| SL02          | Antonius Primus  | Latin          | M       | NULL             |
| SL03          | Fulvia Morbida   | Latin          | F       | Top Latin        |
| SL04          | Septimus Romanus | Latin          | M       | NULL             |
+---------------+------------------+----------------+---------+------------------+
22 rows in set (0.00 sec)
(While 'Septimus Romanus' might be quite a nice name for a scholarship, no such scholarship is awarded at my university. Neither have we students called 'Latin' or 'CS', with UP:s like 'F' or 'M' (fe/male studies?), and most definitely we have no students with sex = NULL or 'Top Latin'!)
There is a way around this problem, however. The problem here is that the attributes (the columns) got in a wrong order in the left/right join, and the union just supposed that the elements of the second set would have the same structure as those in the first set. We cannot affect the way right join arranges the elements (starting from the right side table and putting its columns as the left side columns of the result) BUT we can choose in which order we want to have the columns by using select:
mysql> create temporary table ship select studentNumber, scholarshipName, name, UP, sex from scholarship natural right join student;
Query OK, 17 rows affected (0.00 sec)
Records: 17  Duplicates: 0  Warnings: 0
mysql> select * from ship;
+---------------+------------------+------------------+---------+------+
| studentNumber | scholarshipName  | name             | UP      | sex  |
+---------------+------------------+------------------+---------+------+
| SA01          | NULL             | Mirella Conti    | Arts    | F    |
| SA02          | NULL             | Fleur DAmour     | Arts    | F    |
| SA03          | NULL             | Belle Visiteur   | Arts    | F    |
| SA04          | NULL             | Carlo Straniero  | Arts    | M    |
| SCS01         | NULL             | Betty Buffer     | CS      | F    |
| SCS02         | NULL             | Bill Bitwise     | CS      | M    |
| SCS03         | NULL             | Lily Float       | CS      | F    |
| SCS04         | Logic Excellence | Judy Python      | CS      | F    |
| SCS05         | NULL             | Percy Pascal     | CS      | M    |
| SE01          | Creative Writer  | Will Smith       | English | M    |
| SE02          | NULL             | Angela Bowie     | English | F    |
| SE03          | NULL             | Sharon Manners   | English | F    |
| SE04          | NULL             | Jack Harper      | English | M    |
| SL01          | NULL             | Lucius Valerius  | Latin   | M    |
| SL02          | NULL             | Antonius Primus  | Latin   | M    |
| SL03          | Top Latin        | Fulvia Morbida   | Latin   | F    |
| SL04          | NULL             | Septimus Romanus | Latin   | M    |
+---------------+------------------+------------------+---------+------+
17 rows in set (0.00 sec)
The temporary table ship now has the result of the right join in the order we need it to be for the union:
mysql> select * from scholar union select * from ship;
+---------------+------------------+------------------+---------+------+
| studentNumber | scholarshipName  | name             | UP      | sex  |
+---------------+------------------+------------------+---------+------+
| NULL          | Arts Achievement | NULL             | NULL    | NULL |
| NULL          | Top Athlete      | NULL             | NULL    | NULL |
| SCS04         | Logic Excellence | Judy Python      | CS      | F    |
| SE01          | Creative Writer  | Will Smith       | English | M    |
| SL03          | Top Latin        | Fulvia Morbida   | Latin   | F    |
| SA01          | NULL             | Mirella Conti    | Arts    | F    |
| SA02          | NULL             | Fleur DAmour     | Arts    | F    |
| SA03          | NULL             | Belle Visiteur   | Arts    | F    |
| SA04          | NULL             | Carlo Straniero  | Arts    | M    |
| SCS01         | NULL             | Betty Buffer     | CS      | F    |
| SCS02         | NULL             | Bill Bitwise     | CS      | M    |
| SCS03         | NULL             | Lily Float       | CS      | F    |
| SCS05         | NULL             | Percy Pascal     | CS      | M    |
| SE02          | NULL             | Angela Bowie     | English | F    |
| SE03          | NULL             | Sharon Manners   | English | F    |
| SE04          | NULL             | Jack Harper      | English | M    |
| SL01          | NULL             | Lucius Valerius  | Latin   | M    |
| SL02          | NULL             | Antonius Primus  | Latin   | M    |
| SL04          | NULL             | Septimus Romanus | Latin   | M    |
+---------------+------------------+------------------+---------+------+
19 rows in set (0.00 sec)
Note that you can only make a union between two selections; e.g.
scholar union ship
would give an error.
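Since union works between selections, the temporary tables are a convenience rather than a necessity. As a sketch, the whole full outer join can be written as one statement, provided both selections list the columns in the same order:

```sql
select studentNumber, scholarshipName, name, UP, sex
  from scholarship natural left join student     /* all scholarships */
union
select studentNumber, scholarshipName, name, UP, sex
  from scholarship natural right join student;   /* all students */
```

The three rows that appear in both selections (the awarded scholarships) are listed only once, because union removes duplicates.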
Aggregate functions:
• Work on a collection of rows (“aggregates”) to produce one result (a set with just one value as a member).
• Mysql has a number of predefined operations for these, see e.g. http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html
• We must select the result that we wish to see.
• Example: “How many courses are given in period 1?”
• Here we go to the table course and choose the rows where period equals 1. Then we count how many these are:
mysql> select count(*) from course where period = 1;
+----------+
| count(*) |
+----------+
|        5 |
+----------+
1 row in set (0.00 sec)
• “How many cu:s does one get if one takes all the courses that are given in period 1?”
mysql> select sum(cu) from course where period = 1;
+---------+
| sum(cu) |
+---------+
|      28 |
+---------+
1 row in set (0.01 sec)
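count(*) counts how many rows there are in the result. Here we could also use count(code) or count(period); all of these give the same result, because neither column is NULL in the chosen rows (count(column) skips the rows where that column is NULL). The result is a set with exactly one member: the result of the function.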
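The difference between count(*) and count(column) shows up as soon as a column contains NULLs. A small sketch using the scholarship table from the outer join examples, where two of the five scholarships have NULL as studentNumber:

```sql
select count(*)             as scholarships,   /* counts every row: 5 */
       count(studentNumber) as awarded         /* skips the NULLs:  3 */
from scholarship;
```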
With aggregate functions one sometimes uses grouping:
• first you form the groups
• then the operation requested is performed in each group separately:
• Example: “How many courses are given in each period?”
◦ First, form the groups, using period as the criterion:
◦ Then, count the rows in each group:
mysql> select period, count(*) from course group by period;
+--------+----------+
| period | count(*) |
+--------+----------+
|      1 |        5 |
|      2 |        4 |
|      3 |        4 |
|      4 |        2 |
+--------+----------+
4 rows in set (0.00 sec)
• it is easy to give a nicer name for the result column using as:
mysql> select period, count(*) as number_of_courses from course group by period;
+--------+-------------------+
| period | number_of_courses |
+--------+-------------------+
|      1 |                 5 |
|      2 |                 4 |
|      3 |                 4 |
|      4 |                 2 |
+--------+-------------------+
4 rows in set (0.00 sec)
If you want to see the grouping criterion (here period) in the result, you must select it.
• “How many points will each student get if they pass all the courses they are registered on?”
◦ First form the groups: one group for each student, containing the courses this student is registered on, then add the cu columns:
mysql> select studentNumber, sum(cu) from course natural join courseRegistration group by studentNumber;
+---------------+---------+
| studentNumber | sum(cu) |
+---------------+---------+
| SA01          |      13 |
| SA02          |      18 |
| SA03          |       5 |
| SCS01         |       5 |
| SCS02         |       5 |
| SCS03         |       5 |
| SCS04         |       5 |
| SCS05         |      10 |
| SE01          |      18 |
| SE02          |       5 |
| SE03          |      13 |
| SL01          |       5 |
| SL02          |       5 |
| SL03          |      13 |
+---------------+---------+
14 rows in set (0.00 sec)
Don't forget to select the student number, otherwise it will not be shown in the result!

Set operations
• union ∪
• intersection ∩
• difference −
• division ÷ (also see the separate pdf on this!)
• multiplication (the cartesian product ×) you are already familiar with! This is done by ','.
UNION: combines the elements of two sets
• the elements of each set must have the same type
• (but as you have already seen, mysql does NOT check this; it only checks that the tables to be united have the same number of columns, no matter what the contents of these are ...)
• duplicate elements are eliminated from the result
Example: “Which students take either A001 or L004?”
• Those who take A001:
mysql> select studentNumber from courseRegistration where code = 'A001';
+---------------+
| studentNumber |
+---------------+
| SA01          |
| SA02          |
| SE01          |
| SE03          |
| SL03          |
+---------------+
5 rows in set (0.00 sec)
• Those who take L004:
mysql> select studentNumber from courseRegistration where code = 'L004';
+---------------+
| studentNumber |
+---------------+
| SCS05         |
| SL02          |
| SL03          |
+---------------+
3 rows in set (0.01 sec)
And now the union of these selections:
mysql> select studentNumber from courseRegistration where code = 'A001' union select studentNumber from courseRegistration where code = 'L004';
+---------------+
| studentNumber |
+---------------+
| SA01          |
| SA02          |
| SE01          |
| SE03          |
| SL03          |
| SCS05         |
| SL02          |
+---------------+
7 rows in set (0.00 sec)
As you can see, SL03, who is registered on both courses, is listed only once in the union. If you want to have all occurrences included, use union all.
If you feel that these selections are getting a bit complicated, you can always use temporary tables:
mysql> create temporary table listA001 select studentNumber from courseRegistration where code = 'A001';
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0
mysql> create temporary table listL004 select studentNumber from courseRegistration where code = 'L004';
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0
Now you can use these temporary tables for your union:
mysql> select * from listA001 union select * from listL004;
+---------------+
| studentNumber |
+---------------+
| SA01          |
| SA02          |
| SE01          |
| SE03          |
| SL03          |
| SCS05         |
| SL02          |
+---------------+
7 rows in set (0.00 sec)
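For comparison, here is a sketch of the union all variant mentioned earlier. It keeps duplicates, so a student registered on both courses is listed once for each registration:

```sql
select studentNumber from courseRegistration where code = 'A001'
union all
select studentNumber from courseRegistration where code = 'L004';
```

With the data above this should give 8 rows (5 + 3) instead of 7, with SL03 appearing twice.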
INTERSECTION: chooses the elements that are in both sets:
Example: “Which students take both A001 and L004?”
• Intersect is not implemented in the mysql version used here (it was only added in MySQL 8.0.31), so we must implement it ourselves. There are several ways of doing this:
Let's start from the temporary tables that we have already created, listA001 and listL004. Think of these as sets (for that is what they are; no duplicate elements in these tables)! The definition of intersection is that the elements that are in both sets are in the intersection of these sets (and, accordingly, in the result of our query):
The intersection of listA001 and listL004
We can implement this definition directly; start from the members of the first set and check whether they are also in the other set! If so, they are part of the result; if not, they are not part of the result:
mysql> select studentNumber from listA001 where studentNumber IN (select studentNumber from listL004);
The keyword IN corresponds to the set membership operator ∈ .
[Figure: Venn diagram of listA001 (SA01, SA02, SE01, SE03, SL03) and listL004 (SL02, SCS05, SL03); SL03 lies in the overlap of the two sets.]
+---------------+
| studentNumber |
+---------------+
| SL03          |
+---------------+
1 row in set (0.01 sec)
If you don't have these temporary tables, you can still make the same query using the original select clauses:
mysql> select studentNumber from (select studentNumber from courseRegistration where code = 'A001') as tab1 where studentNumber IN (select studentNumber from courseRegistration where code = 'L004');
+---------------+
| studentNumber |
+---------------+
| SL03          |
+---------------+
1 row in set (0.00 sec)
Here you can see examples of aliasing, i.e. giving an inner selection or an attribute a name that can be used in this query. Mysql demands that a selection made in a FROM clause is given such a name (otherwise you get an error message saying “Every derived table must have its own alias”). tab1 is such an alias.
Another way to implement intersection is (again) to test whether the same elements are found in both sets; those that are will form the result:
mysql> select sn1 from ((select studentNumber as sn1 from listA001) as list1, (select studentNumber as sn2 from listL004) as list2) where sn1 = sn2;
+------+
| sn1  |
+------+
| SL03 |
+------+
1 row in set (0.00 sec)
The aliases list1 and list2 are necessary here, because we have derived tables in the from clause. sn1 and sn2 are used to differentiate between the student numbers of the first selection and the student numbers of the second selection. Alternatively, you can use the dot notation (table.attribute):
mysql> select list1.studentNumber from ((select studentNumber from listA001) as list1, (select studentNumber from listL004) as list2) where list1.studentNumber = list2.studentNumber;
However, this kind of query is unnecessarily complicated, even though it yields the right result. The easiest way is to use natural join to implement the intersection. (You remember that natural join combines the rows which have the same value for the column or columns in common, and eliminates those rows that do not find a pair.) Normally you use natural join to combine information from several tables, but you can also use it to eliminate information: if you have two sets consisting only of student numbers, naturally joining them eliminates all the student numbers which are not in both sets. This leaves us only with those numbers which are found in both sets, and that is the definition of intersection!
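The different ways of writing an intersection can be compared side by side in a small sketch. It assumes Python's built-in sqlite3 module and an in-memory SQLite database rather than mysql, with listA001 and listL004 created as ordinary tables holding the values shown earlier; the names queries and results are hypothetical. SQLite happens to provide the intersect keyword, so it is included as a cross-check.

```python
import sqlite3

# In-memory SQLite stand-in (assumption); the two lists are filled by hand
# with the student numbers from the selections shown in the text.
con = sqlite3.connect(":memory:")
con.execute("create table listA001 (studentNumber text)")
con.execute("create table listL004 (studentNumber text)")
con.executemany("insert into listA001 values (?)",
                [("SA01",), ("SA02",), ("SE01",), ("SE03",), ("SL03",)])
con.executemany("insert into listL004 values (?)",
                [("SCS05",), ("SL02",), ("SL03",)])

queries = {
    # Set membership: is the element of the first set also in the other?
    "IN": "select studentNumber from listA001 "
          "where studentNumber in (select studentNumber from listL004)",
    # Pair up equal elements, using aliases and the dot notation.
    "aliases": "select list1.studentNumber "
               "from listA001 as list1, listL004 as list2 "
               "where list1.studentNumber = list2.studentNumber",
    # Natural join on the common column studentNumber.
    "natural join": "select studentNumber from listA001 natural join listL004",
    # Built-in operator in SQLite, used here only as a cross-check.
    "intersect": "select studentNumber from listA001 "
                 "intersect select studentNumber from listL004",
}

results = {name: sorted(r[0] for r in con.execute(sql))
           for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```

All four formulations should return the same single student, SL03.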
Using natural join to implement intersection:
mysql> select studentNumber from ((select studentNumber from listA001) as list1 natural join (select studentNumber from listL004) as list2);
+---------------+
| studentNumber |
+---------------+
| SL03          |
+---------------+
1 row in set (0.00 sec)
Because natural join joins the column(s) with the same name, it is essential that the columns here have their original name, studentNumber, and no aliases! The selections made in the from clause must have aliases, though.
SET DIFFERENCE: Which elements are in the first set but NOT in the other?
Example: “Which students take A001 but not L004?”
• Set difference is not implemented in the mysql version used here, so SQL keywords such as 'except' or 'minus' do not work here (newer versions, from MySQL 8.0.31 on, do accept except). However, it is not difficult to implement set difference.
• Here we shall again use the two temporary tables, listA001 and listL004, that we have created earlier. That makes the query less complicated.
• We shall use the definition of set difference: those elements that are in the first set but NOT in the other set are included in the result:
mysql> select studentNumber from listA001 where studentNumber NOT IN (select studentNumber from listL004);
+---------------+
| studentNumber |
+---------------+
| SA01          |
| SA02          |
| SE01          |
| SE03          |
+---------------+
4 rows in set (0.00 sec)
[Figure annotation: the result is this part (only in listA001), not this part (the overlap) ... and not this part (only in listL004).]
If you do not have the temporary tables, you can give the original selections in the query:
mysql> select studentNumber from (select studentNumber from courseRegistration where code = 'A001') as tab1 where studentNumber NOT IN (select studentNumber from courseRegistration where code = 'L004');
+---------------+
| studentNumber |
+---------------+
| SA01          |
| SA02          |
| SE01          |
| SE03          |
+---------------+
4 rows in set (0.00 sec)
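[Figure: Venn diagram of listA001 (SA01, SA02, SE01, SE03, SL03) and listL004 (SL02, SCS05, SL03); the set difference is the part of listA001 outside the overlap, i.e. SA01, SA02, SE01 and SE03.]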
As you can see, the query is almost identical to our implementation of intersection; the only change is that we use NOT IN instead of IN, NOT IN being SQL for ∉.
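A small sketch may help to check the NOT IN formulation, and to see that the order of the two sets matters. It assumes Python's built-in sqlite3 module with an in-memory SQLite database in place of mysql, and the two lists are filled in by hand; the function name difference is made up for this example. SQLite accepts the except keyword, which is used here only to cross-check the result.

```python
import sqlite3

# In-memory SQLite stand-in (assumption), with the two lists from the text.
con = sqlite3.connect(":memory:")
con.execute("create table listA001 (studentNumber text)")
con.execute("create table listL004 (studentNumber text)")
con.executemany("insert into listA001 values (?)",
                [("SA01",), ("SA02",), ("SE01",), ("SE03",), ("SL03",)])
con.executemany("insert into listL004 values (?)",
                [("SCS05",), ("SL02",), ("SL03",)])


def difference(first, other):
    """Elements of table `first` that are NOT IN table `other`."""
    sql = (f"select studentNumber from {first} "
           f"where studentNumber not in (select studentNumber from {other})")
    return sorted(r[0] for r in con.execute(sql))


a_minus_l = difference("listA001", "listL004")   # take A001 but not L004
l_minus_a = difference("listL004", "listA001")   # take L004 but not A001

# Cross-check with SQLite's built-in except operator.
with_except = sorted(r[0] for r in con.execute(
    "select studentNumber from listA001 except select studentNumber from listL004"))

print(a_minus_l)
print(l_minus_a)
print(with_except)
```

Unlike union and intersection, set difference is not symmetric: swapping the two tables gives a different answer.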
DIVISION: Which elements share ALL of another relation? I.e., which elements have ALL of this other relation in common?
• Used for queries with “all”:
• “Which students take ALL the courses taught by Caravaggio?” (We might give this query if we want to find out who are Caravaggio's greatest fans.)
◦ All the courses taught by Caravaggio is the set {A001, A002}, and we ask which students, if any, are registered on ALL of these courses. They have all of these courses in common.
• Like intersection and difference, this operation is NOT implemented in mysql. Unfortunately, implementing it is more than a bit tricky.
• The first way to reason about it is as follows:
• “I want a list of such students of whom it is true that there is NO course by Caravaggio that these students do NOT take.” This is using double negation to implement division:
mysql> select distinct studentNumber from courseRegistration as A where NOT EXISTS (select code from whoDoesWhat where teacher = 'Caravaggio' and code NOT IN (select code from courseRegistration as B where A.studentNumber = B.studentNumber));
+---------------+
| studentNumber |
+---------------+
| SA01          |
| SA02          |
+---------------+
2 rows in set (0.02 sec)
The only students who have ALL (both) of these courses are numbers SA01 and SA02. They are the result of our query. Let's look at this in detail:
select distinct studentNumber from courseRegistration as A where NOT EXISTS (...);
This first part, the outer selection, says the following: We wish to find such students from the table courseRegistration for which the following is NOT TRUE:
(select code from whoDoesWhat where teacher = 'Caravaggio' and code NOT IN (select code from courseRegistration as B where A.studentNumber = B.studentNumber))
That is:
• Make a list of all the courses given by Caravaggio.
• Check these, one by one, against a list of all the courses that this student takes.
• If there is a course given by Caravaggio that the student is not taking, make a note of it!
• There must be no such notes if this student is to be a part of the result.
That SA02 also takes a third course (given by another teacher) does not affect the result.
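The “notes” described above can be made visible in a minimal sketch. The assumptions are: Python's built-in sqlite3 module with an in-memory SQLite database instead of mysql, and only a small subset of the course data (Caravaggio's and Picasso's courses, and the registrations of four students). The names notes and result are hypothetical.

```python
import sqlite3

# In-memory SQLite stand-in (assumption) with a subset of the course data.
con = sqlite3.connect(":memory:")
con.execute("create table whoDoesWhat (teacher text, code text)")
con.execute("create table courseRegistration (studentNumber text, code text)")
con.executemany("insert into whoDoesWhat values (?, ?)",
                [("Caravaggio", "A001"), ("Caravaggio", "A002"),
                 ("Picasso", "A003")])
con.executemany("insert into courseRegistration values (?, ?)",
                [("SA01", "A001"), ("SA01", "A002"),
                 ("SA02", "A001"), ("SA02", "A002"), ("SA02", "A003"),
                 ("SA03", "A003"), ("SE01", "A001")])

# The "notes": for every student, the courses by Caravaggio NOT taken.
notes = con.execute("""
    select A.studentNumber, W.code
    from (select distinct studentNumber from courseRegistration) as A
    join whoDoesWhat as W on W.teacher = 'Caravaggio'
    where W.code not in (select B.code from courseRegistration as B
                         where B.studentNumber = A.studentNumber)
    order by A.studentNumber, W.code
""").fetchall()

# The division query from the text: students for whom NO such note exists.
result = sorted(r[0] for r in con.execute("""
    select distinct studentNumber from courseRegistration as A
    where not exists (select code from whoDoesWhat
                      where teacher = 'Caravaggio'
                        and code not in (select code from courseRegistration as B
                                         where A.studentNumber = B.studentNumber))
"""))

print(notes)
print(result)
```

In this subset, SA03 and SE01 each have at least one note and drop out, while SA01 and SA02 have none and form the result.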
Another, similar way of writing this query is the following:
select distinct studentNumber from courseRegistration as A where NOT EXISTS (select code from whoDoesWhat where teacher = 'Caravaggio' and NOT EXISTS (select B.code from courseRegistration as B where A.studentNumber = B.studentNumber and whoDoesWhat.code = B.code));
+---------------+
| studentNumber |
+---------------+
| SA01          |
| SA02          |
+---------------+
2 rows in set (0.00 sec)
The interpretation is as follows (the first part, the outer selection, is the same):
“Find such students from courseRegistration for whom the following is NOT TRUE:”
(select code from whoDoesWhat where teacher = 'Caravaggio' and NOT EXISTS (select B.code from courseRegistration as B where A.studentNumber = B.studentNumber and whoDoesWhat.code = B.code))
Here the selection from whoDoesWhat is the red selection, the innermost selection from courseRegistration as B is the yellow selection, and the outermost selection from courseRegistration as A is the light blue selection.
That is ...
• Make a list of all the courses given by Caravaggio (the red selection).
• Check these, one by one: is the student in question registered on this course?
• If there is a course given by Caravaggio that the student is taking, make a note of it! These courses are the result of the innermost, yellow selection.
• If the course was taken by the student (if it is a part of the innermost, yellow selection), it is NOT SELECTED in the red selection. NOT EXISTS forbids that. The red selection gives, for each student tested, those courses by Caravaggio that this student is NOT taking.
• If this red selection is empty (meaning that there were NO courses given by Caravaggio that this student did NOT take), we know that this student takes ALL the courses given by Caravaggio and, consequently, this student is part of the result (the outermost, light blue selection).
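The reasoning in the steps above can be followed student by student in a short sketch. It assumes Python's built-in sqlite3 module with an in-memory SQLite database instead of mysql and a small subset of the course data; red_selection, not_taken, by_hand and by_query are made-up names. The student number is passed in as a parameter (?) so that the red selection can be run for one student at a time.

```python
import sqlite3

# In-memory SQLite stand-in (assumption) with a subset of the course data.
con = sqlite3.connect(":memory:")
con.execute("create table whoDoesWhat (teacher text, code text)")
con.execute("create table courseRegistration (studentNumber text, code text)")
con.executemany("insert into whoDoesWhat values (?, ?)",
                [("Caravaggio", "A001"), ("Caravaggio", "A002"),
                 ("Picasso", "A003")])
con.executemany("insert into courseRegistration values (?, ?)",
                [("SA01", "A001"), ("SA01", "A002"),
                 ("SA02", "A001"), ("SA02", "A002"), ("SA02", "A003"),
                 ("SA03", "A003"), ("SE01", "A001")])

# The red selection for ONE student: the courses by Caravaggio for which
# the innermost (yellow) selection finds no registration of this student.
red_selection = """
    select code from whoDoesWhat
    where teacher = 'Caravaggio'
      and not exists (select B.code from courseRegistration as B
                      where B.studentNumber = ? and whoDoesWhat.code = B.code)
    order by code
"""

students = [r[0] for r in con.execute(
    "select distinct studentNumber from courseRegistration order by studentNumber")]
not_taken = {s: [r[0] for r in con.execute(red_selection, (s,))] for s in students}

# A student is part of the result exactly when the red selection is empty.
by_hand = [s for s in students if not not_taken[s]]

# The complete query from the text, for comparison.
by_query = sorted(r[0] for r in con.execute("""
    select distinct studentNumber from courseRegistration as A
    where not exists (select code from whoDoesWhat
                      where teacher = 'Caravaggio'
                        and not exists (select B.code from courseRegistration as B
                                        where A.studentNumber = B.studentNumber
                                          and whoDoesWhat.code = B.code))
"""))

print(not_taken)
print(by_hand, by_query)
```

Testing the students one at a time and running the complete nested query should give the same list.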
Whooo!!!
And now that you have survived that, we'll take a look at the easiest way to implement division: “count and compare”.
• The idea behind this method is to count how many different values there are in the divisor (i.e. all the values that one should have to be a part of the result).
• When this is applied to our example, we first count how many courses are given by Caravaggio (2).
• Then we count how many of these³ courses our students take. If we get the same result, the student in question has all the courses asked for and will appear in the result.
Let's see how this is done ...
³ By these courses we mean courses by Caravaggio. The student may take other courses by other teachers, but these courses do not interest us, and we do not count them.
mysql> select studentNumber from courseRegistration where code IN (select code from whoDoesWhat where teacher = 'Caravaggio') group by studentNumber having count(code) = (select count(*) from (select code from whoDoesWhat where teacher = 'Caravaggio') as tab);
+---------------+
| studentNumber |
+---------------+
| SA01          |
| SA02          |
+---------------+
2 rows in set (0.00 sec)
Let's see what we have here:
mysql> select studentNumber from courseRegistration where code IN (select code from whoDoesWhat where teacher = 'Caravaggio') group by studentNumber having count(code) = (select count(*) from (select code from whoDoesWhat where teacher = 'Caravaggio') as tab);
• The red selection, (select code from whoDoesWhat where teacher = 'Caravaggio'), represents Caravaggio's courses. This selection must be done twice.
• Having this selection, we take ALL the registrations for Caravaggio's courses and group them by student number. Now we can count, for each student separately, how many courses by Caravaggio this student is taking (remember that when you use group by, first the groups are formed and only then the aggregate function, here count(code), is applied to each group).
• Now count how many courses Caravaggio gives (the yellow count):
(select count(*) from (select code from whoDoesWhat where teacher = 'Caravaggio') as tab)
• Compare this number to the number of courses by Caravaggio that the student in question is taking (the orange count, count(code)). Remember that you must use having when you are talking about conditions that describe groups!
• If the number of courses that the student is taking, count(code), is equal to the number of courses Caravaggio is giving, count(*), you may conclude that the student in question is really taking ALL the courses given by Caravaggio.
When using groups and aggregate functions, remember the following: if there is a WHERE clause, that will be used first to determine which rows are selected. Then the groups are formed as dictated by the GROUP BY criterion. Now aggregate functions are applied to these groups separately. Finally we apply the HAVING clause to see which “group results” get selected into the final result. I.e.
1) WHERE chooses rows
2) GROUP BY forms groups
3) AGGREGATE FUNCTIONS work on these groups
4) HAVING chooses rows into the final result.
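The four steps can be watched one after another in a small sketch of the “count and compare” query. It assumes Python's built-in sqlite3 module with an in-memory SQLite database instead of mysql and a small subset of the course data; the names caravaggio, where_rows, group_counts and result are made up for this illustration.

```python
import sqlite3

# In-memory SQLite stand-in (assumption) with a subset of the course data.
con = sqlite3.connect(":memory:")
con.execute("create table whoDoesWhat (teacher text, code text)")
con.execute("create table courseRegistration (studentNumber text, code text)")
con.executemany("insert into whoDoesWhat values (?, ?)",
                [("Caravaggio", "A001"), ("Caravaggio", "A002"),
                 ("Picasso", "A003")])
con.executemany("insert into courseRegistration values (?, ?)",
                [("SA01", "A001"), ("SA01", "A002"),
                 ("SA02", "A001"), ("SA02", "A002"), ("SA02", "A003"),
                 ("SA03", "A003"), ("SE01", "A001")])

# Caravaggio's courses; this selection is needed twice in the final query.
caravaggio = "select code from whoDoesWhat where teacher = 'Caravaggio'"

# 1) WHERE chooses rows: only registrations for Caravaggio's courses.
where_rows = con.execute(
    f"select studentNumber, code from courseRegistration "
    f"where code in ({caravaggio}) order by studentNumber, code").fetchall()

# 2) GROUP BY forms groups, 3) count(code) works on each group.
group_counts = con.execute(
    f"select studentNumber, count(code) from courseRegistration "
    f"where code in ({caravaggio}) group by studentNumber "
    f"order by studentNumber").fetchall()

# 4) HAVING keeps the groups whose count equals the number of courses
#    that Caravaggio gives.
result = [r[0] for r in con.execute(
    f"select studentNumber from courseRegistration "
    f"where code in ({caravaggio}) group by studentNumber "
    f"having count(code) = (select count(*) from ({caravaggio}) as tab) "
    f"order by studentNumber")]

print(where_rows)
print(group_counts)
print(result)
```

In this subset SE01 takes only one of Caravaggio's two courses, so that group is formed and counted but does not pass the HAVING condition.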