Upload
owen-lynch
View
221
Download
0
Embed Size (px)
Citation preview
SQL Basics
2
SQL: What is it?
• SQL stands for Structured Query Language.• It was originally developed in the early 1970s
by IBM as a way to manipulate and retrieve data stored in IBM’s relational DBMS, System R.
• It can be pronounced “Sequel” or “S-Q-L”.• With some variations, SQL is more or less the
standard query language for relational databases.
SQL
3
When do we use SQL?
• SQL is most commonly used to retrieve data from a database.
• It is also commonly used to modify data.• SQL also contains commands for creating
tables and other database objects, and for maintaining data security through granting and denying privileges to users.
SQL
4
How is SQL used?
• You can use SQL in Access using the Access Query Builder, which will be demonstrated in lab.
• All major DBMS’s, like Oracle and MySql, come with client tools which allow database managers and users to write SQL queries and send them to the database.
• The next slide shows a screen shot of one of these client tools, Microsoft’s SQL Server Management Studio.
SQL
5
Choose a database here.
Type the SQL here, then Execute.
See the results here.
6
SQL and VB Programming• It is fairly easy to create VB programs which
interact with databases in all major DBMSs.• Assignment 2will be a VB program which
interacts with Access databases.
7
What can we do with SQL?• SQL has for major components:1. DQL (Data Query Language): Used to retrieve data
from a database.2. DML (Data Manipulation Language): Used to
modify the data in a database—inserting, updating, and deleting.
3. DDL (Data Definition Language): Used to design tables and other database objects.
4. DCL (Data Control Language): Used to control access to the data using security methods.
• We will only look at DML and DQL in this course.SQL
8
SELECT Queries
• SELECT queries are the most commonly-used queries in SQL.
• They are used to retrieve data from one or more tables.
• Basic format:• SELECT [Fields] FROM [Table(s)] WHERE
[Conditions] ORDER BY [Fields] ASC/DESC• SELECT PlayerName, PlayerPosition FROM
Players WHERE TeamID=2 ORDER BY Age DESC
SQL
9
WHERE CLAUSE• A where clause is a set of conditions
following the word “WHERE” in a query.• Where clauses are used in DELETE FROM,
UPDATE, and SELECT queries.• They are used to restrict the rows to be
deleted, updated, or selected.• If you write a SQL query without a WHERE
clause, all records in the table will be affected: deleted, updated, or selected.
SQL
10
Sample WHERE Clauses• Most WHERE clauses are easy to understand.• They can use AND, OR, or NOT, and parentheses
to change the order of operation.• Examples:– WHERE PlayerID = 7– WHERE TeamID=2 AND Age>24– WHERE NOT (TeamID=1 AND Age<=30)– WHERE Age>20 OR Age<30 [That one will affect all
records]– WHERE Age <20 AND Age>30 [Won’t affect any
records] SQL
11
BETWEEN and IN in WHERE clauses• BETWEEN and IN can be used to simplify WHERE
clauses.• BETWEEN combines two inequalities:
– WHERE Age BETWEEN 24 AND 27– Is the same as– WHERE Age>=24 AND Age<=27– BETWEEN is always inclusive; that is, it includes the
endpoints.• IN combines multiple Ors:
– WHERE TeamID IN (2,3,5,8)– Is the same as– WHERE TeamID=2 OR TeamID=3 OR TeamID=5 OR TeamID=8SQL
12
The Fields list
• The list of fields following “SELECT” can include many things:
• *• Field1, Field2• Field1 AS SomethingElse (aliased field)• A math expression which can include field names (a
calculated field)• An aggregate expression like MAX(Age) or COUNT(*)• A constant value, for example 2, ‘Smith’, or NULL. This can
be useful when normalizing a database by combining several tables with identical fields into one. SQL
13
*
• Using an asterisk (*) after SELECT tells the database to send us all of the fields in the table:
Since the query used an asterisk, it returns all seven columns.
It also returns all of the rows in the table. Why?
SQL
14
SELECT * FROM TableName• “SELECT * FROM TableName” is the easiest way to view an entire table
using SQL.• If you, as a SQL expert, are trying to find data in an unfamiliar
database, this can be a good way to start.• However, if the tables are large (lots of rows), it might take a long time
for that query to run.• If you just want to see the names of the columns, you can run a
SELECT query with an impossible WHERE clause, such as:• SELECT * FROM Players WHERE 1=0• Here are the results of that query:
To summarize: “SELECT *” means “get all fields”; To get all rows, just leave off the WHERE clause. To get no rows, add an impossible WHERE clause. SQL
15
Back to the Field List• While putting “*” after SELECT is easiest, it isn’t always the best
way to query a table.• This is especially true when writing VB code: Since you will need to
refer to individual field names in your code, you can simplify writing the code by putting the actual field names in the query.
• So, instead of writing “SELECT * FROM Players”, we would write:• SELECT PlayerID, TeamID, LastName, FirstName, Age,
PlayerPosition, PhoneNumber FROM Players• Now, when we write our code which accesses these fields, we don’t
have to go back and look at the database to remember what their names were.
• And, if we don’t need certain fields, we can leave them out of the list. If the table is huge, this can make the program run faster.
SQL
16
Aliases
• Field names can be “aliased”—that is, given different names when displayed in the results.
• This is done using the SQL keyword “AS”:• SELECT PhoneNumber AS Phone FROM Players• You can also use this to break the rules about
field names when you display them—just put the “incorrect” alias in square brackets:
• SELECT PhoneNumber AS [Phone #] FROM Players SQL
17
Aliases and Calculated Fields• SQL can do arithmetic. You can include calculated fields in SELECT
queries. The calculations can include the other fields in the table.• Aliases should always be used for calculated fields in SELECT queries.
Examples (sorry, I don’t actually have a CarData table to demonstrate these):
• SELECT CarID, CarName, Miles, Gallons, Miles/Gallons AS [Miles per Gallon] FROM CarData
• SELECT CarID, Miles*1.61 AS Kilometers FROM CarData• SELECT FirstName & ‘ ‘ & LastName AS [Name] FROM Players• Note: SQL generally allows both single (‘) and double (“) quotes to be
used for string constants.• I recommend that you use single quotes, especially when creating SQL
statements in VB.
SQL
18
ORDER BY• Query results can be sorted in a desired order by using
ORDER BY at the end of the query.• You should never rely on query results being sorted UNLESS
the query includes an ORDER BY clause!• SELECT * FROM Players ORDER BY Age ASC• That query will put the youngest players on top of the list,
with the oldest at the bottom.• ASC stands for ascending order. With numbers, this means
smallest to largest; with strings, it means alphabetical order.• DESC is the opposite of ASC.• ASC is the default for ORDER BY; if you simply write “ORDER
BY Age”, SQL will interpret it as “ORDER BY Age ASC”.
SQL
19
ORDER BY Examples• SELECT LastName, FirstName FROM Players ORDER BY
LastName, FirstName• For that query, “FirstName” is the secondary sort order—
the tie breaker. For players with the same last names, those with first names beginning with A will be listed before those beginning with B, and so on. “Jones, Adam” before “Jones, Bob”.
• SELECT * FROM Players ORDER BY TeamID ASC, Age DESC• That query will display all members of team 1 before all
members of team 2, but within each team, the oldest players will be listed first.
SQL
20
TOP• To find only the best (or worst) records in any result set, you can use the
keyword TOP and the number of records you want to see right after SELECT.• If you use TOP, you should always have an ORDER BY clause—remember that
you cannot count on query results being sorted any particular way UNLESS you have an ORDER BY clause.
• SELECT TOP 5 LastName, FirstName, GPA FROM Students ORDER BY GPA• That query will find the five WORST students in the table. Don’t let the word
“TOP” confuse you. In SQL queries, it doesn’t mean “best”—it means the records that will appear at the TOP of the result set. In this case, ORDER BY GPA will list GPAs from lowest to highest, since the default order for ORDER BY is ASC.
• If you want the five BEST students, use• SELECT TOP 5 LastName, FirstName, GPA FROM Students ORDER BY GPA DESC• Note that there is no “BOTTOM” keyword in SQL; to find the bottom, reverse
the sort order and use TOP.• TOP will display ties, at least in Access. For example, in the last query, if there
are seven students with 4.0 GPAs, all seven will displayed as being in the top 5.SQL
21
Aggregates• Aggregates are fields in queries which summarize the values for all
records in the table, or for subsets of the records. The aggregates you should know for this class are COUNT, MAX, MIN, SUM, AVG, and STDEV. – MIN: Find the minimum value (or the earliest date for dates) in the specified
field in the rows meeting the conditions.– MAX: Find the maximum value (or the latest date for dates) in the specified
field in the rows meeting the conditions.– SUM: Add the values in the specified field in the rows meeting the
conditions.– AVG: Find the mean value of the specified field in the rows meeting the
conditions.– STDEV: Find the standard deviation of the values of the specified field in the
rows meeting the conditions.– COUNT: Find out how many rows there are which meet the conditions. For
this course, always use COUNT(*), not COUNT(SomeField). They generally return the same numbers, but doing COUNT on a specific field may return a smaller number if that field contains NULLs. SQL
22
Aggregate Examples• (See the TripSummary table in the SampleData.accdb database to see
what these queries refer to, although you can probably understand them without having to look at the database.)
• SELECT MAX(DistanceKilometers) AS MaxDistance FROM TripSummary• Returns a result set with one column (MaxDistance) and one row. The
one cell in the result set contains the maximum distance from all trips in the table.
• SELECT AVG(DurationMinutes) AS AvgTime FROM TripSummary WHERE Driver=4
• Again, this returns a one-column, one-row result set containing the average duration of driver 4’s trips.
• SELECT COUNT(*) AS NumTrips FROM TripSummary WHERE Driver=37• Returns the number of rows having driver = 37.
SQL
23
AAA: Always Alias Aggregates
• If your query says• SELECT MAX(Age) FROM Players• Access will return the following:
Access doesn’t have a name for MAX(Age), so it makes up one: “Expr1000”
You should always alias aggregates with a meaningful name. SELECT MAX(Age) AS MaxAge FROM Players
SQL
24
GROUP BY• Consider this table called Workers (also available in
SampleData.accdb):
•If you want to know the maximum age of all workers, you use this query:•SELECT MAX(Age) AS MaxAge FROM Workers•which will return
MaxAge64 SQL
25
GROUP BY• But if you want to know the maximum age
by JobType, use GROUP BY:• SELECT JobType, MAX(Age) AS MaxAge
FROM Workers GROUP BY JobType• Which returns
JobType MaxAgeClerical 44Management 51Other 55Professional 64Sales 22Service 24
SQL
26
Another GROUP BY Example
• If you want to know how many men and women there are in the list:
• SELECT Sex, COUNT(*) AS Num FROM Workers GROUP BY Sex
Sex Num
F 5
M 14
SQL
27
And Another!• SELECT JobType, Sex, COUNT(*) AS Num
FROM Workers GROUP BY JobType, Sex
JobType Sex NumClerical F 2Clerical M 1Management M 2Other F 1Other M 4Professional F 1Professional M 3Sales M 2Service F 1Service M 2
SQL
28
MAX/MIN vs. TOP• Above, we saw the results of this query:• SELECT MAX(Age) AS MaxAge FROM
WorkersMaxAge64
However, we can get the same result using TOP:
SELECT TOP 1 Age FROM Workers ORDER BY Age DESC
In addition, we can find out everything about the oldest worker using TOP, something we can’t do with MAX:
SELECT TOP 1 * FROM Workers ORDER BY Age DESCWorkerID YearsOfSchool Sex YearsWorking UnionMember Wage Age JobType1485 18 F 40 No 22.2 64 Professional
SQL
29
Combining TOP and Aggregates• Therefore, if you just want to find one maximum or minimum and want
to know something about the record that possesses that value, use TOP. In most cases, it will be preferable to MAX or MIN. However, if you want to calculate maximum or minimums by categories, use MAX and MIN along with GROUP BY. You can even combine TOP with GROUP BY:
• SELECT TOP 3 JobType, AVG(Wage) AS AvgWage FROM Workers GROUP BY JobType ORDER BY AVG(Wage) DESC
JobType AvgWage
Management 18.695
Professional 16.2225
Other 9.388
This tells us the three highest paying job categories and the corresponding average wages.
SQL