1 Real SQL Programming Embedded SQL Java Database Connectivity Stored Procedures

Preview:

Citation preview

1

Real SQL Programming

Embedded SQLJava Database Connectivity

Stored Procedures

2

Three-Tier Architecture

A common environment for using a database has three tiers of processors:

1. Web servers --- talk to the user.2. Application servers --- execute the

business logic.3. Database servers --- get what the app

servers need from the database.

3

Example: Amazon

Database holds the information about products, customers, etc.

Business logic includes things like “what do I do after someone clicks ‘checkout’?” Answer: Show the “how will you pay

for this?” screen.

4

Environments, Connections, Queries

The database is, in many DB-access languages, an environment.

Database servers maintain some number of connections, so app servers can ask queries or perform modifications.

The app server issues statements : queries and modifications, usually.

5

Diagram to Remember

Environment

Connection

Statement

6

Real SQL Programming

Embedded SQL (write SQL with “markup” and mixed with host language)

Call Level Interface (host level programming via SQL API) Java, C++, Python, Ruby, etc

Stored Procedures User defined procedures/function that

become part of the schema (server level)

7

Embedded SQL

Key idea: A preprocessor turns SQL statements into procedure calls that fit with the surrounding host-language code.

All embedded SQL statements begin with EXEC SQL, so the preprocessor can find them easily.

8

Shared Variables

To connect SQL and the host-language program, the two parts must share some variables.

Declarations of shared variables are bracketed by:EXEC SQL BEGIN DECLARE SECTION;

<host-language declarations>EXEC SQL END DECLARE SECTION;

Alwaysneeded

9

Use of Shared Variables

In SQL, the shared variables must be preceded by a colon. They may be used as constants provided

by the host-language program. They may get values from SQL

statements and pass those values to the host-language program.

In the host language, shared variables behave like any other variable.

10

Example: Looking Up Prices

We’ll use C with embedded SQL to sketch the important parts of a function that obtains a beer and a bar, and looks up the price of that beer at that bar.

Assumes database has our usual Sells(bar, beer, price) relation.

11

Example: C Plus SQL

EXEC SQL BEGIN DECLARE SECTION;char theBar[21], theBeer[21];float thePrice;

EXEC SQL END DECLARE SECTION;/* obtain values for theBar and theBeer */

EXEC SQL SELECT price INTO :thePriceFROM SellsWHERE bar = :theBar AND beer = :theBeer;/* do something with thePrice */

Note 21-chararrays neededfor 20 chars +endmarker

SELECT-INTOas in PSM

12

Embedded Queries

Embedded SQL has the some limitations regarding queries: SELECT-INTO for a query should

produce a single tuple. Otherwise, you have to use a cursor.

13

Cursor Statements

Declare a cursor c with:EXEC SQL DECLARE c CURSOR FOR

<query>; Open and close cursor c with:EXEC SQL OPEN CURSOR c;EXEC SQL CLOSE CURSOR c; Fetch from c by:EXEC SQL FETCH c INTO <variable(s)>;

Macro NOT FOUND is true if and only if the FETCH fails to find a tuple.

14

Example: Print Joe’s Menu

Let’s write C + SQL to print Joe’s menu – the list of beer-price pairs that we find in Sells(bar, beer, price) with bar = Joe’s Bar.

A cursor will visit each Sells tuple that has bar = Joe’s Bar.

15

Example: Declarations

EXEC SQL BEGIN DECLARE SECTION;char theBeer[21]; float thePrice;

EXEC SQL END DECLARE SECTION;EXEC SQL DECLARE c CURSOR FOR

SELECT beer, price FROM SellsWHERE bar = ’Joe’’s Bar’;

The cursor declaration goesoutside the declare-section

16

Example: Executable Part

EXEC SQL OPEN CURSOR c;while(1) {

EXEC SQL FETCH cINTO :theBeer, :thePrice;

if (NOT FOUND) break;/* format and print theBeer and thePrice */

}EXEC SQL CLOSE CURSOR c;

The C styleof breakingloops

17

Need for Dynamic SQL

Most applications use specific queries and modification statements to interact with the database. The DBMS compiles EXEC SQL …

statements into specific procedure calls and produces an ordinary host-language program that uses a library.

What about dynamic sql queries

18

Dynamic SQL

Preparing a query:EXEC SQL PREPARE <query-name>

FROM <text of the query>; Executing a query:EXEC SQL EXECUTE <query-name>; “Prepare” = optimize query. Prepare once, execute many times.

19

Example: A Generic Interface

EXEC SQL BEGIN DECLARE SECTION;char query[MAX_LENGTH];

EXEC SQL END DECLARE SECTION;while(1) {

/* issue SQL> prompt *//* read user’s query into array query */EXEC SQL PREPARE q FROM :query;EXEC SQL EXECUTE q;

}q is an SQL variablerepresenting the optimizedform of whatever statementis typed into :query

20

Execute-Immediate

If we are only going to execute the query once, we can combine the PREPARE and EXECUTE steps into one.

Use:EXEC SQL EXECUTE IMMEDIATE

<text>;

21

Example: Generic Interface Again

EXEC SQL BEGIN DECLARE SECTION;char query[MAX_LENGTH];

EXEC SQL END DECLARE SECTION;while(1) {/* issue SQL> prompt *//* read user’s query into array query */EXEC SQL EXECUTE IMMEDIATE :query;

}

22

Host/SQL Interfaces Via Libraries

Another approach to connecting databases to conventional languages is to use library calls.

1. C + CLI2. Java + JDBC3. PHP + PEAR/DB4. Python+PyGreSQL

23

JDBC

Java Database Connectivity (JDBC) is a library with Java as the host language.

24

Making a Connection

import java.sql.*;

Class.forName(org.postgresql.Driver);

Connection myCon =

DriverManager.getConnection(…);

The JDBC classes

The driverfor Postgres;others exist

URL of the databaseyour name, and passwordgo here.

Loaded byforName

25

Statements

JDBC provides two classes:1. Statement = an object that can

accept a string that is a SQL statement and can execute such a string.

2. PreparedStatement = an object that has an associated SQL statement ready to execute.

26

Creating Statements

The Connection class has methods to create Statements and PreparedStatements.

Statement stat1 = myCon.createStatement();PreparedStatement stat2 =

myCon.createStatement(”SELECT beer, price FROM Sells ” +”WHERE bar = ’Joe’ ’s Bar’ ”

);createStatement with no argument returnsa Statement; with one argument it returnsa PreparedStatement.

27

Executing SQL Statements

JDBC distinguishes queries from modifications, which it calls “updates.”

Statement and PreparedStatement each have methods executeQuery and executeUpdate. For Statements: one argument: the query

or modification to be executed. For PreparedStatements: no argument.

28

Example: Update

stat1 is a Statement. We can use it to insert a tuple as:stat1.executeUpdate(

”INSERT INTO Sells ” +

”VALUES(’Brass Rail’,’Bud’,3.00)”

);

29

Example: Query

stat2 is a PreparedStatement holding the query ”SELECT beer, price FROM Sells WHERE bar = ’Joe’’s Bar’ ”.

executeQuery returns an object of class ResultSet – we’ll examine it later.

The query:ResultSet menu = stat2.executeQuery();

30

Accessing the ResultSet

An object of type ResultSet is something like a cursor.

Method next() advances the “cursor” to the next tuple. The first time next() is applied, it gets

the first tuple. If there are no more tuples, next()

returns the value false.

31

Accessing Components of Tuples

When a ResultSet is referring to a tuple, we can get the components of that tuple by applying certain methods to the ResultSet.

Method getX (i ), where X is some type, and i is the component number, returns the value of that component. The value must have type X.

32

Example: Accessing Components

Menu = ResultSet for query “SELECT beer, price FROM Sells WHERE bar = ’Joe’ ’s Bar’ ”.

Access beer and price from each tuple by:while ( menu.next() ) {

theBeer = Menu.getString(1);

thePrice = Menu.getFloat(2);

/*something with theBeer and thePrice*/

}

33

Stored Procedures

PSM, or “persistent stored modules,” allows us to store procedures as database schema elements.

PSM = a mixture of conventional statements (if, while, etc.) and SQL.

Lets us do things we cannot do in SQL alone.

34

Basic PSM Form: PL/PgSQL

CREATE FUNCTION <name> (<parameter list> )

RETURNS <type> AS $$[ DECLARE declarations ]BEGIN statementsEND;$$ language plpgsql;

35

Example: Stored Procedure

Let’s write a procedure that takes two arguments b and p, and adds a tuple to Sells(bar, beer, price) that has bar = ’Joe’’s Bar’, beer = b, and price = p. Used by Joe to add to his menu more

easily.

36

Kinds of PSM statements – (1)

RETURN <expression> sets the return value of a function. Unlike C, etc., RETURN does not

terminate function execution. DECLARE <name> <type> used to

declare local variables. BEGIN . . . END for groups of

statements. Separate statements by semicolons.

37

Kinds of PSM Statements – (2)

Assignment statements: <variable> := <expression>;

Example: b := ’Bud’; Statement labels: give a statement

a label by prefixing a name and a colon.

38

Example: IF

Let’s rate bars by how many customers they have, based on Frequents(drinker,bar). <100 customers: ‘unpopular’. 100-199 customers: ‘average’. >= 200 customers: ‘popular’.

Function Rate(b) rates bar b.

39

Example: IF (continued)CREATE FUNCTION Rate (b CHAR(20) )

RETURNS CHAR(10) AS $$DECLARE cust INTEGER;

BEGINSELECT COUNT(*) INTO cust FROM Frequents

WHERE bar = b);IF cust < 100 THEN RETURN ’unpopular’ELSEIF cust < 200 THEN RETURN ’average’ELSE RETURN ’popular’END IF;

END;$$

Number ofcustomers ofbar b

Return occurs here, not atone of the RETURN statements

NestedIF statement

40

Loops

Basic form:<loop name>: LOOP <statements> END LOOP;

Exit from a loop by:LEAVE <loop name>

41

Example: Exiting a Loop

loop1: LOOP. . .LEAVE loop1;. . .

END LOOP;

If this statement is executed . . .

Control winds up here

42

Breaking Cursor Loops – (4)

The structure of a cursor loop is thus:cursorLoop: LOOP

FETCH c INTO … ;

IF NotFound THEN LEAVE cursorLoop;

END IF;

END LOOP;

43

Example: Cursor

Let’s write a function that examines Sells(bar, beer, price), and raises by $1 the price of all beers at Joe’s Bar that are under $3. Returns a count of number of changes made.

44

The Needed Declarations

CREATE FUNCTION JoeGouge( ) RETURNS int AS $$ DECLARE theBeer CHAR(20);DECLARE thePrice REAL;

DECLARE cnt INT;DECLARE c CURSOR FOR(SELECT beer, price FROM Sells WHERE bar = ’Joe’’s Bar’);

Used to holdbeer-price pairswhen fetchingthrough cursor c

Returns Joe’s menu

45

The Function BodyBEGIN

OPEN c; cnt := 0;

<<menuLoop>> LOOPFETCH c INTO theBeer, thePrice;IF NOT FOUND THEN LEAVE menuLoop END IF;IF thePrice < 3.00 THEN UPDATE Sells SET price = thePrice + 1.00 WHERE bar = ’Joe’’s Bar’ AND beer = theBeer; cnt := cnt+1;END IF;

END LOOP;CLOSE c;

RETURN cnt;END;

Check if the recentFETCH failed toget a tuple

If Joe charges less than $3 forthe beer, raise its price atJoe’s Bar by $1.

Disks and Files

Record Formats, Files & Indexing

47

Disks, Memory, and Files

Query Optimizationand Execution

Relational Operators

Files and Access Methods

Buffer Management

Disk Space Management

DB

The BIG picture…

48

Hierarchy of Storage

Cache is faster and more expensive than...

RAM is faster and more expensive than... DISK is faster and more expensive than... DVD is faster and more expensive than... Tape is faster and more expensive than...

If more information than will fit at one level, have to store some of it at the next level

Why Not Store Everything in Main Memory?

Costs too much. $100 will buy you either 4GB of RAM or 1TB of disk today.

Main memory is volatile. We want data to be saved between runs. (Obviously!)

Typical storage hierarchy: Main memory (RAM) for currently used data. Disk for the main database (secondary storage). Tapes for archiving older versions of the data

(tertiary storage).

Disks and Files

DBMS stores information on (“hard”) disks. This has major implications for DBMS

design! READ: transfer data from disk to main memory

(RAM). WRITE: transfer data from RAM to disk. Both are high-cost operations, relative to in-

memory operations, so must be planned carefully!

Disks Secondary storage device of choice. Main advantage over tapes: random

access vs. sequential. Data is stored and retrieved in units

called disk blocks or pages. Unlike RAM, time to retrieve a disk page

varies depending upon location on disk. Therefore, relative placement of pages on

disk has major impact on DBMS performance!

52

Pages and Blocks Data files decomposed into pages

Fixed size piece of contiguous information in the file

Unit of exchange between disk and main memory Disk divided into page size blocks of storage

Page can be stored in any block Application’s request for data satisfied by

Read data page to DBMS buffer in main memory Transfer requested data from buffer to application

Application’s request to change data satisfied by Update DBMS buffer (Eventually) copy buffer page to page on disk

Components of a Disk

Platters

The platters spin (7200rpm).

Spindle

The arm assembly is moved in or out to position a head on a desired track. Tracks under heads make a cylinder (imaginary!).

Disk head

Arm movement

Arm assembly

Only one head reads/writes at any one time.

Tracks

Sector

Block size is a multiple of sector size (which is fixed).

54

I/O Time to Access a Page Seek latency- time to position heads

over cylinder containing page (~ 10 - 20 ms)

Rotational latency - additional time for platters to rotate so that start of page is under head (~ 5 - 10 ms)

Transfer time - time for platter to rotate over page (depends on size of page)

Latency = seek latency + rotational latency

Goal - minimize average latency, reduce number of page transfers

Accessing a Disk Page Time to access (read/write) a disk block:

seek time (moving arms to position disk head on track) rotational delay (waiting for block to rotate under head) transfer time (actually moving data to/from disk surface)

Seek time and rotational delay dominate. Seek time varies from about 1 to 20msec Rotational delay varies from 0 to 10msec Transfer rate is about 1msec per 4KB page

Latency = seek latency + rotational latency Goal - minimize average latency, reduce

number of page transfers

56

Reducing Latency Store pages containing related information close

together on disk Justification: If application accesses x, it will next

access data related to x with high probability Page size tradeoff:

Large page size - data related to x stored in same page; hence additional page transfer can be avoided

Small page size - reduce transfer time, reduce buffer size in main memory

Typical page size - 4096 bytes

Arranging Pages on Disk `Next’ block concept:

blocks on same track, followed by blocks on same cylinder, followed by blocks on adjacent cylinder

Blocks in a file should be arranged sequentially on disk (by `next’), to minimize seek and rotational delay.

For a sequential scan, pre-fetching several pages at a time is a big win!

RAID

Disk Array: Arrangement of several disks that gives abstraction of a single, large disk.

Goals: Increase performance and reliability.

Two main techniques: Data striping: Data is partitioned; size of

a partition is called the striping unit. Partitions are distributed over several disks.

Redundancy: More disks => more failures. Redundant information allows reconstruction of data if a disk fails.

Recommended