SQL Tips and Techniques Paul Derouin Learning Consultant Teradata Learning


Page 1: SQL Tips & Techniques

8/8/2019 SQL Tips & Techniques

http://slidepdf.com/reader/full/sql-tips-techniques 1/47

SQL Tips and Techniques

Paul Derouin, Learning Consultant

Teradata Learning


Table of Contents

Using CASE Statement
Random Sampling
Dynamic SQL
Join and Aggregate Index
Timestamp Applications
Performance Reminders
Summary


Using Union For Set Tagging

Show the name of manager 1019 and the names of his direct reports.

SELECT first_name
      ,last_name
      ,'employee' AS "Employee//Type"
FROM employee
WHERE manager_employee_number = 1019
UNION
SELECT first_name
      ,last_name
      ,'manager'
FROM employee
WHERE employee_number = 1019
ORDER BY 2;

                          Employee
first_name   last_name    Type
----------   ---------    --------
Carol        Kanieski     employee
Ron          Kubic        manager
John         Stein        employee


Using Union For Set Tagging (Cont.)

SELECT first_name
      ,last_name
      ,'employee' AS "Employee//Type"
FROM employee
WHERE manager_employee_number = 1019
UNION
SELECT first_name
      ,last_name
      ,'manager'
FROM employee
WHERE employee_number = 1019
ORDER BY 2;

3) We do an all-AMPs RETRIEVE step from CUSTOMER_SERVICE.employee by way of an all-rows scan with a condition of ("CUSTOMER_SERVICE.employee.manager_employee_number = 1019") into Spool 1, which is redistributed by hash code to all AMPs. The size of Spool 1 is estimated with no confidence to be 3 rows. The estimated time for this step is 0.16 seconds.

4) We do a single-AMP RETRIEVE step from CUSTOMER_SERVICE.employee by way of the unique primary index "CUSTOMER_SERVICE.employee.employee_number = 1019" with no residual conditions into Spool 1, which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 1 by the sort key in spool field1 eliminating duplicate rows. The size of Spool 1 is estimated with high confidence to be 2 to 26 rows. The estimated time for this step is 0.15 seconds.

The total estimated time is 0.31 seconds.


Using CASE For Set Tagging

Show the name of manager 1019 and the names of his direct reports.

SELECT first_name
      ,last_name
      ,CASE WHEN manager_employee_number = 1019 THEN 'employee'
            WHEN employee_number = 1019 THEN 'manager'
            ELSE NULL
       END
FROM employee
WHERE employee_number = 1019
   OR manager_employee_number = 1019;

first_name   last_name    <CASE expression>
----------   ---------    -----------------
Carol        Kanieski     employee
Ron          Kubic        manager
John         Stein        employee


Using CASE For Set Tagging

SELECT first_name
      ,last_name
      ,CASE WHEN manager_employee_number = 1019 THEN 'employee'
            WHEN employee_number = 1019 THEN 'manager'
            ELSE NULL
       END
FROM employee
WHERE employee_number = 1019
   OR manager_employee_number = 1019;

3) We do an all-AMPs RETRIEVE step from CUSTOMER_SERVICE.employee by way of an all-rows scan with a condition of ("(CUSTOMER_SERVICE.employee.employee_number = 1019) OR (CUSTOMER_SERVICE.employee.manager_employee_number = 1019)") into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 4 rows. The estimated time for this step is 0.15 seconds.

The total estimated time is 0.15 seconds vs 0.31 for UNION.

Use of CASE requires only a single table scan.
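The comparison can be reproduced by prefixing either form with EXPLAIN. The sketch below assumes the same employee table; the AS title is an addition (borrowed from the UNION version) so that the column is no longer headed <CASE expression>. Estimated times depend on the system and on collected statistics, so the 0.15 and 0.31 second figures should not be expected to repeat exactly.

```sql
-- Sketch: ask the optimizer for its plan instead of running the query.
-- The column title is an illustrative addition, not part of the original slide.
EXPLAIN
SELECT first_name
      ,last_name
      ,CASE WHEN manager_employee_number = 1019 THEN 'employee'
            WHEN employee_number = 1019 THEN 'manager'
       END AS "Employee//Type"
FROM employee
WHERE employee_number = 1019
   OR manager_employee_number = 1019
ORDER BY 2;
```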


Reporting By Day of Week

Useful for business purposes

Teradata System Calendar provides day of week as a numeric

Requires a join to the System Calendar

Show the sales figures by day of week as seen below.

Day of Week    Sales
-----------    -------
Sunday         2950.00
Monday         2200.00
Tuesday        2000.00
Wednesday      2100.00
Thursday       2000.00
Friday         2450.00
Saturday       3250.00


Creating a Day of Week Table

CREATE TABLE day_of_week
(numeric_day BYTEINT
,char_day CHAR(9))
UNIQUE PRIMARY INDEX (numeric_day);

INSERT INTO day_of_week VALUES (1, 'Sunday');
INSERT INTO day_of_week VALUES (2, 'Monday');
INSERT INTO day_of_week VALUES (3, 'Tuesday');
INSERT INTO day_of_week VALUES (4, 'Wednesday');
INSERT INTO day_of_week VALUES (5, 'Thursday');
INSERT INTO day_of_week VALUES (6, 'Friday');
INSERT INTO day_of_week VALUES (7, 'Saturday');


Using a Day Of Week Table

Show the sales figures by day of week.

SELECT dw.char_day "Day of// Week"
      ,SUM(ds.sales) AS Sales
FROM daily_sales ds
    ,sys_calendar.calendar sc
    ,day_of_week dw
WHERE sc.calendar_date = ds.salesdate
  AND sc.day_of_week = dw.numeric_day
GROUP BY 1, dw.numeric_day
ORDER BY dw.numeric_day;

Day of Week    Sales
-----------    -------
Sunday         2950.00
Monday         2200.00
Tuesday        2000.00
Wednesday      2100.00
Thursday       2000.00
Friday         2450.00
Saturday       3250.00

• Requires joining three tables using two join conditions
• Day of Week table has only seven rows
• Total cost of this query is approx. .47


Using CASE Statement

SELECT CASE sc.day_of_week
         WHEN 1 THEN 'Sunday'
         WHEN 2 THEN 'Monday'
         WHEN 3 THEN 'Tuesday'
         WHEN 4 THEN 'Wednesday'
         WHEN 5 THEN 'Thursday'
         WHEN 6 THEN 'Friday'
         WHEN 7 THEN 'Saturday'
         ELSE 'Not Found'
       END AS "Day of// Week"
      ,SUM(ds.sales) AS Sales
FROM daily_sales ds
    ,sys_calendar.calendar sc
WHERE sc.calendar_date = ds.salesdate
GROUP BY 1, sc.day_of_week
ORDER BY sc.day_of_week;

Day of Week    Sales
-----------    -------
Sunday         2950.00
Monday         2200.00
Tuesday        2000.00
Wednesday      2100.00
Thursday       2000.00
Friday         2450.00
Saturday       3250.00

Same Result

Requires joining only two tables using one join condition

Total cost of this query is approx. .35


Duplicate RANDOM Values

Duplicate value likelihood may be reduced by increasing the size of the RANDOM interval relative to the size of the table.

Example: Assign a random number between 1 and 100 to each department.

SELECT department_number, RANDOM(1,100)
FROM department;

department_number  Random(1,100)
-----------------  -------------
              501             15
              301             19
              201             71
              600             75
              100             61
              402             41
              403             81
              302             31
              401             59

Note that no duplicates were generated in this run. Duplicates are unlikely, though still possible, because the pool of possible values is over ten times the number of rows to be assigned.


RANDOM Sampling

Consider the following distribution of employee salaries.

Salary Range      Count
--------------    -----
$0 to < $30K          6
$30K to < $40K        9
$40K to < $50K        4
$50K +                7

Problem: Select a sample representing two thirds of the employees making under $30,000. Use the RANDOM function to accomplish this.

SELECT employee_number, salary_amount
FROM employee
WHERE (salary_amount < 30000
  AND RANDOM(1,3) < 3);

employee_number  salary_amount
---------------  -------------
           1006       29450.00
           1023       26500.00
           1013       24500.00

Because RANDOM is evaluated independently for each row, the sample size varies from run to run. Here we end up with a 50% sample (3 out of 6) instead of a 67% sample (4 out of 6).


Using The SAMPLE Function

A sample of a single group can also be generated, and with more accuracy, using the SAMPLE function.

Solution 2:

SELECT employee_number, salary_amount
FROM employee
WHERE salary_amount < 30000
SAMPLE .67;

employee_number  salary_amount
---------------  -------------
           1006       29450.00
           1023       26500.00
           1008       29250.00
           1014       24500.00

4 out of 6 employees represents a 67% sample.
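When the target is a fixed number of rows rather than a fraction, SAMPLE also accepts a row count. The following is a minimal sketch against the same employee table; which four rows come back is expected to differ from run to run.

```sql
-- Sketch: request exactly 4 of the qualifying rows instead of a fraction.
SELECT employee_number, salary_amount
FROM employee
WHERE salary_amount < 30000
SAMPLE 4;
```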


SAMPLE Function For Multiple Samples

Permits use of percentage or row count specification.
Used rows are not reusable for subsequent sample sets.

By percentage:

SELECT department_number, SAMPLEID
FROM department
SAMPLE .25, .25, .50
ORDER BY SAMPLEID;

department_number  SampleId
-----------------  --------
              301         1
              403         1
              302         2
              401         2
              100         3
              402         3
              201         3
              600         3
              501         3

By row count:

SELECT department_number, SAMPLEID
FROM department
SAMPLE 3, 5, 8
ORDER BY SAMPLEID;

department_number  SampleId
-----------------  --------
              301         1
              403         1
              302         1
              401         2
              100         2
              402         2
              201         2
              501         2
              600         3


Complex RANDOM Sampling

The RANDOM function can be used multiple times in the same SELECT statement. It can be used to produce multiple samples, each using separate criteria.

Example: Create a sample consisting of approximately 67% from each of the under $50,000 salary ranges.

SELECT employee_number, salary_amount
FROM employee
WHERE (salary_amount < 30000
       AND RANDOM(1,3) < 3)
   OR (salary_amount BETWEEN 30001 AND 40000
       AND RANDOM(1,3) < 3)
   OR (salary_amount BETWEEN 40001 AND 50000
       AND RANDOM(1,3) < 3)
ORDER BY 2;

employee_number  salary_amount
---------------  -------------
           1014       24500.00
           1001       25525.00
           1023       26500.00
           1009       31000.00
           1005       31200.00
           1004       36300.00
           1003       37850.00
           1021       38750.00
           1020       39500.00
           1002       43100.00
           1024       43700.00
           1010       46000.00
           1007       49700.00

The result shows the following distribution:

Under $30,000: 3 out of 6 (50%)
Between $30,000 and $39,999: 6 out of 9 (67%)
Between $40,000 and $49,999: 4 out of 4 (100%)


Complex RANDOM Sampling (cont'd)

Changing the size of the RANDOM range can affect the size of the returned sample.

Example: Perform the same query but change the size of the RANDOM range to 100.

SELECT employee_number, salary_amount
FROM employee
WHERE (salary_amount < 30000
       AND RANDOM(1,100) < 68)
   OR (salary_amount BETWEEN 30001 AND 40000
       AND RANDOM(1,100) < 68)
   OR (salary_amount BETWEEN 40001 AND 50000
       AND RANDOM(1,100) < 68)
ORDER BY 2;

employee_number  salary_amount
---------------  -------------
           1013       24500.00
           1023       26500.00
           1005       31200.00
           1022       32300.00
           1004       36300.00
           1003       37850.00
           1007       49700.00

This result shows the following distribution:

Under $30,000: 2 out of 6 (33%)
Between $30,000 and $39,999: 4 out of 9 (44%)
Between $40,000 and $49,999: 1 out of 4 (25%)


Sample Sizing Issues

The larger the pool of rows to be drawn from, the closer one can get to achieving a specific percentage of rows in the sample.

SEL COUNT(*) FROM agent_sales
WHERE (sales_amt BETWEEN 20000 AND 39999);

Returns a count of exactly 100 rows

Each of the following examples attempts to return a 50% sample of the target rows.

SEL COUNT(*) FROM agent_sales
WHERE (sales_amt BETWEEN 20000 AND 39999) AND RANDOM(1,100) < 51;

Returns 58 rows or 58%

SELECT COUNT(*) FROM agent_sales
WHERE (sales_amt BETWEEN 20000 AND 39999) AND RANDOM(1,10) < 6;

Returns 53 rows or 53%

SELECT COUNT(*) FROM agent_sales
WHERE (sales_amt BETWEEN 20000 AND 39999) AND RANDOM(1,4) < 3;

Returns 50 rows or 50%

The smaller the RANDOM range is defined relative to the size of the pool of rows, the more accurately a specific percentage can be achieved.


Limitations On Use Of RANDOM

RANDOM is non-ANSI standard

RANDOM may be used in a SELECT list or a WHERE clause, but not both

RANDOM may be used in Updating, Inserting or Deleting rows

RANDOM may not be used with aggregate or OLAP functions

RANDOM cannot be referenced by numeric position in a GROUP BY or ORDER BY clause


V2R5 Sampling Features

Before V2R5:

Sampling without replacement

Proportional allocation

- each AMP provides same proportion of sample rows.

With V2R5:

Sampling with or without replacement (User choice)

Proportional allocation

- each AMP provides same proportion of sample rows.

Randomized allocation

- randomized across system - not AMP proportional.
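The two V2R5 options are written as keywords on the SAMPLE clause. The sketch below is an illustration against the department table used earlier, based on the documented clause form SAMPLE [WITH REPLACEMENT] [RANDOMIZED ALLOCATION]; the exact keywords should be checked against the release in use. With replacement, the same row may appear more than once in the result.

```sql
-- Sketch: both keywords are optional and may be used independently.
-- Omitting them gives the pre-V2R5 behavior described above
-- (without replacement, proportional allocation).
SELECT department_number, SAMPLEID
FROM department
SAMPLE WITH REPLACEMENT RANDOMIZED ALLOCATION 3, 5
ORDER BY SAMPLEID;
```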


Dynamic SQL and Static SQL

Dynamic SQL

- A technique for generating and executing SQL commands dynamically from a stored procedure at runtime.

Static SQL

- Pre-constructed SQL compiled into the stored procedure.
- May be parameterized.
- Still optimized prior to each execution.

Static SQL Example

REPLACE PROCEDURE static_sql (IN sal DEC(9,2), IN emp_num INT)
BEGIN
UPDATE emp1
SET salary_amount = :sal
WHERE employee_number = :emp_num;
END;

CALL static_sql(50000, 1018);


Dynamic SQL (cont'd)

Dynamic SQL Example

REPLACE PROCEDURE dyn_sql (IN col1 CHAR(15), IN val1 CHAR(10), IN emp_num CHAR(8))
BEGIN
CALL DBC.SysExecSQL('UPDATE emp1 SET ' || :col1 || ' = ' || :val1 ||
                    ' WHERE employee_number = ' || :emp_num);
END;

CALL dyn_sql('salary_amount','50000','1018');
/* Updates employee 1018 salary_amount to $50,000 */

CALL dyn_sql('job_code','567890','1018');
/* Updates employee 1018 job_code to 567890 */

Dynamic SQL
- Constructed as a concatenated character string
- Passed to DBC.SysExecSQL for execution
- May be subject to run-time errors
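One common source of such run-time errors is a character value that reaches the generated statement without quotes. The sketch below is a hypothetical variant, not part of the original deck: the procedure name dyn_sql_char and the use of a last_name column in emp1 are assumptions made for illustration. It doubles the single quotes inside the command string so that the value is quoted in the generated UPDATE.

```sql
-- Sketch only: dyn_sql_char is a hypothetical procedure, and emp1 is
-- assumed to have a character column named last_name.
-- Inside a string literal, two single quotes ('') yield one literal quote.
REPLACE PROCEDURE dyn_sql_char (IN col1 CHAR(15),
                                IN val1 VARCHAR(20),
                                IN emp_num CHAR(8))
BEGIN
CALL DBC.SysExecSQL('UPDATE emp1 SET ' || :col1 || ' = '''
                    || :val1 || ''' WHERE employee_number = ' || :emp_num);
END;

CALL dyn_sql_char('last_name', 'Smith', '1018');
/* Generates (ignoring blank padding of the CHAR parameters):
   UPDATE emp1 SET last_name = 'Smith' WHERE employee_number = 1018 */
```

Because the value is concatenated into the statement text, a value that itself contains a single quote would still break the generated command, so input of this kind needs to be checked before the call.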


Dynamic SQL (cont'd)

Restrictions

The following are restrictions on the use of Dynamic SQL within stored procedures:

The creating user must also be the owner of the procedure in order to have the right to use dynamic SQL.

The size of the SQL command string cannot exceed 32000.

Multi-statement requests are not supported.

The ending semi-colon is optional on the SQL command.

The following SQL statements cannot be used as dynamic SQL in stored procedures:

CALL                 SELECT
CREATE PROCEDURE     SELECT INTO
DATABASE             SET SESSION ACCOUNT
EXPLAIN              SET SESSION COLLATION
HELP                 SET SESSION DATEFORM
REPLACE PROCEDURE    SET TIME ZONE
SHOW


Join Indexes

A Join Index is an optional index which may be created by the user for one of the following three purposes:
- Pre-join multiple tables (Multi-table Join Index)
- Distribute the rows of a single table on the hash value of a foreign key value (Single-table Join Index)
- Aggregate one or more columns of a single table or multiple tables into a summary table (Aggregate Join Index)

If possible, the optimizer will use a Join Index rather than access tables directly

This typically will result in much better performance

Join Indexes are automatically updated as the table rows are updated

A Join Index may not be accessed directly

It is an option which the optimizer may choose if the index ‘covers’ the query
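The multi-table and aggregate forms are illustrated in the slides that follow. For the single-table form, a minimal sketch might look like the one below. It assumes the orders table from the Customer and Order Tables slide, and the index name ord_cust_ix is hypothetical.

```sql
-- Sketch: a single-table join index that keeps a copy of selected orders
-- columns distributed by the hash of the foreign key cust_id, so rows can
-- be joined to customer (primary index cust_id) without redistribution.
CREATE JOIN INDEX ord_cust_ix AS
SELECT cust_id, order_id, order_status, order_date
FROM orders
PRIMARY INDEX (cust_id);
```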


Customer and Order Tables

CREATE TABLE customer
( cust_id INTEGER NOT NULL
 ,cust_name CHAR(15)
 ,cust_addr CHAR(25) )
UNIQUE PRIMARY INDEX ( cust_id );

CREATE TABLE orders
( order_id INTEGER NOT NULL
 ,order_date DATE FORMAT 'yyyy-mm-dd'
 ,cust_id INTEGER
 ,order_status CHAR(1) )
UNIQUE PRIMARY INDEX ( order_id );

[Diagram: overlapping CUSTOMERS and ORDERS sets; 1 in CUSTOMERS only, 49 in both, 1 in ORDERS only]

49 valid customers have orders.
1 valid customer has no orders.
1 order has an invalid customer.


Single Table Query

How many orders have assigned customers?

SELECT COUNT(order_id)
FROM orders
WHERE cust_id IS NOT NULL;

Count(order_id)
---------------
             50

A join index will not help this query
The table ‘orders’ covers the query


Will Join Index Help?

How many orders have assigned valid customers?

SELECT COUNT(o.order_id)
FROM customer c INNER JOIN orders o
  ON c.cust_id = o.cust_id;

Count(order_id)
---------------
             49

A join index can help this query
Two tables are needed to cover the query

Query cost: .39 secs


Creating a Join Index

CREATE JOIN INDEX cust_ord_ix AS
SELECT (c.cust_id, cust_name)
      ,(order_id, order_status, order_date)
FROM customer c, orders o
WHERE c.cust_id = o.cust_id
PRIMARY INDEX (cust_id);

Fixed Portion         Variable Portion
CUST_ID  CUST_NAME    ORDER_ID  ORDER_STATUS  ORDER_DATE
-------  ---------    --------  ------------  ----------
1001     ABC Corp     501       C             990120
                      502       C             990220
                      503       C             990320
                      504       C             990420
                      505       C             990520
                      506       C             990620
1002     BCD Corp     507       C             990122
                      508       C             990222
                      509       C             990322
:        :            :


With Join Index

How many orders have assigned valid customers?

SELECT COUNT(o.order_id)
FROM customer c INNER JOIN orders o
  ON c.cust_id = o.cust_id;

Count(order_id)
---------------
             49

Same SQL query
Optimizer picks Join Index rather than doing a join
Join Index covers query

Without Join Index  .39 secs
With Join Index     .17 secs


Join Index Coverage

How many valid customers have assigned orders in January 1999?

SELECT COUNT(c.cust_id)
FROM customer c INNER JOIN orders o
  ON c.cust_id = o.cust_id
WHERE o.order_date BETWEEN 990101 AND 990131;

Count(cust_id)
--------------
             9

Without Join Index  .40 secs
With Join Index     .17 secs

Join Index

CUST_ID  CUST_NAME    ORDER_ID  ORDER_STATUS  ORDER_DATE
-------  ---------    --------  ------------  ----------
1001     ABC Corp     501       C             990120
                      502       C             990220
                      503       C             990320
                      504       C             990420
                      505       C             990520
                      506       C             990620
1002     BCD Corp     507       C             990122
                      508       C             990222
                      509       C             990322

Order_date is part of Join Index
Join Index covers query
Optimizer picks Join Index


Join Index Comparison

Name the valid customers who have open orders in January 1999.

SELECT c.cust_name
FROM customer c INNER JOIN orders o
  ON c.cust_id = o.cust_id
WHERE o.order_date BETWEEN 990101 AND 990131
  AND o.order_status = 'O';

cust_name
---------
JKL Corp

Without Join Index  .23 secs
With Join Index     .15 secs

Join Index

CUST_ID  CUST_NAME    ORDER_ID  ORDER_STATUS  ORDER_DATE
-------  ---------    --------  ------------  ----------
1001     ABC Corp     501       C             990120
                      502       C             990220
                      503       C             990320
                      504       C             990420
                      505       C             990520
                      506       C             990620
1002     BCD Corp     507       C             990122
                      508       C             990222
                      509       C             990322

All referenced columns part of join index
Join Index covers query
Optimizer picks Join Index
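For contrast, a query that references a column outside the join index is not covered by it. The sketch below is an illustration rather than a measured case: cust_addr is defined in the customer table but is not among the columns of cust_ord_ix, so the index alone cannot answer the query. Whether the optimizer then reads the base tables or combines the index with a base table depends on the release and on statistics, which is why the example asks for the plan with EXPLAIN.

```sql
-- Sketch: cust_addr is not stored in cust_ord_ix, so the join index
-- does not cover this query. EXPLAIN shows which access path is chosen.
EXPLAIN
SELECT c.cust_name, c.cust_addr
FROM customer c INNER JOIN orders o
  ON c.cust_id = o.cust_id
WHERE o.order_status = 'O';
```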


Aggregate Join Indexes

Aggregate Join Indexes are:

• Designed for queries which use counts, sums and averages

• Built from aggregated data, optionally extracted by month or year

• An alternative to summary tables

• Automatically updated as base tables change

• An option for the optimizer when the index covers the query

• Not compatible with MultiLoad or FastLoad
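One way to work around the load-utility restriction in the last bullet is to drop the index before the bulk load and rebuild it afterwards. The following is a sketch of that sequence, not a procedure taken from the deck; it reuses the monthly_sales definition from the Creating An Aggregate Index slide, and the load step itself is only indicated by a comment.

```sql
-- Sketch: remove the aggregate join index before a MultiLoad or FastLoad job.
DROP JOIN INDEX monthly_sales;

-- (run the bulk load against daily_sales here)

-- Rebuild the index from the reloaded base table.
CREATE JOIN INDEX monthly_sales AS
SELECT itemid AS Item
      ,EXTRACT(YEAR FROM salesdate) AS Yr
      ,EXTRACT(MONTH FROM salesdate) AS Mon
      ,SUM(sales) AS SumSales
FROM daily_sales
GROUP BY 1,2,3;
```

Queries issued while the index is absent still return the same results; they are simply planned against daily_sales.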


Traditional Aggregation

SELECT EXTRACT(YEAR FROM salesdate) AS Yr
      ,EXTRACT(MONTH FROM salesdate) AS Mon
      ,SUM(sales)
FROM daily_sales
WHERE itemid = 10 AND Yr IN ('1997', '1998')
GROUP BY 1,2
ORDER BY 1,2;

  Yr  Mon  Sum(sales)
----  ---  ----------
1997    1     2150.00
1997    2     1950.00
1997    8     1950.00
1997    9     2100.00
1998    1     1950.00
1998    2     2100.00
1998    8     2200.00
1998    9     2550.00

Explanation
--------------------------------------------------------------------------
1) First, we do a SUM step to aggregate from PED1.daily_sales by way of the primary index "PED1.daily_sales.itemid = 10" with a residual condition of ("((EXTRACT(YEAR FROM (PED1.daily_sales.salesdate ))) = 1997) OR ((EXTRACT(YEAR FROM (PED1.daily_sales.salesdate ))) = 1998)"), and the grouping identifier in field 1. Aggregate Intermediate Results are computed locally, then placed in Spool 2. The size of Spool 2 is estimated with high confidence to be 1 to 1 rows.


Creating An Aggregate Index

CREATE SET TABLE daily_sales, NOFALLBACK,
(itemid INTEGER
,salesdate DATE FORMAT 'YY/MM/DD'
,sales DECIMAL(9,2))
PRIMARY INDEX ( itemid );

CREATE JOIN INDEX monthly_sales AS
SELECT itemid AS Item
      ,EXTRACT(YEAR FROM salesdate) AS Yr
      ,EXTRACT(MONTH FROM salesdate) AS Mon
      ,SUM(sales) AS SumSales
FROM daily_sales
GROUP BY 1,2,3;


Query Using Aggregate Index

SELECT EXTRACT(YEAR FROM salesdate) AS Yr
      ,EXTRACT(MONTH FROM salesdate) AS Mon
      ,SUM(sales)
FROM daily_sales
WHERE itemid = 10 AND Yr IN ('1997', '1998')
GROUP BY 1,2
ORDER BY 1,2;

  Yr  Mon  Sum(sales)
----  ---  ----------
1997    1     2150.00
1997    2     1950.00
1997    8     1950.00
1997    9     2100.00
1998    1     1950.00
1998    2     2100.00
1998    8     2200.00
1998    9     2550.00

Explanation
-----------------------------------------------------------------------
1) First, we do a SUM step to aggregate from join index table PED1.monthly_sales by way of the primary index "PED1.monthly_sales.Item = 10", and the grouping identifier in field 1. Aggregate Intermediate Results are computed locally, then placed in Spool 2. The size of Spool 2 is estimated with low confidence to be 4 to 4 rows.


Join Index Summary

A Join Index:

Is a denormalization tool
Pre-joins existing tables
Aggregates existing columns
Can improve performance for covered queries
Can join more than two tables
Can use inner, outer and cross joins
Costs additional disk space
Costs additional maintenance processing for updates
Cannot be accessed directly by SQL
Is a choice for the optimizer


ANSI Timestamp

Timestamp combines date and time into a single column.

TIMESTAMP(n) - Where n=(0-6)

Consists of 6 fields of information:
YEAR, MONTH, DAY, HOUR, MINUTE, SECOND

Internal format is DATE (4 bytes) + TIME (6 bytes) = 10 bytes

                Timestamp representation      Character conversion
TIMESTAMP(0)    2001-12-07 11:37:58           CHAR(19)
TIMESTAMP(6)    2001-12-07 11:37:58.213000    CHAR(26)

CREATE TABLE tblb (tmstampb TIMESTAMP);
INSERT INTO tblb (CURRENT_TIMESTAMP);
SELECT * FROM tblb;

tmstampb
--------------------------
2001-11-06 13:48:38.580000


Timestamp + Interval

Timestamp may be combined with any year-month or day-time interval to produce a new timestamp.

TIMESTAMP + interval = TIMESTAMP, where the interval is one of:

YEAR, YEAR TO MONTH, MONTH,
DAY, DAY TO HOUR, DAY TO MINUTE, DAY TO SECOND,
HOUR, HOUR TO MINUTE, HOUR TO SECOND,
MINUTE, MINUTE TO SECOND, SECOND

Subtract 1 hr, 20 mins and 10 secs from the designated timestamp:

SELECT TIMESTAMP '1999-10-01 09:30:22'
     - INTERVAL '01:20:10' HOUR TO SECOND;

1999-10-01 08:10:12

Subtract 2 yrs and 6 mos from the designated timestamp:

SELECT TIMESTAMP '1999-10-01 09:30:22'
     - INTERVAL '2-06' YEAR TO MONTH;

1997-04-01 09:30:22
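Addition works the same way as the subtraction examples. As a short sketch using the same starting timestamp, the first query adds a single-field interval and the second adds a multi-field day-time interval.

```sql
-- Add 3 days to the designated timestamp.
SELECT TIMESTAMP '1999-10-01 09:30:22'
     + INTERVAL '3' DAY;
-- 1999-10-04 09:30:22

-- Add 1 day and 12 hours using a DAY TO MINUTE interval.
SELECT TIMESTAMP '1999-10-01 09:30:22'
     + INTERVAL '1 12:00' DAY TO MINUTE;
-- 1999-10-02 21:30:22
```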


Timestamp Subtraction

TIMESTAMP - TIMESTAMP = interval, where the interval is one of:

YEAR, YEAR TO MONTH, MONTH,
DAY, DAY TO HOUR, DAY TO MINUTE, DAY TO SECOND,
HOUR, HOUR TO MINUTE, HOUR TO SECOND,
MINUTE, MINUTE TO SECOND, SECOND

Given the following two timestamps, calculate the difference between them as directed:

In months?

SELECT (TIMESTAMP '1999-10-20 10:25:40' -
        TIMESTAMP '1998-09-19 08:20:00') MONTH;

13

In years?

SELECT (TIMESTAMP '1999-10-20 10:25:40' -
        TIMESTAMP '1998-09-19 08:20:00') YEAR;

1

In days?

SELECT (TIMESTAMP '1999-10-20 10:25:40' -
        TIMESTAMP '1998-09-19 08:20:00') DAY(3);

396
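Each of the single-field results above drops the remainder below the requested unit. A multi-field interval keeps it. The sketch below applies DAY(3) TO SECOND(0) to the same two timestamps; the comment describes the expected value in words because the exact display format depends on the client.

```sql
-- Sketch: keep the hours, minutes and seconds that DAY(3) alone discards.
SELECT (TIMESTAMP '1999-10-20 10:25:40' -
        TIMESTAMP '1998-09-19 08:20:00') DAY(3) TO SECOND(0);
-- 396 days, 2 hours, 5 minutes and 40 seconds
```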


Using Timestamp In An Application

CREATE TABLE Repair_time
( serial_number INTEGER
 ,product_desc CHAR(8)
 ,start_time TIMESTAMP(0)
 ,end_time TIMESTAMP(0))
UNIQUE PRIMARY INDEX (serial_number);

SELECT * FROM Repair_time ORDER BY 1;

serial_number  product_desc  start_time           end_time
-------------  ------------  -------------------  -------------------
          100  TV            2000-01-15 10:30:00  2000-01-17 13:20:00
          101  TV            2000-01-20 08:30:00  2000-01-23 12:20:00
          102  TV            2000-01-25 13:40:00  2000-01-26 14:20:00
          103  TV            2000-02-02 11:30:00  2000-02-09 08:50:00
          104  TV            2000-02-07 09:00:00  2000-02-10 08:50:00
          105  TV            2000-02-10 08:40:00  2000-02-12 14:50:00
          106  TV            2000-02-15 12:30:00  2000-02-20 15:20:00
          107  TV            2000-02-19 14:30:00  2000-02-21 10:50:00
          108  TV            2000-02-21 11:30:00  2000-02-23 16:40:00


Calculating Time Intervals

Produce a report showing each TV by serial number and how long in days, hours and minutes it took to repair the TV.

SELECT serial_number
      ,(end_time - start_time) DAY TO MINUTE AS work_time
FROM Repair_time
ORDER BY 1;

serial_number  work_time
-------------  ---------
          100    2 02:50
          101    3 03:50
          102    1 00:40
          103    6 21:20
          104    2 23:50
          105    2 06:10
          106    5 02:50
          107    1 20:20
          108    2 05:10

What is the average amount of time it takes to repair a TV? Show the answer in days, hours and minutes.

SELECT AVG( (end_time - start_time) DAY TO MINUTE)
       AS avg_repair_time
FROM Repair_time;

avg_repair_time
---------------
        3 01:40


Comparing Intervals

Show the serial number and the number of days required for each TV that took longer than 2 days to repair.

SELECT serial_number
      ,(end_time - start_time) DAY TO MINUTE AS #_DaysHrsMns
FROM Repair_time
WHERE #_DaysHrsMns > INTERVAL '02 00:00' DAY TO MINUTE;

serial_number  #_DaysHrsMns
-------------  ------------
          106       5 02:50
          101       3 03:50
          108       2 05:10
          100       2 02:50
          104       2 23:50
          103       6 21:20
          105       2 06:10
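The query above refers to a select-list alias inside the WHERE clause, which Teradata accepts as an extension. A more portable sketch repeats the interval expression in the WHERE clause instead. The alias repair_time and the ORDER BY are illustrative additions, not part of the original slide.

```sql
-- Sketch: same filter, with the interval expression repeated in WHERE
-- rather than referenced through the column alias.
SELECT serial_number
      ,(end_time - start_time) DAY TO MINUTE AS repair_time
FROM Repair_time
WHERE (end_time - start_time) DAY TO MINUTE
      > INTERVAL '02 00:00' DAY TO MINUTE
ORDER BY 2 DESC;  -- longest repairs first
```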


Advanced Use of Timestamp - Example 1

Produce a list which pairs by serial number any two TVs that were being repaired at the same time.

SELECT a.serial_number, b.serial_number
FROM Repair_time a CROSS JOIN Repair_time b
WHERE (a.start_time, a.end_time) OVERLAPS
      (b.start_time, b.end_time)
  AND a.serial_number < b.serial_number;

serial_number  serial_number
-------------  -------------
          106            107
          103            104
          104            105

Alternative Approach Using DISTINCT

SELECT DISTINCT a.serial_number, b.serial_number
FROM Repair_time a CROSS JOIN Repair_time b
WHERE (a.start_time, a.end_time) OVERLAPS
      (b.start_time, b.end_time);

Note that without the a.serial_number < b.serial_number condition, this form also returns each TV paired with itself and each overlapping pair in both orders.


Advanced Use of Timestamp - Example 2

What percentage of all TVs took 2 or more days to repair?

Incorrect Answer

SELECT (100 * COUNT(serial) / cnt) (FORMAT '99%')
FROM (SELECT COUNT(*) FROM Repair_time) AS temp1(cnt)
    ,(SELECT serial_number
            ,(end_time - start_time) DAY AS Num_Days
      FROM Repair_time
      WHERE Num_Days > INTERVAL '02' DAY) AS temp2(serial, Number_days)
GROUP BY cnt;

((100*Count(serial))/cnt)
-------------------------
                      33%

Correct Answer

SELECT (100 * COUNT(serial) / cnt) (FORMAT '99%')
FROM (SELECT COUNT(*) FROM Repair_time) AS temp1(cnt)
    ,(SELECT serial_number
            ,(end_time - start_time) DAY TO MINUTE AS Num_Days
      FROM Repair_time
      WHERE Num_Days > INTERVAL '02 00:00' DAY TO MINUTE) AS temp2(serial, Number_days)
GROUP BY cnt;

((100*Count(serial))/cnt)
-------------------------
                      78%
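The difference between the two answers comes from interval precision: a DAY interval keeps only whole days, so a repair of 2 days and 5 hours counts as exactly 2 days and fails a "greater than 2 days" test. The sketch below shows both precisions side by side for the Repair_time table; the aliases whole_days and days_hrs_mins are illustrative names.

```sql
-- Sketch: compare the two interval precisions for each TV.
SELECT serial_number
      ,(end_time - start_time) DAY           AS whole_days
      ,(end_time - start_time) DAY TO MINUTE AS days_hrs_mins
FROM Repair_time
ORDER BY 1;
-- For serial_number 108, whole_days is 2 while days_hrs_mins is 2 05:10,
-- so a test of "> INTERVAL '02' DAY" on whole_days excludes that row.
```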


Performance Reminders

• Consider use of CASE for small set values testing

• Use appropriate sampling functions - RANDOM or SAMPLE

• Use Dynamic SQL with Stored Procedures

• Join indexes can help query performance by pre-joining tables

• Aggregate indexes are preferable to aggregated views or tables

• Use TIMESTAMP and INTERVALS for time-related processing


Summary

• SQL is a very versatile language

• Usually, if there’s a will, there’s a way

• Often there are several ways to write a query

• Find the one that performs best, using EXPLAIN