USING END-OF-TIME ACTIVE DATE SEMANTICS TO IMPROVE PERFORMANCE

By Donald Bales, Rails Practice Lead

9/15/2016

HS2 Solutions confidential and proprietary.



Underlying almost every Rails application is a relational database management system. Let me show you how important it is to apply some fundamental time rules to your application's database in order to get the best response times possible.

ABSTRACT


For 20 years, we have helped some of the best brand and eCommerce companies leverage the internet and digital marketing.

We value smart engineering and team members that collaborate well internally as well as with our clients and their agencies.

Leveraging Technology & Intelligence to Drive Results

WHO WE ARE

HS2 HISTORY

• 1994: Software Development. Hollyer & Schwartz (H&S)

• 1999: eBusiness. H&S acquired by XOR, Inc.

• 2001: Precision Marketing. XOR merges with Seurat

• 2003: HS2 Formed. Seurat acquired by Fair Isaac. HS2 Solutions formed.

Hollyer & Schwartz was founded in 1994 as a software development and systems integration company. The core team has evolved and grown together for over 15 years into a full-service eBusiness and Precision Marketing company.

WHO WE WORK WITH


WHAT WE DO

ECOMMERCE, WEB & MOBILE DEVELOPMENT

ANALYTICS & INSIGHTS

EXPERIENCE DESIGN (UX/UI)

INTERACTIVE MARKETING

“The obvious is always illusive”

DONALD BALES


• There’s not a bit of Ruby or Rails code

• Fundamentals

• Structured Query Language (SQL)

• Data Definition Language (DDL)

• Data Manipulation Language (DML)

DISCLAIMER

PROGRAMMER ADVISORY: EXPLICIT CONTENT


don=# create table test_integer (an_integer integer);

CREATE TABLE

don=# insert into test_integer (an_integer) values (1.5);

INSERT 0 1

don=# select * from test_integer;

an_integer

------------

?

WARM-UP EXERCISE QUESTION


don=# select * from test_integer;

an_integer

------------

2

(1 row)

WARM-UP EXERCISE ANSWER


where start_date <= CURRENT_DATE

and end_date >= CURRENT_DATE

V.

where start_date <= CURRENT_DATE

and (end_date >= CURRENT_DATE or end_date is NULL)

WHAT’S THE DIFFERENCE BETWEEN THESE?


where CURRENT_DATE between start_date and end_date

V.

where start_date <= CURRENT_DATE

and (end_date >= CURRENT_DATE or end_date is NULL)

WHAT’S THE DIFFERENCE BETWEEN THESE?


ANSWER: EFFICIENCY AND PERFORMANCE


In this context we are talking about determining if something is active at some point in time by comparing that point in time against an item's start and end dates.

So if I have a time line:

WHAT DOES IT MEAN TO BE ACTIVE?

[Timeline: start (Jan 1) and end (Mar 31) bound the active period, which is followed by an inactive period; a point in time is marked.]


We can say that at this point in time, that is, when the point in time is between the start and end dates, the item is active.



In SQL:

where CURRENT_DATE between start_date and end_date


But what do we do if we don't know when an item will become inactive?

The typical and intuitive programming solution is not to specify an end date:

WHAT IF WE DON’T KNOW THE END DATE?

[Timeline: the active period begins at start (Jan 1) and has no end; a point in time is marked.]


Name                            Null?    Type
------------------------------- -------- ----------------------
ID                              NOT NULL NUMBER(38)
CODE                            NOT NULL VARCHAR2(30)
DESCRIPTION                              VARCHAR2(4000)
START_DATE                      NOT NULL DATE
END_DATE                                 DATE


In SQL:

where start_date <= CURRENT_DATE

and (end_date >= CURRENT_DATE or end_date is NULL)


But using this semantic for “active” is wholly inefficient for queries against a database. Is there a more efficient yet equivalent way to represent no end date?


Yes! We can substitute a code-able notion of the end of time: 12/31/9999

Now the item is still active at this moment and through the end of time, but its end date is no longer NULL! And that makes all the difference.

USE A KNOWN VALUE TO REPRESENT THE END-OF-TIME

[Timeline: the active period runs from start (Jan 1) to end at end-of-time (Dec 31, 9999); a point in time is marked.]


• How about December 31, 9999 or 12/31/9999

• It works for all these:

• DB2

• MariaDB/MySQL

• Microsoft SQL Server

• Oracle

• PostgreSQL

• Sybase

DEFINE AN END-OF-TIME FOR UNKNOWN END DATES


Name                            Null?    Type
------------------------------- -------- ----------------------
ID                              NOT NULL NUMBER(38)
CODE                            NOT NULL VARCHAR2(30)
DESCRIPTION                              VARCHAR2(4000)
START_DATE                      NOT NULL DATE
END_DATE                        NOT NULL DATE


In SQL:

where CURRENT_DATE between start_date and end_date


In the future, when someone wants to make the item truly inactive, they update the end date to a non-end-of-time value:

[Timeline: start (Jan 1), end (Mar 31), end-of-time (Dec 31, 9999); the active period is now followed by an inactive period; a point in time is marked.]
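The end-of-time approach above can be sketched end to end. This is a minimal illustration rather than the presenter's code: it uses Python's built-in sqlite3 module instead of the Oracle and PostgreSQL systems shown in the deck, stores dates as ISO-8601 text so that string order matches date order, and the `items` table, the `widget` row, and the `is_active` helper are all hypothetical.

```python
import sqlite3

# Assumption: 12/31/9999 written as ISO-8601 text, so text comparison orders like dates.
END_OF_TIME = "9999-12-31"

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table items ("
    " id integer primary key,"
    " code text not null,"
    " start_date text not null,"
    # end_date is NOT NULL; an unknown end date defaults to end-of-time.
    f" end_date text not null default '{END_OF_TIME}')"
)
# No end date supplied, so the row is stored with the end-of-time value.
conn.execute("insert into items (code, start_date) values ('widget', '2016-01-01')")


def is_active(code, as_of):
    """Return True if the item is active on as_of, using a plain BETWEEN."""
    row = conn.execute(
        "select count(*) from items"
        " where code = ? and ? between start_date and end_date",
        (code, as_of),
    ).fetchone()
    return row[0] == 1


print(is_active("widget", "2016-09-15"))  # active through end-of-time

# Later, someone makes the item inactive by setting a real end date.
conn.execute("update items set end_date = '2016-03-31' where code = 'widget'")
print(is_active("widget", "2016-09-15"))  # no longer active on this date
```

The query never needs an `is NULL` branch, because every row carries a comparable end date.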


Using end-of-time semantics for end date instead of no end date is extremely important if one is concerned about efficiency and performance.

Why?

That's what we will discuss in the remainder of this presentation.

WANT EFFICIENCY AND PERFORMANCE?


LET’S REVIEW


With a nullable end date, an entry is active if its start date is in the past or is the current moment, and its end date is either the current moment, in the future, or NULL (no end date). In SQL:

where CURRENT_DATE >= start_date

and (CURRENT_DATE <= end_date or end_date is NULL)

WITH A NULLABLE END DATE
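A small runnable sketch of the nullable-end-date predicate, for comparison with the end-of-time form. It assumes Python's built-in sqlite3 module rather than the databases used in the deck, ISO-8601 text dates, and a hypothetical `items` table with three made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table items ("
    " id integer primary key,"
    " code text not null,"
    " start_date text not null,"  # ISO-8601 text, so string order matches date order
    " end_date text)"             # nullable: NULL means "no end date"
)
conn.executemany(
    "insert into items (code, start_date, end_date) values (?, ?, ?)",
    [
        ("ended", "2016-01-01", "2016-03-31"),
        ("open_ended", "2016-01-01", None),
        ("not_started", "2017-01-01", None),
    ],
)


def active_codes(as_of):
    """Return the codes active on as_of using the nullable-end-date predicate."""
    rows = conn.execute(
        "select code from items"
        " where start_date <= ?"
        " and (end_date >= ? or end_date is null)"
        " order by code",
        (as_of, as_of),
    )
    return [code for (code,) in rows]


print(active_codes("2016-02-15"))
print(active_codes("2016-09-15"))
```

Dropping the `or end_date is null` branch would silently exclude every open-ended row, which is why the extra condition cannot be omitted in this design.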


This is not optimal: a NULL value typically cannot be indexed, so the database has to do a full table scan, or an index scan against the start date if that column is indexed.


An optimal way to state that something is active is to populate both the start and end date. By setting the end date to the end of code-able time, say 12/31/9999, determining if an entry is active is done by testing if the start date is in the past or the current moment and the end date is in the current moment or the future. In SQL:

where CURRENT_DATE >= start_date

and CURRENT_DATE <= end_date

AN OPTIMAL APPROACH


Another way to write it in SQL:

where CURRENT_DATE between start_date and end_date


• NULL values typically can’t be indexed

• This leads to full table scans or

• This leads to full index scans or

• This leads to partial index scans with partial table scans

• Full and partial table scans flush the database’s buffers (cache)

• Unnecessary work consumes capacity that may be needed elsewhere

• As table and index size grow, queries slow

WHY ARE NULL VALUES A PROBLEM?


• An example using a dictionary

• Find by Definition (full scan): “free from legal or discriminatory restrictions”

• Start at the first page

• Read definitions

• Compare

• Match? You’ve found it, so you’re done, else go to next page…

FULL TABLE SCANS

Found It!
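The find-by-definition steps above amount to a linear search. A minimal sketch follows; the miniature dictionary and its definitions are invented for illustration, and `find_by_definition` is a hypothetical helper name.

```python
# Hypothetical miniature dictionary: (word, definition) pairs in page order.
ENTRIES = [
    ("apple", "a round fruit"),
    ("index", "an ordered list of keys"),
    ("source", "a place of origin"),
    ("table", "a set of rows"),
]


def find_by_definition(entries, wanted):
    """Full scan: start at the first entry and read each definition in turn."""
    comparisons = 0
    for word, definition in entries:
        comparisons += 1
        if definition == wanted:
            return word, comparisons  # match: found it, done
    return None, comparisons  # read everything without a match


print(find_by_definition(ENTRIES, "a set of rows"))
```

The number of comparisons grows in step with the number of entries, which is the behavior a full table scan shows as a table grows.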


• Find by Word (index scan) - “source”

• Start anywhere

• Read word

• Compare

• Match? You’ve found it! You’re done

• Less than, jump forward and compare again

• Greater than, jump backward and compare again

INDEX SCANS

Found It!
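The find-by-word steps above describe a search over sorted keys. The sketch below uses a binary search as a simple stand-in; real database indexes are usually B-trees, so this is an analogy for the jump-and-compare idea rather than a description of any database's implementation. The word list and `find_by_word` helper are hypothetical.

```python
# Hypothetical sorted word list standing in for an index.
WORDS = ["apple", "index", "query", "source", "table"]


def find_by_word(words, wanted):
    """Search a sorted list by repeatedly halving the range to inspect."""
    lo, hi = 0, len(words)
    comparisons = 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if words[mid] == wanted:
            return mid, comparisons  # match: found it, done
        if words[mid] < wanted:
            lo = mid + 1  # word read is less than the target: jump forward
        else:
            hi = mid      # word read is greater than the target: jump backward
    return None, comparisons


print(find_by_word(WORDS, "source"))

# With a million sorted keys the search still needs only a handful of comparisons.
big = [f"w{i:07d}" for i in range(1_000_000)]
print(find_by_word(big, "w0999999")[1])
```

Because each comparison discards half of the remaining keys, the work grows with the logarithm of the size rather than with the size itself.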

PERFORMANCE AS A FACTOR OF SIZE

[Chart: FULL TABLE SCAN. Series: Full Scan. Axes: rows (ticks to 10,000,000) and milliseconds (ticks to 35,000).]

“As tables grow, queries slow.”

PERFORMANCE AS A FACTOR OF ORDER

[Chart: UNCACHED INDEX RANGE SCANS. Series: Full Scan, Start-End Uncached, Start-Null End Uncached. Axis ticks run to 10,000,000 and to 35,000.]

Look at how poorly that null-able end date performs

PERFORMANCE AS A FACTOR OF CACHE

That is, using memory I/O instead of physical disk I/O

• NOTE: Full table scans flush cache, making it useless

PERFORMANCE AS A FACTOR OF CACHE

[Chart: CACHED INDEX RANGE SCANS. Series: Full Scan, Start-End Uncached, Start-Null End Uncached, Start-End Cached, Start-Null End Cached. Axis ticks run to 10,000,000 and to 35,000.]

A full table scan is faster than that null-able end date


LET’S TAKE A CLOSER LOOK

PERFORMANCE AS A FACTOR OF SIZE

[Chart: FULL TABLE SCAN. Series: Full Scan. Axis ticks run to 1,000 and to 7.]

PERFORMANCE AS A FACTOR OF ORDER

[Chart: UNCACHED INDEX RANGE SCANS. Series: Full Scan, Start-End Uncached, Start-Null End Uncached. Axis ticks run to 1,000 and to 7.]

PERFORMANCE AS A FACTOR OF CACHE

[Chart: CACHED INDEX RANGE SCANS. Series: Full Scan, Start-End Uncached, Start-Null End Uncached, Start-End Cached, Start-Null End Cached. Axis ticks run to 1,000 and to 7.]


AN EVEN CLOSER LOOK!

PERFORMANCE AS A FACTOR OF CACHE

[Chart: CACHED INDEX RANGE SCANS. Series: Full Scan, Start-End Uncached, Start-Null End Uncached, Start-End Cached, Start-Null End Cached. Axis ticks run to 100 and to 7.]

PERFORMANCE AS A FACTOR OF CACHE

[Chart: CACHED INDEX RANGE SCANS. Series: Full Scan, Start-End Uncached, Start-Null End Uncached, Start-End Cached, Start-Null End Cached. Axis ticks run to 25 and to 7.]


PERFORMANCE CHARACTERISTICS AND TIME

• In the beginning:

• Database is small

• Tables are small

• Indexes are small

• Temporal width is small

• Queries are fast

PERFORMANCE CHARACTERISTICS CHANGE OVER TIME

“Boy! This application is great!”

YOUR APPLICATION USERS

• As time goes by:

• Database gets larger

• Tables get larger

• Indexes get larger

• Temporal width gets larger

• Queries take longer

PERFORMANCE CHARACTERISTICS CHANGE OVER TIME

“*Sigh* This application sucks!”

ANONYMOUS

• Now that you have index-able data, you need optimal indexes

• Indexes speed up queries

• Always on the Primary Key

• Almost always on Foreign Keys

• As needed on temporal columns

• You can’t index every column

• Indexes slow down inserts and affected updates

• An index’s value is proportional to its selectivity

MORE ON THAT OPTIMAL APPROACH


• start_date, end_date? – the intuitive choice!

• end_date, start_date?

• How much history will you keep around?

• A whole-heck-of-a-lot? Then end_date, start_date

• Only Recent? Then start_date, end_date

SELECTIVITY
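The reasoning behind the column order can be illustrated by counting how many rows each column narrows the search to on its own. This sketch makes several assumptions: Python's built-in sqlite3 module in place of the deck's databases, an invented history of 1,000 items of which only the newest 50 are open-ended, and a hypothetical index name. SQLite's planner differs from Oracle's, so the printed plan is informational only.

```python
import sqlite3
from datetime import date, timedelta

END_OF_TIME = "9999-12-31"  # assumption: 12/31/9999 as ISO-8601 text

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table items ("
    " id integer primary key,"
    " start_date text not null,"
    " end_date text not null)"
)

# Hypothetical history: one item starts per day from 2010-01-01.
# The first 950 ended 30 days after starting; the last 50 run to end-of-time.
first = date(2010, 1, 1)
rows = []
for i in range(1000):
    start = first + timedelta(days=i)
    end = END_OF_TIME if i >= 950 else (start + timedelta(days=30)).isoformat()
    rows.append((start.isoformat(), end))
conn.executemany("insert into items (start_date, end_date) values (?, ?)", rows)

as_of = "2016-09-15"

# Rows each column alone leaves as candidates for "active on as_of".
started = conn.execute(
    "select count(*) from items where start_date <= ?", (as_of,)
).fetchone()[0]
not_ended = conn.execute(
    "select count(*) from items where end_date >= ?", (as_of,)
).fetchone()[0]
print(started, not_ended)

# With lots of history, end_date is the more selective leading column.
conn.execute("create index items_end_start on items (end_date, start_date)")
active = conn.execute(
    "select count(*) from items where ? between start_date and end_date", (as_of,)
).fetchone()[0]
print(active)

# Show whatever plan SQLite chooses; no particular plan is assumed here.
for step in conn.execute(
    "explain query plan"
    " select count(*) from items where ? between start_date and end_date",
    (as_of,),
):
    print(step)
```

In this made-up data set every row has already started, so a leading start_date prunes nothing, while a leading end_date narrows the candidates to the 50 rows that have not yet ended.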

PERFORMANCE AS A FACTOR OF CACHE

[Chart: CACHED INDEX RANGE SCANS. Series: Full Scan, Start-End Cached, Start-Null End Cached, End-Start Cached, Null End-Start Cached. Axis ticks run to 10,000,000 and to 35,000.]

The combination of start date and null end date is off the chart!

PERFORMANCE AS A FACTOR OF CACHE

[Chart: CACHED INDEX RANGE SCANS. Series: Full Scan, Start-End Cached, Start-Null End Cached, End-Start Cached, Null End-Start Cached. Axis ticks run from 1,000,000 to 10,000,000 and to 300.]

The combination of end date and start date performs the best!

NULL END DATE, START DATE

-------------------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |    16 |   10104 (1) | 00:00:01 |
|   1 | SORT AGGREGATE   |                 |     1 |    16 |             |          |
|   2 | CONCATENATION    |                 |       |       |             |          |
|*  3 | INDEX RANGE SCAN | T10_000_000_NIA |  487K | 7621K |    3607 (1) | 00:00:01 |
|*  4 | INDEX RANGE SCAN | T10_000_000_NIA |  849K |   12M |    6497 (1) | 00:00:01 |
-------------------------------------------------------------------------------------

END-OF-TIME END DATE, START DATE

-------------------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |    16 |    5224 (1) | 00:00:01 |
|   1 | SORT AGGREGATE   |                 |     1 |    16 |             |          |
|*  2 | INDEX RANGE SCAN | T10_000_000_IA  | 1413K |   21M |    5224 (1) | 00:00:01 |
-------------------------------------------------------------------------------------

WHAT’S HAPPENING HERE (AT TEN MILLION ROWS)?

end-of-time is ½ the cost

PERFORMANCE AS A FACTOR OF CACHE

[Chart: CACHED INDEX RANGE SCANS. Series: Full Scan, Start-End Cached, Start-Null End Cached, End-Start Cached, Null End-Start Cached. Axis ticks run to 1,000 and to 7.]

NULL END DATE, START DATE

-------------------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |    16 |       4 (0) | 00:00:01 |
|   1 | SORT AGGREGATE   |                 |     1 |    16 |             |          |
|   2 | CONCATENATION    |                 |       |       |             |          |
|*  3 | INDEX RANGE SCAN | T1_000_NIA      |    45 |   720 |       2 (0) | 00:00:01 |
|*  4 | INDEX RANGE SCAN | T1_000_NIA      |    85 |  1360 |       2 (0) | 00:00:01 |
-------------------------------------------------------------------------------------

END-OF-TIME END DATE, START DATE

-------------------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |    16 |       2 (0) | 00:00:01 |
|   1 | SORT AGGREGATE   |                 |     1 |    16 |             |          |
|*  2 | INDEX RANGE SCAN | T1_000_IA       |   134 |  2144 |       2 (0) | 00:00:01 |
-------------------------------------------------------------------------------------

WHAT’S HAPPENING AT ONE THOUSAND ROWS?

end-of-time is still ½ the cost

PERFORMANCE AS A FACTOR OF CACHE

[Chart: CACHED INDEX RANGE SCANS. Series: Full Scan, Start-End Cached, Start-Null End Cached, End-Start Cached, Null End-Start Cached. Axis ticks run to 100 and to 7.]

NULL END DATE, START DATE

-------------------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |    16 |       1 (0) | 00:00:01 |
|   1 | SORT AGGREGATE   |                 |     1 |    16 |             |          |
|*  2 | INDEX SKIP SCAN  | T100_NIA        |    11 |   176 |       1 (0) | 00:00:01 |
-------------------------------------------------------------------------------------

END-OF-TIME END DATE, START DATE

-------------------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |    16 |       1 (0) | 00:00:01 |
|   1 | SORT AGGREGATE   |                 |     1 |    16 |             |          |
|*  2 | INDEX RANGE SCAN | T100_IA         |    11 |   176 |       1 (0) | 00:00:01 |
-------------------------------------------------------------------------------------

WHAT’S HAPPENING AT ONE HUNDRED ROWS?

end-of-time is the same cost

PERFORMANCE AS A FACTOR OF CACHE

[Chart: CACHED INDEX RANGE SCANS. Series: Full Scan, Start-End Cached, Start-Null End Cached, End-Start Cached, Null End-Start Cached. Axis ticks run to 10 and to 2.]


USING COST INSTEAD OF ELAPSED TIME

A LOOK AT EXPLAIN PLAN COSTS

[Chart: CACHED INDEX RANGE SCANS. Series: Full Scan, Start-Null End, End-Start. Axis ticks run to 10,000,000 and to 80,000.]

A LOOK BACK AT ELAPSED TIMES

[Chart: CACHED INDEX RANGE SCANS. Series: Full Scan, Start-End, Start-Null End, End-Start, Null End-Start. Axis ticks run from 1,000,000 to 10,000,000 and to 300.]

The combination of end date and start date performs the best!

START DATE, NULL END DATE

-------------------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |    16 |   78992 (1) | 00:00:04 |
|   1 | SORT AGGREGATE   |                 |     1 |    16 |             |          |
|*  2 | INDEX RANGE SCAN | T10_000_000_ANI | 1337K |   20M |   78992 (1) | 00:00:04 |
-------------------------------------------------------------------------------------

END-OF-TIME END DATE, START DATE

-------------------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |    16 |    5224 (1) | 00:00:01 |
|   1 | SORT AGGREGATE   |                 |     1 |    16 |             |          |
|*  2 | INDEX RANGE SCAN | T10_000_000_IA  | 1413K |   21M |    5224 (1) | 00:00:01 |
-------------------------------------------------------------------------------------

WHAT’S HAPPENING HERE (AT TEN MILLION ROWS)?

end-of-time is 15 TIMES less costly!

START DATE, NULL END DATE

-------------------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |    16 |      10 (0) | 00:00:01 |
|   1 | SORT AGGREGATE   |                 |     1 |    16 |             |          |
|*  2 | INDEX RANGE SCAN | T1_000_ANI      |   130 |  2080 |      10 (0) | 00:00:01 |
-------------------------------------------------------------------------------------

END-OF-TIME END DATE, START DATE

-------------------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |    16 |       2 (0) | 00:00:01 |
|   1 | SORT AGGREGATE   |                 |     1 |    16 |             |          |
|*  2 | INDEX RANGE SCAN | T1_000_IA       |   134 |  2144 |       2 (0) | 00:00:01 |
-------------------------------------------------------------------------------------

WHAT’S HAPPENING AT ONE THOUSAND ROWS?

end-of-time is 5 TIMES less costly

START DATE, NULL END DATE

-------------------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |    16 |       1 (0) | 00:00:01 |
|   1 | SORT AGGREGATE   |                 |     1 |    16 |             |          |
|*  2 | INDEX RANGE SCAN | T100_ANI        |    11 |   176 |       1 (0) | 00:00:01 |
-------------------------------------------------------------------------------------

END-OF-TIME END DATE, START DATE

-------------------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |    16 |       1 (0) | 00:00:01 |
|   1 | SORT AGGREGATE   |                 |     1 |    16 |             |          |
|*  2 | INDEX RANGE SCAN | T100_IA         |    11 |   176 |       1 (0) | 00:00:01 |
-------------------------------------------------------------------------------------

WHAT’S HAPPENING AT ONE HUNDRED ROWS?

end-of-time is the same cost
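The cost comparisons quoted on these slides can be checked with a few lines of arithmetic. The sketch below only restates the SELECT STATEMENT costs from the explain plans above; the list layout and the `cost_ratio` helper are illustrative.

```python
# (comparison, rows in table, nullable-end-date cost, end-of-time cost),
# copied from the SELECT STATEMENT rows of the explain plans above.
PLAN_COSTS = [
    ("null end-start vs end-start", 10_000_000, 10104, 5224),
    ("null end-start vs end-start", 1_000, 4, 2),
    ("null end-start vs end-start", 100, 1, 1),
    ("start-null end vs end-start", 10_000_000, 78992, 5224),
    ("start-null end vs end-start", 1_000, 10, 2),
    ("start-null end vs end-start", 100, 1, 1),
]


def cost_ratio(nullable_cost, end_of_time_cost):
    """How many times more costly the nullable plan is, to one decimal place."""
    return round(nullable_cost / end_of_time_cost, 1)


for label, table_rows, nullable_cost, eot_cost in PLAN_COSTS:
    print(f"{label} at {table_rows:,} rows: {cost_ratio(nullable_cost, eot_cost)}x")
```

The ratios come out at roughly 2x, 2x, and 1x for the first comparison and roughly 15x, 5x, and 1x for the second, in line with the slide captions.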


• Make end dates not null

• Define an end-of-time value to use everywhere

• Use the end-of-time value when you don’t know the end date

• Index in end date, start date order

• Eliminate full table scans

• Performance test under load

• Performance test with “real-life” data and table sizes

• Retest periodically

CONCLUSIONS


• Move non-transactional data to another database

• Brush your teeth twice a day, and don’t forget to floss

• Listen to your spouse, I didn’t say obey did I?

OTHER GOOD HABITS

The use of null values in a column that is queried often is a common performance problem that Oracle addressed years ago by allowing the creation of function-based indexes. Using a function-based index, one can work around the null values by having the database calculate a replacement value using the nvl() function and store the calculated value in the index instead. This means that every query against the table must use the same replacement-value syntax if it is going to take advantage of the index. One caveat: you must use hints.

AN ORACLE-ENABLED WORKAROUND
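The idea of indexing a replacement value can be sketched outside Oracle as well. The example below is an analogue, not Oracle syntax: it assumes Python's built-in sqlite3 module, uses SQLite's expression indexes with `coalesce()` in place of Oracle's `nvl()`, and the `items` table, index name, and `active_codes` helper are hypothetical. Optimizer hints, which the slide says Oracle requires, are not shown.

```python
import sqlite3

END_OF_TIME = "9999-12-31"  # assumption: 12/31/9999 as ISO-8601 text

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table items ("
    " id integer primary key,"
    " code text not null,"
    " start_date text not null,"
    " end_date text)"  # still nullable, as in the original design
)
conn.executemany(
    "insert into items (code, start_date, end_date) values (?, ?, ?)",
    [("ended", "2016-01-01", "2016-03-31"), ("open_ended", "2016-01-01", None)],
)

# Index the replacement value rather than the raw, possibly NULL, column.
conn.execute(
    "create index items_end_start_fn"
    f" on items (coalesce(end_date, '{END_OF_TIME}'), start_date)"
)


def active_codes(as_of):
    """Query with the same replacement expression that the index stores."""
    rows = conn.execute(
        "select code from items"
        f" where coalesce(end_date, '{END_OF_TIME}') >= ?"
        " and start_date <= ?"
        " order by code",
        (as_of, as_of),
    )
    return [code for (code,) in rows]


print(active_codes("2016-09-15"))
```

The cost of this workaround is that the replacement expression has to be repeated identically in every query, whereas a NOT NULL end date with an end-of-time default keeps the queries plain.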


• You can use hints /*+ INDEX() */

• You can pin tables in cache (don’t do this with large tables)

OTHER ORACLE-ENABLED WORKAROUNDS


NULL VALUES ARE THE ZERO OF THE 21ST CENTURY

[Diagram labels: Known, Unknown, Unknowable, NULL Values]

CONTACT:

Phone: (773) 296-2600

Email: [email protected]

THANK YOU!