30
Why? How? Conclusions PROC SQL: Why and How Nate Derby Stakana Analytics Seattle, WA SUCCESS, 4/27/11 Nate Derby PROC SQL: Why and How 1 / 30

PROC SQL: Why and How - SAS Group Presentations... · Why? How? Conclusions Introduction, Examples Working with SAS Data Sets Other Aspects Much More Interesting Example PROC SQLCode

  • Upload
    others

  • View
    5

  • Download
    4

Embed Size (px)

Citation preview

Why?How?

Conclusions

PROC SQL: Why and How

Nate Derby

Stakana AnalyticsSeattle, WA

SUCCESS, 4/27/11

Nate Derby PROC SQL: Why and How 1 / 30

Why?How?

Conclusions

Outline

1 Why?What?Why?

2 How?Introduction, ExamplesWorking with SAS Data SetsOther Aspects

3 Conclusions

Nate Derby PROC SQL: Why and How 2 / 30

Why?How?

Conclusions

What?Why?

What is SQL?

SQL = Structured Query Language

Pronounced “Sequel” or “S-Q-L”.Intended for managing data in relational database systems.Developed by IBM in early 1970s.Now very robust:

Lots of functionality (not just for queries!).Implemented in most computing packages (including SAS).

Nate Derby PROC SQL: Why and How 3 / 30

Why?How?

Conclusions

What?Why?

Why SQL?

Because of what we just said:

Designed for databases:

Works very well/quickly with large data sets.

Sometimes faster than SAS code.Sometimes works when SAS code doesn’t.

Easily implements data constraints.Easily implements views.Easily implements indexes.

Nate Derby PROC SQL: Why and How 4 / 30

Why?How?

Conclusions

What?Why?

Why SQL?

Because of what we just said:

Lots of functionality:

SQL can be used to do many things.Some things easier in SQL than in base SAS code.

Examples to follow.

Implemented in most computing packages:

Great for program portability.Can cut and paste between SAS and another language.

Some things aren’t portable.

Nate Derby PROC SQL: Why and How 5 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Intro to SQL in SAS

Implemented in SAS via PROC SQL:

PROC SQL SetupPROC SQL <options>;

[SQL Statements]

QUIT;

Nate Derby PROC SQL: Why and How 6 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Easy Examples

PROC SQL CodePROC SQL;

SELECT name, sex, age, height, weightFROM sashelp.class;

QUIT;

Nate Derby PROC SQL: Why and How 7 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Easy Examples

PROC SQL CodePROC SQL;

SELECT a.name, a.sex, a.age, a.height, a.weightFROM sashelp.class a;

QUIT;

Nate Derby PROC SQL: Why and How 8 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Easy Examples

PROC SQL CodePROC SQL;

SELECT boo.name, boo.sex, boo.age, boo.height, boo.weightFROM sashelp.class boo;

QUIT;

Nate Derby PROC SQL: Why and How 9 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Easy Examples

PROC SQL CodePROC SQL;

SELECT *FROM sashelp.class;

QUIT;

Nate Derby PROC SQL: Why and How 10 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Easy Examples

PROC SQL CodePROC SQL;

SELECT a.*FROM sashelp.class a;

QUIT;

Nate Derby PROC SQL: Why and How 11 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Easy Examples

PROC SQL CodePROC SQL;

SELECT name, sex, age, height format=6.2, weightformat=6.2

FROM sashelp.class;QUIT;

Nate Derby PROC SQL: Why and How 12 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Easy Examples

PROC SQL CodePROC SQL;

SELECT name, sex, age, height format=6.2, weightformat=6.2

FROM sashelp.classWHERE sex = 'M';

QUIT;

Nate Derby PROC SQL: Why and How 13 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Easy Examples

PROC SQL CodePROC SQL;

SELECT name, sex, age, height format=6.2, weightformat=6.2

FROM sashelp.classWHERE sex = 'M'ORDER BY age;

QUIT;

Nate Derby PROC SQL: Why and How 14 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

More Interesting Example

PROC SQL CodePROC SQL;

SELECT sex,mean( age ) as mage label='Mean Age' format=6.2,min( height ) as minh label='Min Height' format=6.2,max( height ) as maxh label='Max Height' format=6.2

FROM sashelp.classGROUP BY sex;

QUIT;

Nate Derby PROC SQL: Why and How 15 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Much More Interesting Example

Nate Derby PROC SQL: Why and How 16 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Much More Interesting Example

PROC SQL CodePROC SQL;

SELECT a.name, a.sex, a.age, b.mageFROM sashelp.class a JOIN (SELECT sex, mean( age ) as mage

label='Mean Age by Sex' format=6.2FROM sashelp.classGROUP BY sex) b

ON a.sex = b.sexORDER BY age, name;

QUIT;

Nate Derby PROC SQL: Why and How 17 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Creating Tables

PROC SQL CodePROC SQL;NOPRINT;CREATE TABLE work.blah ASSELECT name, sex, age, height, weightFROM sashelp.class;

QUIT;

Nate Derby PROC SQL: Why and How 18 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Creating Tables

PROC SQL CodePROC SQL NOPRINT;

CREATE TABLE blah ASSELECT sex, mean( age ) as mage

label='Mean Age by Sex' format=6.2FROM sashelp.classGROUP BY sex;

CREATE TABLE final ASSELECT a.name, a.sex, a.age, b.mageFROM sashelp.class a join blah bON a.sex = b.sex

ORDER BY age, name;QUIT;

Nate Derby PROC SQL: Why and How 19 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Log Output

Since all within one PROC, we get timing for entire block of code:

Nate Derby PROC SQL: Why and How 20 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

The STIMER Option

The STIMER gives times for each step within PROC SQL:

PROC SQL CodePROC SQL NOPRINT STIMER;

CREATE TABLE blah ASSELECT sex, mean( age ) as mage

label='Mean Age by Sex' format=6.2FROM sashelp.classGROUP BY sex;

CREATE TABLE final ASSELECT a.name, a.sex, a.age, b.mageFROM sashelp.class a join blah bON a.sex = b.sex

ORDER BY age, name;QUIT;

Nate Derby PROC SQL: Why and How 21 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Log Output

Now we get listings for each step:

Nate Derby PROC SQL: Why and How 22 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Use Different Table Names

Valid in SAS code, but PROC SQL complains:

PROC SQL CodePROC SQL NOPRINT STIMER;

CREATE TABLE blah AS...;

CREATE TABLE blah ASSELECT a.name, a.sex, a.age, b.mageFROM sashelp.class a join blah bON a.sex = b.sex

ORDER BY age, name;QUIT;

Nate Derby PROC SQL: Why and How 23 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

ALTER, UPDATE, DELETE Statements

ALTER TABLE statement changes columns:

PROC SQL CodePROC SQL NOPRINT STIMER;

CREATE TABLE blah ASSELECT *FROM sashelp.class;

ALTER TABLE blahDROP ageMODIFY height FORMAT=6.2, weight FORMAT=6.2;

QUIT;

The ALTER TABLE statement cannot be used to change the data!

Nate Derby PROC SQL: Why and How 24 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

ALTER, UPDATE, DELETE Statements

UPDATE statement changes data, DELETE FROM deletes rows:

PROC SQL CodePROC SQL NOPRINT STIMER;

CREATE TABLE blah ASSELECT *FROM sashelp.class;

UPDATE blahSET weightWHERE name = 'Robert';

DELETE FROM blahWHERE name = 'Judy';

QUIT;

UPDATE, DELETE FROM cannot be used to modify the columns!

Nate Derby PROC SQL: Why and How 25 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Views and Indexes

SQL was developed for databases, so PROC SQL works really wellwith views and indexes:

A view a “virtual table” = executable instructions for a table,but holds no data.

Great for getting updated info, restricting access.

An index is one or more columns used to identify each rowwithin a table

Great for cutting down on time for processing data in (very)large tables.

More info: See Kirk Lafler’s book!

Nate Derby PROC SQL: Why and How 26 / 30

Why?How?

Conclusions

Introduction, ExamplesWorking with SAS Data SetsOther Aspects

Accessing a Database

Surprise! PROC SQL can access a database really well.

Weird: No one writes about this ... except Katherine Prairie.

Two ways:

Pass-Through facility:

Uses ODBC (external to SAS).Nests non-SAS SQL code within SAS code.Thus, much SAS functionality can’t be used. Sometimes haveto be creative with SAS code to make non-SAS code.Also: Messy code!

SAS/ACCESS and the LIBNAME statement: Much cleanercode, allows using SAS functionality with the database.

Nate Derby PROC SQL: Why and How 27 / 30

Why?How?

Conclusions

Conclusions

PROC SQL can be a powerful tool!

Very flexibly queries (often more flexible than the standardSAS DATA steps).Sometimes easier than in standard SAS code.Facilitates program portability.Allows for views, indexes and integrity constraints.Quicker than SAS code for large data sets.For very large data sets, sometimes SAS code crashes, SQLcode doesn’t.

Nate Derby PROC SQL: Why and How 28 / 30

Appendix

Further Resources

Pete Lund.An Introduction to SQL in SAS.http://pugsug.org (Presentations), 2010.

Howard Schreier.PROC SQL by Example: Using SQL within SAS.SAS Press, 2008.

Kirk Lafler.PROC SQL: Beyond the Basics Using SAS.SAS Press, 2004.

Katherine Prairie.The Essential PROC SQL Handbook.SAS Press, 2005.

Nate Derby PROC SQL: Why and How 29 / 30

Appendix

Further Resources

Nate Derby: http://[email protected]

Nate Derby PROC SQL: Why and How 30 / 30