PROC SQL – Select Codes To Master For Power Programming Codes and Examples from SAS.com

Nethra Sambamoorthi, PhD
Northwestern University
Master of Science in Predictive Analytics Program

Data Processing Terminologies Across Data Sciences…

Why PROC SQL or What Can It Do For Analysts?
• Generate reports
• Generate summary statistics
• Retrieve data from tables or views
• Combine data from tables or views
• Create tables, views, and indexes
• Update the data values in PROC SQL tables
• Update and retrieve data from database management system (DBMS) tables
• Modify a PROC SQL table by adding, modifying, or dropping columns
• PROC SQL can be used in an interactive SAS session or within batch programs, and it can include global statements, such as TITLE and OPTIONS.

An Example of Extracting, Summarizing, and Printing Using Base SAS Procedures (Without PROC SQL)

/* Extracting and summarizing */
title 'Large Countries Grouped by Continent';
proc summary data=sql.countries;
   where Population > 1000000;
   class Continent;
   var Population;
   output out=SumPop sum=TotPop;
run;

/* Sorting to arrange the output */
proc sort data=SumPop;
   by TotPop;
run;

/* Printing */
proc print data=SumPop noobs;
   var Continent TotPop;
   format TotPop comma15.;
   where _type_=1;
run;

Creating The Same Using PROC SQL

proc sql;
   title 'Population of Large Countries Grouped by Continent';
   select Continent, sum(Population) as TotPop format=comma15.
      from sql.countries
      where Population gt 1000000
      group by Continent
      order by TotPop;
quit;

Countries Table

WorldCityCoords Table

USCityCoords Table

UnitedStates Table

PostalCodes Table

Worldtemps Table

Oilprod Table

OILRSRVS Table

CONTINENTS Table

FEATURES Table

SELECT statement

Three Important Aspects – Describe, Print, Quit

/* Helps understand the structure of the table */
proc sql;
   describe table sql.unitedstates;
quit;

SELECT means PRINTING is Included Unless…
• SELECT * /* all columns */
• SELECT city, state /* specific columns */
• SELECT distinct continent /* specific columns but avoid duplicates */
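The three SELECT forms above can be tried on a small table. This is a minimal sketch: work.cities is a hypothetical toy table made up for illustration, not one of the sql.* sample tables. The last step shows the NOPRINT option of PROC SQL, which is one way to run a SELECT without printing its result.

```sas
/* Hypothetical toy table, invented for this illustration */
data work.cities;
   input city $ state $ continent $;
   datalines;
Albany NY America
Austin TX America
Boston MA America
;
run;

proc sql;
   select * from work.cities;                   /* all columns */
   select city, state from work.cities;         /* specific columns */
   select distinct continent from work.cities;  /* duplicate values removed */
quit;

/* NOPRINT suppresses the printed result; here the count goes to a macro variable */
proc sql noprint;
   select count(*) into :n_cities from work.cities;
quit;
%put Number of cities: &n_cities;
```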

So it is possible to run this

The output is…

Suppress column headings…
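One way to suppress column headings in PROC SQL output, sketched under the assumption that the slide refers to the label='#' technique described in the SAS SQL documentation. The table work.postalcodes is a hypothetical two-row stand-in for sql.postalcodes.

```sas
/* Hypothetical stand-in for sql.postalcodes */
data work.postalcodes;
   input name $ code $;
   datalines;
Alabama AL
Alaska AK
;
run;

proc sql;
   /* Quoted strings print as text on every row;             */
   /* label='#' blanks out the heading of the column it follows */
   select 'Postal code for', name label='#', 'is', code label='#'
      from work.postalcodes;
quit;
```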

Calculated columns and alias name…
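A short sketch of calculated columns with alias names. The table work.temps and its values are made up for illustration. The CALCULATED keyword lets a later expression in the same SELECT refer to an alias defined earlier in that SELECT.

```sas
/* Hypothetical temperatures in degrees Fahrenheit */
data work.temps;
   input city $ avghigh avglow;
   datalines;
Algiers 90 45
Athens 89 41
;
run;

proc sql;
   select city,
          (avghigh - 32) * 5/9 as HighC format=5.1,   /* alias HighC */
          (avglow  - 32) * 5/9 as LowC  format=5.1,   /* alias LowC  */
          (calculated HighC - calculated LowC) as Range format=4.1
      from work.temps;
quit;
```

Without CALCULATED, the aliases HighC and LowC would not be recognized inside the same SELECT clause, because they are not columns of the source table.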

Retrieving Data From Multiple Tables
• Means we are JOINING tables
• If there is no JOIN keyword, listing the tables in the FROM clause gives (1) a Cartesian product of records [no subset condition] or (2) an inner join [we need some subset condition in the WHERE clause]
• Alias names can be used for tables too; this simplifies referring to specific columns of a table

SELECT … FROM table1, table2; A Cartesian Product
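A minimal sketch of a Cartesian product, using two hypothetical toy tables (work.one with 2 rows, work.two with 3 rows). With no WHERE clause, every row of the first table is paired with every row of the second.

```sas
/* Hypothetical toy tables */
data work.one;
   input x a $;
   datalines;
1 a
2 b
;
run;

data work.two;
   input y b $;
   datalines;
10 p
20 q
30 r
;
run;

proc sql;
   /* No WHERE clause: 2 x 3 = 6 rows in the result */
   select *
      from work.one, work.two;
quit;
```

The row count grows as the product of the table sizes, which is why a Cartesian product is rarely wanted on large tables.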

Order the output from INNER JOIN

INNER JOIN can be used explicitly
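The sketch below writes the same inner join two ways, first with comma-separated tables and a WHERE condition, then with explicit INNER JOIN ... ON, and orders both results. The tables work.capitals and work.coords and their values are hypothetical; they only loosely imitate sql.unitedstates and sql.uscitycoords. Matching on both state and city also illustrates a multicolumn join.

```sas
/* Hypothetical toy tables */
data work.capitals;
   input state $ capital $;
   datalines;
NY Albany
TX Austin
MA Boston
;
run;

data work.coords;
   input state $ city $ latitude;
   datalines;
TX Austin 30
NY Albany 43
CA Fresno 37
;
run;

proc sql;
   /* Implicit inner join: table list plus WHERE, with table aliases c and g */
   select c.capital, c.state, g.latitude
      from work.capitals as c, work.coords as g
      where c.state = g.state and c.capital = g.city
      order by g.latitude;

   /* The same rows with the explicit INNER JOIN ... ON syntax */
   select c.capital, c.state, g.latitude
      from work.capitals as c inner join work.coords as g
         on c.state = g.state and c.capital = g.city
      order by g.latitude;
quit;
```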

INNER JOIN with comparison values on another column…

Effect of Null Values on JOINS

NOT MISSING option
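A small sketch of how missing (null) key values behave in a join, using hypothetical tables work.a and work.b. PROC SQL treats missing values as equal to one another, so rows with missing keys match unless they are excluded with IS NOT MISSING.

```sas
/* Hypothetical toy tables; "." is a missing numeric id */
data work.a;
   input id name $;
   datalines;
1 one
. nullA
2 two
;
run;

data work.b;
   input id val $;
   datalines;
1 x
. nullB
3 z
;
run;

proc sql;
   /* Two rows: id=1 matches, and the two missing ids also match each other */
   select a.id, a.name, b.val
      from work.a, work.b
      where a.id = b.id;

   /* One row: missing keys are excluded, only id=1 remains */
   select a.id, a.name, b.val
      from work.a, work.b
      where a.id = b.id and a.id is not missing;
quit;
```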

Multicolumn JOINS

Columns are directly comparable between two tables…

Capitals FROM sql.unitedstates

City FROM sql.uscitycoords

Postalcodes FROM sql.postalcodes

Is it possible to do a SELF JOIN?
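A self join is possible by listing the same table twice under different aliases. The sketch below uses a hypothetical work.staff table in which mgrid holds the id of each person's manager.

```sas
/* Hypothetical staff table; Ann has no manager (missing mgrid) */
data work.staff;
   input id name $ mgrid;
   datalines;
1 Ann .
2 Bob 1
3 Cy 1
;
run;

proc sql;
   /* e = the row read as an employee, m = the row read as a manager */
   select e.name as Employee, m.name as Manager
      from work.staff as e, work.staff as m
      where e.mgrid = m.id;
quit;
```

Ann does not appear as an employee in the result because her mgrid matches no id.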

Two Types of OUTER JOIN – LEFT JOIN and RIGHT JOIN
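A minimal sketch of LEFT JOIN and RIGHT JOIN on two hypothetical tables that share only key=2.

```sas
/* Hypothetical toy tables */
data work.left_t;
   input key lval $;
   datalines;
1 L1
2 L2
;
run;

data work.right_t;
   input key rval $;
   datalines;
2 R2
3 R3
;
run;

proc sql;
   /* LEFT JOIN keeps every row of left_t; rval is missing for key=1 */
   select l.key, l.lval, r.rval
      from work.left_t as l left join work.right_t as r
         on l.key = r.key;

   /* RIGHT JOIN keeps every row of right_t; lval is missing for key=3 */
   select r.key, l.lval, r.rval
      from work.left_t as l right join work.right_t as r
         on l.key = r.key;
quit;
```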

FULL JOIN …

SPECIALTY JOINS

NATURAL is applicable to both LEFT and RIGHT JOIN. Its purpose is to reduce verbosity when matching on multiple common columns: the join matches on every column name the two tables share…
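A sketch of a NATURAL LEFT JOIN next to its explicit equivalent. The tables work.t1 and work.t2 are hypothetical; they share the column names state and city, so NATURAL matches on both without an ON clause.

```sas
/* Hypothetical toy tables sharing the columns state and city */
data work.t1;
   input state $ city $ pop;
   datalines;
NY Albany 98
TX Austin 960
;
run;

data work.t2;
   input state $ city $ lat;
   datalines;
TX Austin 30
CA Fresno 37
;
run;

proc sql;
   /* NATURAL: the join columns are inferred from the shared names */
   select *
      from work.t1 natural left join work.t2;

   /* The same rows, with the matching columns spelled out */
   select t1.state, t1.city, t1.pop, t2.lat
      from work.t1 left join work.t2
         on t1.state = t2.state and t1.city = t2.city;
quit;
```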

Gives the same output;

Non matching rows have missing values

Use COALESCE to combine multiple columns to create new matching variables
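A sketch of COALESCE in a FULL JOIN, again on hypothetical tables. COALESCE returns its first nonmissing argument, so the result has one combined key column even for rows that exist in only one table.

```sas
/* Hypothetical toy tables sharing only key=2 */
data work.left_t;
   input key lval $;
   datalines;
1 L1
2 L2
;
run;

data work.right_t;
   input key rval $;
   datalines;
2 R2
3 R3
;
run;

proc sql;
   /* Three rows (keys 1, 2, 3); nonmatching rows have a missing lval or rval */
   select coalesce(l.key, r.key) as key, l.lval, r.rval
      from work.left_t as l full join work.right_t as r
         on l.key = r.key
      order by key;
quit;
```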

Using SUB QUERY or NESTED QUERY – SINGLE VALUE

=
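A minimal sketch of a single-value subquery. The inner query returns exactly one value, so it can be used with a comparison operator such as = or >. The table work.countries and its numbers are made up for illustration.

```sas
/* Hypothetical toy table; the mean population is 300 */
data work.countries;
   input name $ population;
   datalines;
Alpha 100
Beta 300
Gamma 500
;
run;

proc sql;
   /* The subquery is evaluated once and yields a single number */
   select name, population
      from work.countries
      where population > (select avg(population) from work.countries);
quit;
```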

Correlated SUBQUERY = NESTED QUERY

WHERE with the “EXISTS” condition
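A sketch of a correlated subquery with EXISTS, using hypothetical customer and order tables. The subquery refers to c.custid from the outer query, so it is evaluated for each outer row.

```sas
/* Hypothetical toy tables; customer 3 has no orders */
data work.customers;
   input custid name $;
   datalines;
1 Ann
2 Bob
3 Cy
;
run;

data work.orders;
   input orderid custid;
   datalines;
101 1
102 1
103 2
;
run;

proc sql;
   /* Keep a customer only if at least one matching order exists */
   select c.name
      from work.customers as c
      where exists (select *
                       from work.orders as o
                       where o.custid = c.custid);
quit;
```

Using NOT EXISTS instead would list the customers without any order.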

Multiple NESTED QUERY

Combine a JOIN with a SUBQUERY

QUERY strategies…

UNION is ROWWISE (PROC APPEND), while JOIN is COLUMNWISE (MERGE by)

Keep the duplicates – keyword ALL
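A small sketch contrasting UNION with UNION ALL on two hypothetical tables that share one identical row.

```sas
/* Hypothetical toy tables; the row (2, b) appears in both */
data work.q1;
   input x y $;
   datalines;
1 a
2 b
;
run;

data work.q2;
   input x y $;
   datalines;
2 b
3 c
;
run;

proc sql;
   /* UNION stacks the rows and removes duplicates: 3 rows */
   select * from work.q1
   union
   select * from work.q2;

   /* UNION ALL keeps the duplicate row: 4 rows */
   select * from work.q1
   union all
   select * from work.q2;
quit;
```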

KEEP rows ONLY FROM the first query – keyword EXCEPT (OUTER UNION, by contrast, concatenates the query results)
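A minimal sketch of EXCEPT on two hypothetical tables. Only rows of the first query that do not appear in the second query are returned.

```sas
/* Hypothetical toy tables; the row (2, b) appears in both */
data work.q1;
   input x y $;
   datalines;
1 a
2 b
;
run;

data work.q2;
   input x y $;
   datalines;
2 b
3 c
;
run;

proc sql;
   /* Only (1, a) is returned: it is in q1 but not in q2 */
   select * from work.q1
   except
   select * from work.q2;
quit;
```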

To overlay data better: keyword CORRESPONDING
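A sketch of OUTER UNION with and without CORR (the short form of CORRESPONDING). The tables work.jan and work.feb are hypothetical and share only the column name id.

```sas
/* Hypothetical toy tables */
data work.jan;
   input id sales;
   datalines;
1 100
2 200
;
run;

data work.feb;
   input id returns;
   datalines;
2 5
3 7
;
run;

proc sql;
   /* Plain OUTER UNION: columns are not overlaid, so id appears twice */
   select * from work.jan
   outer union
   select * from work.feb;

   /* With CORR, same-named columns are overlaid: id, sales, returns */
   select * from work.jan
   outer union corr
   select * from work.feb;
quit;
```

The CORR form gives a result similar to concatenating the two tables with a SET statement in a DATA step.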
