Chapter 1: Getting Started

1.1 Course Logistics

1.2 Introducing the Structured Query Language

1.3 Introducing the Business Scenario

1.1 Course Logistics

Objectives

– Explain the naming convention that is used for the course files.
– Compare the three levels of exercises that are used in the course.
– Describe at a high level how data is used and stored at Orion Star Sports & Outdoors.
– Navigate to the SAS Help facility.

Filename Conventions

Course filenames follow the pattern s104d01x, read left to right:

s1   course ID
04   chapter #
d    type
01   item #
x    placeholder

Code Type
a    Activity
d    Demo
e    Exercise
s    Solution

Example: The SAS® SQL 1: Essentials course ID is s1, so s104d01 = SQL Chapter 4, Demo 1.

Sample filenames: s104a01, s104a02, s104a02s, s104d01, s104d02, s104e01, s104e02, s104s01, s104s02
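The naming convention can be checked mechanically. The following is a minimal Python sketch, not part of the course files, that splits a filename into its parts. It assumes the layout is a two-character course ID, a two-digit chapter number, a one-letter type code, a two-digit item number, and an optional trailing placeholder character. The names `parse_course_filename` and `CourseFile` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Type codes copied from the Code/Type table in the course notes.
TYPE_CODES = {"a": "Activity", "d": "Demo", "e": "Exercise", "s": "Solution"}

# Assumed layout: course ID (letter + digit), 2-digit chapter, 1-letter type,
# 2-digit item number, optional single-letter placeholder.
FILENAME_PATTERN = re.compile(r"^([a-z]\d)(\d{2})([ades])(\d{2})([a-z]?)$")


class CourseFile(NamedTuple):
    course_id: str
    chapter: int
    file_type: str
    item: int
    placeholder: str  # empty string when no placeholder character is present


def parse_course_filename(name: str) -> Optional[CourseFile]:
    """Return the parts of a course filename, or None if it does not match."""
    match = FILENAME_PATTERN.match(name.lower())
    if match is None:
        return None
    course_id, chapter, type_code, item, placeholder = match.groups()
    return CourseFile(course_id, int(chapter), TYPE_CODES[type_code],
                      int(item), placeholder)
```

For example, `parse_course_filename("s104d01")` yields course ID `s1`, chapter 4, type `Demo`, item 1, and an empty placeholder, which matches the worked example above.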

Three Levels of Exercises

Level 1 The exercise mimics an example presented in the section.

Level 2 Less information and guidance are provided in the exercise instructions.

Level 3 Only the task you are to perform or the results to be obtained are provided. Typically, you will need to use the Help facility.

You are not expected to complete all of the exercises in the time allotted. Choose the exercise or exercises that are at the level you are most comfortable with.

Orion Star Sports & Outdoors

Orion Star Sports & Outdoors is a fictitious global sports and outdoors retailer with traditional stores, an online store, and a large catalog business.

The corporate headquarters is located in the United States with offices and stores in many countries throughout the world.

Orion Star has about 1,000 employees and 90,000 customers, processes approximately 150,000 orders annually, and purchases products from 64 suppliers.

Orion Star Data

As is the case with most organizations, Orion Star has a large amount of data about its customers, suppliers, products, and employees. Much of this information is stored in transactional systems in various formats.

Using applications and processes such as SAS Data Integration Studio, this transactional information was extracted, transformed, and loaded into a data warehouse.

Data marts were created to meet the needs of specific departments such as Marketing.

The SAS Help Facility

Invoke the SAS Help facility by doing one of the following actions:

– Type Help on the command line.
– Select Help from the menu.
– Select the Help button on the toolbar.

Additional help and documentation are available at www.support.sas.com/documentation.

Setup for the Poll

– Start your SAS session.
– Open the SAS Help facility.

1.01 Poll

Were you able to open the Help facility in your SAS session?

Yes

No

1.02 Multiple Choice Poll

Which choice best describes your programming and SQL experience level?

a. I have little or no programming experience.

b. I can write programs in languages other than SQL.

c. I can write database-specific SQL programs.

d. I can write SAS PROC SQL programs.

e. I can program in multiple languages, including SQL.

1.03 Multiple Choice Poll

What version of SAS do you use?

a. I do not use SAS.

b. SAS 8.2

c. SAS®9

d. SAS 9.1

e. SAS 9.2

f. Other

1.2 Introducing the Structured Query Language

Objectives

– Describe the historical development of Structured Query Language (SQL).
– Explain how SQL is used.

Structured Query Language

Structured Query Language (SQL) is a standardized language originally designed as a relational database query tool.

SQL is currently used in many software products to retrieve and update data.
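As a rough illustration of "retrieve and update," the sketch below runs one SELECT and one UPDATE statement. It uses Python's built-in `sqlite3` module rather than SAS, so it shows generic SQL and not PROC SQL syntax. The table `employee_payroll`, its columns, and its rows are hypothetical and are not the course data.

```python
import sqlite3


def demo_retrieve_and_update():
    """Run a SELECT, an UPDATE, and a second SELECT on an in-memory table."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table and rows, created only for this illustration.
        conn.execute(
            "CREATE TABLE employee_payroll ("
            "employee_id INTEGER PRIMARY KEY, salary REAL)"
        )
        conn.executemany(
            "INSERT INTO employee_payroll VALUES (?, ?)",
            [(1, 50000.0), (2, 62000.0)],
        )

        query = ("SELECT employee_id, salary FROM employee_payroll "
                 "ORDER BY employee_id")

        # Retrieve: read the rows as they are.
        before = conn.execute(query).fetchall()

        # Update: change one row in place, then read the rows again.
        conn.execute(
            "UPDATE employee_payroll SET salary = salary + 1000 "
            "WHERE employee_id = ?",
            (1,),
        )
        after = conn.execute(query).fetchall()
        return before, after
    finally:
        conn.close()
```

Calling `demo_retrieve_and_update()` returns the rows before and after the update, so the effect of the UPDATE statement on employee 1 can be compared directly.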

Structured Query Language: Timeline

1970 – Dr. E. F. Codd of IBM proposes the relational model on which SQL is based.
1970s – IBM develops SQL.
1981 – First commercial SQL product is released.
1989 – More than 75 SQL-based systems exist. SAS 6.06 includes PROC SQL.
1999 – PROC SQL is enhanced for SAS 8.
2004 – PROC SQL is enhanced for SAS®9.

The SQL Procedure

The SQL procedure has the following characteristics:

– enables the use of SQL in SAS
– is part of Base SAS software
– follows American National Standards Institute (ANSI) standards
– includes enhancements for compatibility with SAS software

The SQL Procedure Features

With PROC SQL, you can use SQL language syntax to do the following:

– query SAS data sets
– generate reports from SAS data sets
– combine SAS data sets in many ways
– create and delete SAS data sets, views, and indexes
– update existing SAS data sets
– sometimes reproduce the results of multiple DATA and procedure steps with a single query

Structured Query Language

[Figure: PROC SQL takes SAS data sets, SAS data views, and DBMS tables as input and produces reports, SAS data sets, SAS data views, and DBMS tables as output.]

Setup for the Poll

– Issue a LIBNAME statement for the orion library, which contains the course data. You can use the s101a01 program if you want. Change the data location, if necessary.
– Submit the program s101a02.
– Answer the following questions:
    – What is the name of the input SAS data set?
    – Do the column names appear in the SELECT statement?

1.04 Multiple Choice Poll

What is the name of the input SAS data set?

a. orion.Employee_payroll

b. SQL

c. SELECT

d. None of the above

1.04 Multiple Choice Poll – Correct Answer

What is the name of the input SAS data set?

a. orion.Employee_payroll

b. SQL

c. SELECT

d. None of the above

1.05 Poll

Did the names of the columns that appeared in the results appear in the SELECT statement in the code?

Yes

No

1.05 Poll – Correct Answer

Did the names of the columns that appeared in the results appear in the SELECT statement in the code?

Yes

No

The SQL Procedure

The SQL procedure is

– a tool for querying data
– a tool for data manipulation and management
– an augmentation to the DATA step.

The SQL procedure is not

– a DATA step replacement
– a custom reporting tool.

SAS Data Sets

A SAS data set can be any of the following:

– a SAS data file that stores data descriptions and data values together in native SAS format
– a DBMS table accessed via a SAS/ACCESS engine
– a SAS data view, using one of the following technologies:
    – PROC SQL view: a stored SQL query that retrieves data stored in other tables
    – DATA step view: a stored DATA step that retrieves data stored in other files
    – SAS/ACCESS view: a stored ACCESS descriptor containing information required to retrieve data stored in a DBMS (older technology)

Terminology

Data Processing   SAS           SQL
File              Data Set      Table
Record            Observation   Row
Field             Variable      Column

1.3 Introducing the Business Scenario

Objectives

– Describe the data used in this course.
– Explain the relationships between the various tables.

The Orion Star Company

Analyze a subset of Orion Star data including the following:

– employees in the United States and Australia
– customers from Australia, Canada, Germany, Israel, South Africa, the United States, and Turkey
– the years 2002 through 2007

The tables and columns are related as shown on the next slide.

Orion Star Data Relationships: Human Resources Data

Employee_ID is the key column for HR data.

Orion Star Data Relationships: Order Data

Order_ID is the key column for Order data.

Product_ID is the key column for Product data.

Orion Star Data Relationships: Customer Data

Customer_ID is the key column for Customer data.

[Figure: Orion Star Data Relationships: Relationships between Types of Data]

1.06 Multiple Answer Poll

Which of the Order data tables contain the column Employee_ID?

a. orion.QTR1_2007

b. orion.QTR2_2007

c. orion.Order_Fact

d. orion.Price_List

e. orion.Product_Dim

f. All of them

1.06 Multiple Answer Poll – Correct Answer

Which of the Order data tables contain the column Employee_ID?

a. orion.QTR1_2007

b. orion.QTR2_2007

c. orion.Order_Fact

d. orion.Price_List

e. orion.Product_Dim

f. All of them

Orion Country Codes

Code Country

AU Australia

CA Canada

DE Germany

IL Israel

TR Turkey

US United States

ZA South Africa

Orion Product ID Codes

Codes are numeric in the form XXYYZZZZZZZZ.

XX         Product Type
YY         Subcategory
ZZZZZZZZ   Individual Product Identifier

Code Product Type

21 Children

22 Clothes and Shoes

23 Outdoors

24 Sports
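The code layout can be made concrete with a short Python sketch, not part of the course materials. It assumes that in XXYYZZZZZZZZ the first two digits are the product type, the next two the subcategory, and the last eight the individual product identifier. The function name `decode_product_id` and the sample ID used in the note below are hypothetical.

```python
# Product type codes copied from the table in the course notes.
PRODUCT_TYPES = {
    "21": "Children",
    "22": "Clothes and Shoes",
    "23": "Outdoors",
    "24": "Sports",
}


def decode_product_id(product_id):
    """Split a 12-digit product ID of the form XXYYZZZZZZZZ into its parts."""
    digits = str(product_id)
    if len(digits) != 12 or not digits.isdigit():
        raise ValueError(f"expected a 12-digit numeric code, got {product_id!r}")

    type_code = digits[0:2]  # XX: product type (assumed position)
    return {
        "product_type_code": type_code,
        "product_type": PRODUCT_TYPES.get(type_code, "Unknown"),
        "subcategory_code": digits[2:4],   # YY: subcategory (assumed position)
        "product_identifier": digits[4:],  # ZZZZZZZZ: individual identifier
    }
```

With the made-up ID `230400000123`, the sketch reports product type `Outdoors`, subcategory code `04`, and identifier `00000123`. Codes outside the four listed product types are labeled `Unknown` rather than rejected.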

1.07 Quiz

Use the data relationship charts on pages 1-16 through 1-18 to answer the following question:

Which table(s) contains the column Order_Date?

s101a03

1.07 Quiz – Correct Answer

Use the data relationship charts on pages 1-16 through 1-18 to answer the following question:

Which table(s) contains the column Order_Date?

1. orion.Order_Fact
2. orion.Qtr1_2007
3. orion.Qtr2_2007

s101a03