Chapter 1: Getting Started

1.1 Course Logistics

1.2 Introducing the Structured Query Language

1.3 Introducing the Business Scenario

1.1 Course Logistics

Objectives

– Explain the naming convention that is used for the course files.
– Compare the three levels of exercises that are used in the course.
– Describe at a high level how data is used and stored at Orion Star Sports & Outdoors.
– Navigate to the SAS Help facility.

Filename Conventions

Code Type

a Activity

d Demo

e Exercise

s Solution

Example: The SAS® SQL 1: Essentials course ID is s1, so s104d01 = SQL Chapter 4, Demo 1.

s104d01x

s1  course ID
04  chapter #
d   type
01  item #
x   placeholder

s104a01

s104a02

s104a02s

s104d01

s104d02

s104e01

s104e02

s104s01

s104s02
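
The naming convention above can be checked mechanically. The following is a minimal sketch, not part of the course files: it assumes a two-character course ID, a two-digit chapter, a one-letter type code, a two-digit item number, and an optional trailing letter. The names `CourseFile` and `parse_course_filename` are hypothetical.

```python
import re
from typing import NamedTuple

# Type codes taken from the Filename Conventions table.
TYPE_CODES = {"a": "Activity", "d": "Demo", "e": "Exercise", "s": "Solution"}

# Assumed layout: course ID (letter + digit), 2-digit chapter, 1-letter type,
# 2-digit item number, and an optional trailing letter (the placeholder slot).
FILENAME_PATTERN = re.compile(r"^([a-z]\d)(\d{2})([ades])(\d{2})([a-z]?)$")


class CourseFile(NamedTuple):
    course_id: str
    chapter: int
    file_type: str
    item: int
    suffix: str


def parse_course_filename(name: str) -> CourseFile:
    """Split a course filename such as 's104d01' into its parts."""
    match = FILENAME_PATTERN.match(name.lower())
    if match is None:
        raise ValueError(f"not a recognized course filename: {name!r}")
    course_id, chapter, code, item, suffix = match.groups()
    return CourseFile(course_id, int(chapter), TYPE_CODES[code], int(item), suffix)


print(parse_course_filename("s104d01"))
```

Under these assumptions, `s104d01` resolves to course `s1`, chapter 4, type Demo, item 1, which matches the example on the slide.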

Three Levels of Exercises

Level 1 The exercise mimics an example presented in the section.

Level 2 Less information and guidance are provided in the exercise instructions.

Level 3 Only the task you are to perform or the results to be obtained are provided. Typically, you will need to use the Help facility.

You are not expected to complete all of the exercises in the time allotted. Choose the exercise or exercises that are at the level you are most comfortable with.

Orion Star Sports & Outdoors

Orion Star Sports & Outdoors is a fictitious global sports and outdoors retailer with traditional stores, an online store, and a large catalog business.

The corporate headquarters is located in the United States with offices and stores in many countries throughout the world.

Orion Star has about 1,000 employees and 90,000 customers, processes approximately 150,000 orders annually, and purchases products from 64 suppliers.

Orion Star Data

As is the case with most organizations, Orion Star has a large amount of data about its customers, suppliers, products, and employees. Much of this information is stored in transactional systems in various formats.

Using applications and processes such as SAS Data Integration Studio, this transactional information was extracted, transformed, and loaded into a data warehouse.

Data marts were created to meet the needs of specific departments such as Marketing.

The SAS Help Facility

Invoke the SAS Help facility by doing one of the following actions:

– Type Help on the command line.
– Select Help from the menu.
– Select the Help button on the toolbar.

Additional help and documentation are available at www.support.sas.com/documentation.

Setup for the Poll

– Start your SAS session.
– Open the SAS Help facility.

1.01 Poll

Were you able to open the Help facility in your SAS session?

Yes

No

1.02 Multiple Choice Poll

Which choice best describes your programming and SQL experience level?

a. I have little or no programming experience.

b. I can write programs in languages other than SQL.

c. I can write database-specific SQL programs.

d. I can write SAS PROC SQL programs.

e. I can program in multiple languages, including SQL.

1.03 Multiple Choice Poll

What version of SAS do you use?

a. I do not use SAS.

b. SAS 8.2

c. SAS®9

d. SAS 9.1

e. SAS 9.2

f. Other

1.2 Introducing the Structured Query Language

Objectives

– Describe the historical development of Structured Query Language (SQL).
– Explain how SQL is used.

Structured Query Language

Structured Query Language (SQL) is a standardized language originally designed as a relational database query tool.

SQL is currently used in many software products to retrieve and update data.

Structured Query Language: Timeline

1970 – Dr. E. F. Codd of IBM proposes the relational model on which SQL is based.
1970s – IBM develops SQL.
1981 – First commercial SQL product is released.
1989 – More than 75 SQL-based systems exist. SAS 6.06 includes PROC SQL.
1999 – PROC SQL is enhanced for SAS 8.
2004 – PROC SQL is enhanced for SAS®9.

The SQL Procedure

The SQL procedure has the following characteristics:

– enables the use of SQL in SAS
– is part of Base SAS software
– follows American National Standards Institute (ANSI) standards
– includes enhancements for compatibility with SAS software

The SQL Procedure Features

With PROC SQL, you can use SQL language syntax to do the following:

– query SAS data sets
– generate reports from SAS data sets
– combine SAS data sets in many ways
– create and delete SAS data sets, views, and indexes
– update existing SAS data sets
– sometimes reproduce the results of multiple DATA and procedure steps with a single query
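
Several of the operations listed above (create, update, query, delete) are ordinary SQL statements. The sketch below illustrates them with Python's built-in `sqlite3` module rather than PROC SQL, so the syntax is SQLite's and not SAS's. The table name, columns, and values are made up for illustration and are not the course data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create a table (hypothetical columns and values, not the Orion Star data).
conn.execute("CREATE TABLE payroll (employee_id INTEGER PRIMARY KEY, salary INTEGER)")
conn.executemany(
    "INSERT INTO payroll VALUES (?, ?)",
    [(101, 60000), (102, 45000), (103, 30000)],
)

# Update an existing row: raise one employee's salary by 5000.
conn.execute("UPDATE payroll SET salary = salary + 5000 WHERE employee_id = 103")

# Query the table, filtering and ordering the result.
rows = conn.execute(
    "SELECT employee_id, salary FROM payroll WHERE salary > 32000 ORDER BY salary DESC"
).fetchall()
for employee_id, salary in rows:
    print(employee_id, salary)

# Delete the table when it is no longer needed.
conn.execute("DROP TABLE payroll")
conn.close()
```

In SAS, comparable statements would be wrapped in a PROC SQL step and would operate on SAS data sets instead of SQLite tables.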

Structured Query Language

[Diagram: PROC SQL takes a SAS data set, SAS data view, or DBMS table as input and produces a report, SAS data set, SAS data view, or DBMS table as output.]

Setup for the Poll

– Issue a LIBNAME statement for the orion library, which contains the course data. You can use the s101a01 program if you want. Change the data location, if necessary.
– Submit the program s101a02. Answer the following questions:
  – What is the name of the input SAS data set?
  – Do the column names appear in the SELECT statement?

1.04 Multiple Choice Poll

What is the name of the input SAS data set?

a. orion.Employee_payroll

b. SQL

c. SELECT

d. None of the above

1.04 Multiple Choice Poll – Correct Answer

What is the name of the input SAS data set?

a. orion.Employee_payroll

b. SQL

c. SELECT

d. None of the above

1.05 Poll

Did the names of the columns that appeared in the results appear in the SELECT statement in the code?

Yes

No

1.05 Poll – Correct Answer

Did the names of the columns that appeared in the results appear in the SELECT statement in the code?

Yes

No

The SQL Procedure

The SQL procedure is

– a tool for querying data
– a tool for data manipulation and management
– an augmentation to the DATA step.

The SQL procedure is not

– a DATA step replacement
– a custom reporting tool.

SAS Data Sets

A SAS data set can be any of the following:

– a SAS data file that stores data descriptions and data values together in native SAS format
– a DBMS table accessed via a SAS/ACCESS engine
– a SAS data view, using one of the following technologies:
  – PROC SQL view: a stored SQL query that retrieves data stored in other tables
  – DATA step view: a stored DATA step that retrieves data stored in other files
  – SAS/ACCESS view: a stored ACCESS descriptor containing information required to retrieve data stored in a DBMS (older technology)

Terminology

Data Processing   SAS           SQL
File              Data Set      Table
Record            Observation   Row
Field             Variable      Column

1.3 Introducing the Business Scenario

Objectives

– Describe the data used in this course.
– Explain the relationships between the various tables.

The Orion Star Company

Analyze a subset of Orion Star data including the following:

– employees in the United States and Australia
– customers from Australia, Canada, Germany, Israel, South Africa, the United States, and Turkey
– the years 2002 through 2007

The tables and columns are related as shown on the next slide.

Orion Star Data Relationships: Human Resources Data

Employee_ID is the key column for HR data.

Orion Star Data Relationships: Order Data

Order_ID is the key column for Order data.

Product_ID is the key column for Product data.

Orion Star Data Relationships: Customer Data

Customer_ID is the key column for Customer data.

Orion Star Data Relationships: Relationships between Types of Data
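
A key column that appears in more than one table is what lets rows from those tables be related. The sketch below shows the idea with two miniature tables joined on Customer_ID, using Python's `sqlite3` module rather than PROC SQL. Only the key column names come from the slides; the table names, the Country column, and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical miniature tables; only the key column names come from the slides.
conn.executescript(
    """
    CREATE TABLE customer (Customer_ID INTEGER PRIMARY KEY, Country TEXT);
    CREATE TABLE order_fact (Order_ID INTEGER PRIMARY KEY, Customer_ID INTEGER);
    INSERT INTO customer VALUES (1, 'AU'), (2, 'US');
    INSERT INTO order_fact VALUES (1001, 1), (1002, 2), (1003, 1);
    """
)

# Customer_ID appears in both tables, so it relates each order to its customer.
orders_by_country = conn.execute(
    """
    SELECT c.Country, COUNT(o.Order_ID) AS order_count
    FROM order_fact AS o
    JOIN customer AS c ON c.Customer_ID = o.Customer_ID
    GROUP BY c.Country
    ORDER BY c.Country
    """
).fetchall()
print(orders_by_country)
conn.close()
```

The same pattern applies to the other key columns named in this section (Employee_ID, Order_ID, Product_ID) wherever two tables share one of them.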

1.06 Multiple Answer Poll

Which of the Order data tables contain the column Employee_ID?

a. orion.QTR1_2007

b. orion.QTR2_2007

c. orion.Order_Fact

d. orion.Price_List

e. orion.Product_Dim

f. All of them

1.06 Multiple Answer Poll – Correct Answer

Which of the Order data tables contain the column Employee_ID?

a. orion.QTR1_2007

b. orion.QTR2_2007

c. orion.Order_Fact

d. orion.Price_List

e. orion.Product_Dim

f. All of them

Orion Country Codes

Code Country

AU Australia

CA Canada

DE Germany

IL Israel

TR Turkey

US United States

ZA South Africa

Orion Product ID Codes

Codes are numeric in the form XXYYZZZZZZZZ.

XX        Product Type
YY        Subcategory
ZZZZZZZZ  Individual Product Identifier

Code Product Type

21 Children

22 Clothes and Shoes

23 Outdoors

24 Sports
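
The XXYYZZZZZZZZ layout can be unpacked with simple string slicing. This is a minimal sketch that assumes the first two digits are the product type, the next two the subcategory, and the last eight the individual product identifier, which is how the slide's diagram reads. The function name `split_product_id` and the sample ID are hypothetical.

```python
# Product type codes from the Code / Product Type table.
PRODUCT_TYPES = {
    "21": "Children",
    "22": "Clothes and Shoes",
    "23": "Outdoors",
    "24": "Sports",
}


def split_product_id(product_id: int) -> dict:
    """Split a 12-digit product ID of the form XXYYZZZZZZZZ into its parts."""
    digits = str(product_id)
    if len(digits) != 12 or not digits.isdigit():
        raise ValueError(f"expected a 12-digit numeric code, got {product_id!r}")
    type_code = digits[:2]
    return {
        "product_type_code": type_code,
        "product_type": PRODUCT_TYPES.get(type_code, "unknown"),
        "subcategory": digits[2:4],
        "item": digits[4:],
    }


# 240100000001 is a made-up ID used only to exercise the function.
print(split_product_id(240100000001))
```

The subcategory and item parts are kept as strings so that leading zeros are preserved.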

1.07 Quiz

Use the data relationship charts on pages 1-16 through 1-18 to answer the following question:

Which table(s) contains the column Order_Date?

s101a03

1.07 Quiz – Correct Answer

Use the data relationship charts on pages 1-16 through 1-18 to answer the following question:

Which table(s) contains the column Order_Date?

1. orion.Order_Fact
2. orion.Qtr1_2007
3. orion.Qtr2_2007
