Hash-Object Programming in SAS
Malcolm MacRae, AESO
Sean McCarthy, ENMAX
INTRODUCTION TO HASH OBJECTS
Malcolm MacRae
Introduction to Hash Objects
• What is a hash object?
– Adelson-Velsky and Landis (AVL) tree
– In-memory object
– Associates key value with specific data values
• Why is it important?
– Provides look-up functionality
– Similar to exact-match VLOOKUP function in Excel
Presentation Overview
• Create a hash object
• Populate a hash object
• Use a hash object
SAS Code Template
data work.hash_demo;
/* Define return code */
length rc 8;
/* Create hash object */
/* Populate hash object */
/* Use hash object */
/* Terminate hash object */
rc = h.delete();
run;
Presentation Overview
• Create a hash object
– Hash object structure
– Multi-data hash objects
• Populate a hash object
• Use a hash object
Create a Hash Object
• Hash object structure is defined by:
1. Key variables; and,
2. Data variables
• Four types of hash object structure:
                          One key variable             Multiple key variables
One data variable         Simple hash object           Composite-key hash object
Multiple data variables   Multi-variable hash object   Complex hash object
Simple Hash Object
/* Define variables */
length key data 8;
/* Initialize hash object */
declare hash h();
/* Define hash object */
h.defineKey('key');
h.defineData('data');
h.defineDone();
/* Initialize variables */
call missing(key,data);
[Diagram: hash object h (key, data) and the program data vector (key, data, rc)]
Multi-Variable Hash Object
/* Define variables */
length key data1 data2 8;
/* Initialize hash object */
declare hash h();
/* Define hash object */
h.defineKey('key');
h.defineData('data1','data2');
h.defineDone();
/* Initialize variables */
call missing(key,data1,data2);
[Diagram: hash object h (key, data1, data2) and the program data vector (key, data1, data2, rc)]
Composite-Key Hash Object
/* Define variables */
length key1 key2 data 8;
/* Initialize hash object */
declare hash h();
/* Define hash object */
h.defineKey('key1','key2');
h.defineData('data');
h.defineDone();
/* Initialize variables */
call missing(key1,key2,data);
[Diagram: hash object h (key1, key2, data) and the program data vector (key1, key2, data, rc)]
Complex Hash Object
/* Define variables */
length key1 key2 data1 data2 8;
/* Initialize hash object */
declare hash h();
/* Define hash object */
h.defineKey('key1','key2');
h.defineData('data1','data2');
h.defineDone();
/* Initialize variables */
call missing(key1,key2,
data1,data2);
[Diagram: hash object h (key1, key2, data1, data2) and the program data vector (key1, key2, data1, data2, rc)]
Multi-Data Hash Objects
• Hash object by default
– Associates one key value with one data value
• Multi-data hash object
– Associates one key value with multiple data values
– Supports simple hash objects
– Supports multi-variable hash objects
– Supports composite-key hash objects
– Supports complex hash objects
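The difference between the default and the multi-data behaviour can be seen in a short sketch. The object name m and the variables rc_default, rc_multi, n_default and n_multi are illustrative names that do not appear in the slides; the step only writes to the log.

```sas
data _null_;
    length key data rc 8;

    /* Default hash object: at most one item per key */
    declare hash h();
    h.defineKey('key');
    h.defineData('data');
    h.defineDone();

    /* Multi-data hash object (hypothetical name m): items may share a key */
    declare hash m(multidata:'y');
    m.defineKey('key');
    m.defineData('data');
    m.defineDone();

    /* First item for key 1 goes into both objects */
    key = 1;
    data = 10;
    rc = h.add();
    rc = m.add();

    /* Second item with the same key */
    data = 20;
    rc_default = h.add();   /* nonzero: key 1 is already in h */
    rc_multi   = m.add();   /* 0: m accepts a second item for key 1 */

    /* NUM_ITEMS reports how many items each object holds */
    n_default = h.num_items;
    n_multi   = m.num_items;
    put rc_default= rc_multi= n_default= n_multi=;
run;
```

Assigning the result of add() to a variable matters here: when the return code is not captured, a failed add() is reported as an error in the log instead.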
Simple Multi-Data Hash Object
/* Define variables */
length key data 8;
/* Initialize hash object */
declare hash h(multidata:'y');
/* Define hash object */
h.defineKey('key');
h.defineData('data');
h.defineDone();
/* Initialize variables */
call missing(key,data);
[Diagram: multi-data hash object h (key, data) and the program data vector (key, data, rc)]
Complex Multi-Data Hash Object
/* Define variables */
length key1 key2 data1 data2 8;
/* Initialize hash object */
declare hash h(multidata:'y');
/* Define hash object */
h.defineKey('key1','key2');
h.defineData('data1','data2');
h.defineDone();
/* Initialize variables */
call missing(key1,key2,
data1,data2);
[Diagram: multi-data hash object h (key1, key2, data1, data2) and the program data vector]
Presentation Overview
• Created a hash object
• Populate a hash object
– Populate manually
– Populate from data set
• Use a hash object
Example: Modulus Function
• Remainder in integer division:
f(x) = x mod 3
• SAS function:
f(x) = mod(x, 3);
x f(x)
1 1
2 2
3 0
4 1
5 2
6 0
7 1
8 2
9 0
Manually Populate Hash Object
/* Populate hash-object LUT. */
do data = 1 to 9;
key = mod(data,3);
rc = h.add();
end;
[Diagram: hash object h after the loop. Key 0 holds data 3, 6, 9; key 1 holds data 1, 4, 7; key 2 holds data 2, 5, 8.]
Populate Hash Object from Data Set
/* Create sample data set */
data work.lut;
do data = 1 to 9;
key = mod(data,3);
output;
end;
run;
work.lut
data key
1 1
2 2
3 0
4 1
5 2
6 0
7 1
8 2
9 0
Populate Hash Object from Data Set
/* Define variables */
length key data 8;
/* Initialize hash object. */
declare hash h(
dataset:'work.lut',
multidata:'y');
/* Define hash object */
h.defineKey('key');
h.defineData('data');
h.defineDone();
/* Initialize variables */
call missing(key, data);
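One way to confirm that the data set was loaded is to read the NUM_ITEMS attribute once the hash object is defined. The following is a minimal sketch that assumes work.lut was created by the step shown earlier; the variable n is an illustrative name.

```sas
data _null_;
    /* Host variables must exist in the program data vector */
    length key data 8;

    /* Load work.lut when defineDone() is called; keep duplicate keys */
    declare hash h(dataset:'work.lut', multidata:'y');
    h.defineKey('key');
    h.defineData('data');
    h.defineDone();
    call missing(key, data);

    /* n is a hypothetical variable holding the number of items loaded */
    n = h.num_items;
    put 'Items loaded: ' n;
run;
```

With multidata:'y' every row of work.lut should be kept, so the count written to the log is expected to match the number of rows in the data set.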
[Diagram: hash object h loaded from work.lut. Key 0 holds data 3, 6, 9; key 1 holds data 1, 4, 7; key 2 holds data 2, 5, 8.]
Presentation Overview
• Created a hash object
• Populated a hash object
• Use a hash object
– Hash-object look-up table
– Hash-object iterators
Hash-Object LUT
/* Perform look-up */
do key = -1 to 3;
rc = h.find();
do while (rc = 0);
output;
rc = h.find_next();
end;
end;
[Diagram: look-up animation. For keys -1 and 3, which are not in h, the diagram shows a nonzero return code (160038), so no rows are output for them.]
Output Data Set
Hash-Object LUT
key data rc
0 3 0
0 6 0
0 9 0
1 1 0
1 4 0
1 7 0
2 2 0
2 5 0
2 8 0
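The create, populate and use steps can be assembled into the code template from the start of the presentation. This is a sketch of one possible assembly, using manual population; it is intended to produce the output data set listed above.

```sas
data work.hash_demo;
    /* Define variables and return code */
    length key data rc 8;

    /* Create a simple multi-data hash object */
    declare hash h(multidata:'y');
    h.defineKey('key');
    h.defineData('data');
    h.defineDone();
    call missing(key, data);

    /* Populate hash object: key = data mod 3 */
    do data = 1 to 9;
        key = mod(data, 3);
        rc = h.add();
    end;

    /* Use hash object: keys -1 and 3 are absent, so find() fails for them */
    do key = -1 to 3;
        rc = h.find();
        do while (rc = 0);
            output;                 /* one row per matching item */
            rc = h.find_next();     /* next item with the same key */
        end;
    end;

    /* Terminate hash object */
    rc = h.delete();
run;
```

Because the step has no SET or INPUT statement it executes once, and the explicit OUTPUT statement means only rows from successful look-ups are written.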
Hash Object Iterator
• Iterates through hash object LUT
• Iterator methods:
– .first()
– .next()
– .prev()
– .last()
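The slides that follow use first() and next(). As a complement, the sketch below walks a small ordered hash object backwards with last() and prev(). The data set name work.reverse and the sample values are assumptions made for illustration.

```sas
data work.reverse;
    length key data rc 8;

    /* Ordered hash object; key is also listed as data so the
       iterator copies it into the program data vector */
    declare hash h(ordered:'ascending');
    h.defineKey('key');
    h.defineData('key', 'data');
    h.defineDone();

    /* Iterator attached to h */
    declare hiter hit('h');

    /* Illustrative items: keys 1 to 3 with data 10, 20, 30 */
    do key = 1 to 3;
        data = key * 10;
        rc = h.add();
    end;

    /* Start at the last item and step backwards until prev() fails */
    rc = hit.last();
    do while (rc = 0);
        output;
        rc = hit.prev();
    end;
run;
```

With ordered:'ascending', the rows are written from the highest key to the lowest.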
Hash Object Iterator
/* Define variables */
length key data 8;
/* Initialize hash object */
declare hash h(multidata:'y');
/* Define hash object */
h.defineKey('key');
h.defineData('data');
h.defineDone();
/* Initialize variables */
call missing(key,data);
/* Hash-object iterator. */
declare hiter hit('h');
Hash-Object Iterator
/* Iterate through LUT */
rc = hit.first();
do while (rc = 0);
output;
rc = hit.next();
end;
Output Data Set
Hash Object LUT
key data rc
0 3 0
0 6 0
0 9 0
1 1 0
1 4 0
1 7 0
2 2 0
2 5 0
2 8 0
Hash Object Iterator
key data rc
. 3 0
. 6 0
. 9 0
. 2 0
. 5 0
. 8 0
. 1 0
. 4 0
. 7 0
Ordered Hash Object Iterator
/* Define variables */
length key data 8;
/* Initialize hash object */
declare hash h(
multidata:'y',
ordered:'ascending');
/* Define hash object */
h.defineKey('key');
h.defineData('key','data');
h.defineDone();
/* Initialize variables */
call missing(key,data);
/* Hash-object iterator. */
declare hiter hit('h');
Output Data Set
Hash-Object LUT
key data rc
0 3 0
0 6 0
0 9 0
1 1 0
1 4 0
1 7 0
2 2 0
2 5 0
2 8 0
Ordered Hash-Object Iterator
key data rc
0 3 0
0 6 0
0 9 0
1 1 0
1 4 0
1 7 0
2 2 0
2 5 0
2 8 0
Presentation Overview
• Created a hash object
• Populated a hash object
• Used a hash object
Additional Resources
• Burlew, Michele M. SAS Hash Object Programming Made Easy. SAS Press.
APPLICATIONS OF HASH OBJECTS
Sean McCarthy
Join Customers Table with Sales Table
Sales Table
Item Sold   Price   Cust_ID
iPhone 7    $800    2
iPad Air    $700    1
Customers Table
Customer_ID   Name
1             Sean
2             Malcolm
data work.add_cust_name;
if _N_ = 1 then do;
declare hash cust(dataset: "work.Customers");
cust.defineKey("Customer_ID");
cust.defineData("Name");
cust.defineDone();
end;
set work.Sales;
length Name $ 7;
if cust.find(key: Cust_ID) NE 0
then call missing(Name);
run;
New Table
Item Sold   Price   Cust_ID   Name
iPhone 7    $800    2         Malcolm
iPad Air    $700    1         Sean
No resource-intensive sorting required—explicit or implicit!
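To try the join, the two input tables need to exist first. The sketch below builds them from the values on the slide. The column name Item_Sold (the slide shows "Item Sold"), the variable lengths and the numeric Price with a dollar format are assumptions, since the slide does not show how the tables were created.

```sas
/* Customers table, as shown on the slide */
data work.Customers;
    length Customer_ID 8 Name $ 7;
    input Customer_ID Name $;
    datalines;
1 Sean
2 Malcolm
;
run;

/* Sales table; comma-delimited so that item names may contain blanks */
data work.Sales;
    length Item_Sold $ 8 Price Cust_ID 8;
    infile datalines dsd;
    input Item_Sold $ Price Cust_ID;
    format Price dollar9.;   /* assumed display format */
    datalines;
iPhone 7,800,2
iPad Air,700,1
;
run;
```

Neither table is sorted or indexed here. The hash look-up matches Cust_ID against Customer_ID directly in memory.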
Aggregating with Hash Objects
data _NULL_;
declare hash rev();
rev.defineKey("Name");
rev.defineData("Name", "Revenue");
rev.defineDone();
done = 0;
do until (done = 1);
set work.Sales end=done;
format Revenue dollar9.;
if rev.find(key: Name) NE 0
then call missing(Revenue);
Revenue = sum(Revenue, Price);
rev.replace();
end;
rev.output(dataset: "work.Revenue");
run;
Sales Table
Item Sold   Price   Name
iPhone 7    $800    Malcolm
iPad Air    $700    Sean
MacBook     $900    Sean
iPad Mini   $500    Malcolm
New Revenue Table
Name      Revenue
Malcolm   $1,300
Sean      $1,800
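One way to cross-check the hash-based totals is to compute the same sums with PROC SQL, a different technique from the one on the slide. This sketch assumes work.Sales holds the Name and Price columns shown in the Sales Table above, with Price stored as a number; the table name work.Revenue_check is illustrative.

```sas
proc sql;
    /* Sum Price per Name; work.Revenue_check is a hypothetical table name */
    create table work.Revenue_check as
    select Name,
           sum(Price) as Revenue format=dollar9.
    from work.Sales
    group by Name;
quit;
```

The two result tables should hold the same totals, although the row order of the hash output is not guaranteed unless the hash object is declared with the ordered option.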
THANKS!