APL Optimization Techniques. Eugene Ying, Senior Software Developer, Fiserv, Inc. September 14, 2012

APL Optimization Techniques


DESCRIPTION

APL Optimization Techniques. Eugene Ying, Senior Software Developer, Fiserv, Inc. September 14, 2012. Topics: File I/O Optimization (Component File Fragmentation, Storing Numbers in a Native File) and CPU Optimization (The Outer Product, The Inner Product, The Match Function).


Page 1: APL Optimization Techniques

APL Optimization Techniques

Eugene Ying
Senior Software Developer

Fiserv, Inc.
September 14, 2012

Page 2: APL Optimization Techniques

Topics

File I/O Optimization

  • Component File Fragmentation

  • Storing Numbers in a Native File

CPU Optimization

  • The Outer Product

  • The Inner Product

  • The Match Function

Page 3: APL Optimization Techniques

A Component File where Each Component Contains 100 Rows of Data

[Diagram: the layout of the file when component 2 is updated with 150 rows of data and when it is updated with 50 rows of data. The file is fragmented.]

Page 4: APL Optimization Techniques

Initializing a Component File

Suppose your data will not have more than 500 rows of data. To minimize the chance of fragmentation, you allocate 500 rows of data for each component.

      (500 10⍴' ')⎕FAPPEND TIE   ⍝ Component 1
      (500 4⍴0)⎕FAPPEND TIE      ⍝ Component 2
      (500 20⍴' ')⎕FAPPEND TIE   ⍝ Component 3
      (500 5⍴0)⎕FAPPEND TIE      ⍝ Component 4
      (500 15⍴' ')⎕FAPPEND TIE   ⍝ Component 5

Page 5: APL Optimization Techniques

Initializing a Component File

[Diagram: intended initialization versus actual initialization of comp 1 to comp 5. Comp 1, comp 3 and comp 5 hold characters; comp 2 and comp 4 hold numbers.]

Numeric Components are greatly under-allocated in size

Page 6: APL Optimization Techniques

Storage Sizes of APL Numbers

      BOOLEAN←1000⍴1
      ⎕SIZE 'BOOLEAN'
144
      INTEGER1←1000⍴2
      ⎕SIZE 'INTEGER1'
1016
      INTEGER2←1000⍴128
      ⎕SIZE 'INTEGER2'
2016
      INTEGER4←1000⍴32768
      ⎕SIZE 'INTEGER4'
4016
      FLOAT8←1000⍴0.1
      ⎕SIZE 'FLOAT8'
8016

Page 7: APL Optimization Techniques

The Default APL Number 0

      X←1000⍴0
      ⎕SIZE 'X'
144
      X←1000⍴0.1-0.1
      ⎕SIZE 'X'
144
      X←1000⍴0×0.1
      ⎕SIZE 'X'
144
      X←1000↑0⍴0.1
      ⎕SIZE 'X'
144
      X←0×1000⍴0.1
      ⎕SIZE 'X'
144

Page 8: APL Optimization Techniques

How Do You Create a Vector of Integer Zeros or a Vector of Floating Point Zeros?

      F64_0←1⊃11 645 ⎕DR 1000⍴0        ⍝ Floating pt # 0
      ⎕SIZE 'F64_0'
8016
      B32_999←1⊃163 323 ⎕DR 1000⍴999   ⍝ Binary-32 # 999
      ⎕SIZE 'B32_999'
4032
      B16_2←1⊃83 163 ⎕DR 1000⍴2        ⍝ Binary-16 # 2
      ⎕SIZE 'B16_2'
2032
      B8_0←1⊃11 83 ⎕DR 1000⍴0          ⍝ Binary-8 # 0
      ⎕SIZE 'B8_0'
1016
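The two-number left arguments of ⎕DR on this slide are Dyalog data representation codes (from type, to type). As a small sketch, assuming a Dyalog APL session, monadic ⎕DR reports the code of the type an array currently has, which makes the compaction shown on the earlier slides visible without ⎕SIZE:

```apl
      ⎕DR 1000⍴1       ⍝ 1-bit Boolean
11
      ⎕DR 1000⍴2       ⍝ 8-bit integer
83
      ⎕DR 1000⍴128     ⍝ 16-bit integer
163
      ⎕DR 1000⍴32768   ⍝ 32-bit integer
323
      ⎕DR 1000⍴0.1     ⍝ 64-bit floating point
645
```

The codes 83, 163, 323 and 645 are the same ones the DCL function selects for 8, 16, 32 and 64 bits.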

Page 9: APL Optimization Techniques

Declaring Numbers Using a Defined Function to Preserve Numeric Type

      F64←64 DCL 1000⍴0     ⍝ Floating pt # 0
      ⎕SIZE 'F64'
8016
      I32←32 DCL 1000⍴999   ⍝ Binary-32 # 999
      ⎕SIZE 'I32'
4032
      I16←16 DCL 1000⍴2     ⍝ Binary-16 # 2
      ⎕SIZE 'I16'
2032
      I8←8 DCL 1000⍴0       ⍝ Binary-8 # 0
      ⎕SIZE 'I8'
1016

Page 10: APL Optimization Techniques

The DCL (Declare) Function

    ∇ Z←X DCL Y;D;R
      ⍝ Declare a floating point or integer array so that each
      ⍝ item occupies the number of bits requested by the X argument
      ⍝ X: # of bits that each number in the array will occupy
      ⍝     8 for  8-bit (1-byte) integer (¯128 to 127)
      ⍝    16 for 16-bit (2-byte) integer (¯32768 to 32767)
      ⍝    32 for 32-bit (4-byte) integer (¯2147483648 to 2147483647)
      ⍝    64 for 64-bit (8-byte) floating point #
      ⍝ Y: Numeric array declared
      ⍝ Z: Numeric array that occupies the space you requested

      D←⎕DR Y              ⍝ Current data type of Y
      :Select ⍬⍴X
      :Case 8 ⋄ R←83
      :Case 16 ⋄ R←163
      :Case 32 ⋄ R←323
      :Case 64 ⋄ R←645
      :Else ⋄ ∘            ⍝ Stop if requested data type not supported
      :EndSelect
      →(D>R)↑'∘'           ⍝ Stop if numeric overflow
      Z←1⊃(D,R)⎕DR Y       ⍝ Convert to requested data type
    ∇

Page 11: APL Optimization Techniques

Initialization as Intended

For more accurate initialization:

      (500 10⍴' ')⎕FAPPEND TIE          ⍝ Component 1
      (64 DCL 500 4⍴0)⎕FAPPEND TIE      ⍝ Component 2
      (500 20⍴' ')⎕FAPPEND TIE          ⍝ Component 3
      (32 DCL 500 5⍴0)⎕FAPPEND TIE      ⍝ Component 4
      (500 15⍴' ')⎕FAPPEND TIE          ⍝ Component 5

Page 12: APL Optimization Techniques

Changing the Floating Point 0

      Z1000←64 DCL 1000⍴0   ⍝ 1,000 Floating pt 0
      ⎕SIZE 'Z1000'
8016
      Z2000←2000↑Z1000      ⍝ 2,000 Floating pt 0
      ⎕SIZE 'Z2000'
268
      Z2000←64 DCL 2000⍴0   ⍝ 2,000 Floating pt 0
      ⎕SIZE 'Z2000'
16016

Page 13: APL Optimization Techniques

Precaution

Do not change a Declared array and then re-use it. If you need another similar array but of different dimensions, you should declare the new one from scratch.

Reason:

The internal representation of the result R←X ⎕DR Y is guaranteed to remain unmodified until it is re-assigned (or partially re-assigned) with the result of any function (ref: Dyalog APL Reference Manual, Chapter 6).
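The precaution can be checked directly in a session. The lines below are only a sketch, assuming Dyalog APL; they use ⎕DR inline instead of the DCL function so that nothing else has to be defined. No output is shown, because the type reported after Take may depend on the interpreter version; going by the ⎕SIZE figures on the previous slide, the second ⎕DR is expected to report a narrower type than the first.

```apl
      Z←1⊃11 645 ⎕DR 1000⍴0   ⍝ declared as 64-bit floating point zeros
      ⎕DR Z                   ⍝ type of the declared array
      ⎕DR 2000↑Z              ⍝ type after Take has built a new array from it
```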

Page 14: APL Optimization Techniques

Storing Numbers in a Native File


Page 15: APL Optimization Techniques

Storing Numbers as Characters

Blanks and commas are the most frequently used separators for numbers stored in a text file. Index Generator is also frequently used.

      N1←'40001 40002'
      N2←'40001,40002'
      N3←'40000+⍳2'

The character strings are executed to retrieve the numbers:

      :For I :In ⍳10000
          X←⍎N1   ⍝ Elapsed time = 72 ms
          Y←⍎N2   ⍝ Elapsed time = 89 ms
          Z←⍎N3   ⍝ Elapsed time = 94 ms
      :EndFor
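As a quick sanity check, the three text forms should execute to the same numeric vector. This is a minimal sketch, assuming a Dyalog APL session with index origin 1 (the third string depends on ⎕IO):

```apl
      ⎕IO←1                             ⍝ assumed index origin
      (⍎'40001 40002')≡⍎'40001,40002'   ⍝ blank-separated vs comma-separated
1
      (⍎'40001 40002')≡⍎'40000+⍳2'      ⍝ blank-separated vs index generated
1
```

Only the cost of executing the text differs, which is what the timings on this slide measure.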

Page 16: APL Optimization Techniques

Storing 1,000 Numbers as Characters

      N←4000+(1500⍴1 1 0)/⍳1500

      N1←⍕N                                           ⍝ 4001 4002 4004 4005 ... space separated

      N2←N1
      ((N2=' ')/N2)←','                               ⍝ 4001,4002,4004,4005,... comma separated

      N3←¯1↓,'(',(⍕⍪¯1+(1000⍴1 0)/N),500 5⍴'+⍳2),'   ⍝ (4000+⍳2),(4003+⍳2),... comma separated, index generated

      :For I :In ⍳100
          X←⍎N1   ⍝ Run Time  96 ms
          Y←⍎N2   ⍝ Run Time 661 ms
          Z←⍎N3   ⍝ Run Time 504 ms
      :EndFor

Page 17: APL Optimization Techniques

Space Wasted by Trailing Blanks

Character matrix with 2 records:

2 9 1 1 0 2 9 1 0 6 2 9 9 1 1 1 2 9 1 1 3 2 9 1 1 5 2 9 1 1 4
2 9 2 4 6

Record 1 can be compressed a little by the Index Generator so that record 2 has fewer trailing blanks:

( 2 9 1 0 9 + ⍳ 2 ) , 2 9 1 0 6 , ( 2 9 1 1 2 + ⍳ 2 ) , 2 9 1 1 5
2 9 2 4 6

But in a nested vector, record 2 naturally has no trailing blanks:

2 9 1 1 0 2 9 1 0 6 2 9 9 1 1 1 2 9 1 1 3 2 9 1 1 5 2 9 1 1 4
2 9 2 4 6

Page 18: APL Optimization Techniques

File I/O Optimization Suggestions

• Use the DCL function to Declare arrays to initialize the numeric components of a component file, otherwise the numeric components are under-allocated in size and the component file becomes fragmented too quickly.

• To store purely numeric data in a native file, do not use commas to separate the numbers, even though CSV format is very popular, because APL executes every comma as a primitive function when the text is executed.


Page 19: APL Optimization Techniques

Outer Product


Page 20: APL Optimization Techniques

Replacing Outer Product by Indexing

      ⎕WA
2656824552
      X←1≠+/D∘.=D←⍳33000
LIMIT ERROR

      ⎕WA
270924                                 ⍝ 10,000 times smaller WS
      X←D∊((⍳⍴D)≠D⍳D)/D←⍳33000         ⍝ No LIMIT ERROR

      Y←⍳32000
      :For I :In ⍳5
          L←1≠+/Y∘.=Y                  ⍝ 21724 ms
          M←Y∊((⍳⍴Y)≠Y⍳Y)/Y            ⍝ 20 ms, 1,000 times faster
      :EndFor
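Both expressions flag the items of a vector that occur more than once. A small worked example, as a sketch in a Dyalog APL session with an illustrative vector D chosen here, shows that they agree:

```apl
      D←3 1 4 1 5 9 2 6 5 3
      1≠+/D∘.=D               ⍝ outer product: builds a 10 by 10 Boolean matrix
1 1 0 1 1 0 0 0 1 1
      D∊((⍳⍴D)≠D⍳D)/D         ⍝ indexing: only vectors as long as D are built
1 1 0 1 1 0 0 0 1 1
```

The outer product needs space proportional to the square of the length of D, while the index-of version needs space proportional to the length, which is why only the first form runs into LIMIT ERROR.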

Page 21: APL Optimization Techniques

Replacing Outer Product by Simple Logic

      M←100000↑50000⍴⍳13
      ⎕WA
1397828
      L←1≠×/×M∘.-1 12
WS FULL

      ⎕WA
37832                        ⍝ 40 times smaller WS
      L←(M≥1)^M≤12           ⍝ No WS FULL

      M←100000↑50000⍴⍳13
      :For I :In ⍳1000
          L←1≠×/×M∘.-1 12    ⍝ 9210 ms
          N←(M≥1)^M≤12       ⍝ 813 ms, 10 times faster
      :EndFor
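The two range tests give the same Boolean result. Here is a minimal sketch in a Dyalog APL session, using a short illustrative vector M that covers values below, on, inside and above the bounds:

```apl
      M←0 1 5 12 13
      1≠×/×M∘.-1 12     ⍝ sign of (M-1) times sign of (M-12) is 1 only outside 1..12
0 1 1 1 0
      (M≥1)∧M≤12        ⍝ the same test written as two comparisons
0 1 1 1 0
```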

Page 22: APL Optimization Techniques

Replacing Outer Product by a Loop

      A←32800?32800
      B←20000+32800?32800

      ⎕WA
2047735492
      X←+/((⍳⍴A)∘.≥⍳⍴A)^A∘.<B
LIMIT ERROR

      ⎕WA
405316                                ⍝ 5,000 times smaller workspace
      X←⍬
      :For I :In ⍳⍴B
          X,←+/A[I]<I↑B
      :EndFor                         ⍝ No LIMIT ERROR

      :For J :In ⍳10
          X←+/((⍳⍴A)∘.≥⍳⍴A)^A∘.<B     ⍝ 75810 ms
          Y←⍬
          :For I :In ⍳⍴B
              Y,←+/A[I]<I↑B
          :EndFor                     ⍝ 26422 ms, 3 times faster
      :EndFor
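To see that the loop computes the same counts as the outer product, the loop can be wrapped in a small function and compared on short vectors. This is a sketch only: the function name COUNT and the sample data are made up for illustration, and index origin 1 is assumed (with ⎕IO←0 the loop would take one item fewer from B).

```apl
      ⎕IO←1                         ⍝ assumed index origin
    ∇ Y←A COUNT B;I
      ⍝ For each I, count the items among the first I items of B that exceed A[I]
      Y←⍬
      :For I :In ⍳⍴B
          Y,←+/A[I]<I↑B
      :EndFor
    ∇
      A←3 1 4 1 5
      B←2 7 1 8 2
      A COUNT B
0 2 1 3 2
      +/((⍳⍴A)∘.≥⍳⍴A)∧A∘.<B         ⍝ outer product version
0 2 1 3 2
```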

Page 23: APL Optimization Techniques

Inner Product


Page 24: APL Optimization Techniques

Matrix on the (wrong) Side of the Expression Requiring a Matrix Transpose

      'ABC'^.=⍉((1↑⍴D),3)↑D        ⍝ Transpose needed (“one less pair of parentheses”)

      (((1↑⍴D),3)↑D)^.='ABC'       ⍝ Transpose not needed

Page 25: APL Optimization Techniques

Transposed Inner Product

VECTOR^.=⍉MATRIX vs MATRIX^.=VECTOR

      Y←10000 6⍴⎕A
      :For I :In ⍳10000
          L←'EFGHIJ'^.=⍉Y    ⍝ 14561 ms
          M←Y^.='EFGHIJ'     ⍝ 2302 ms
      :EndFor
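The two forms return the same result; only the transpose is extra work. A minimal sketch in a Dyalog APL session, with a 3-row matrix made up for illustration:

```apl
      Y←3 6⍴'EFGHIJABCDEFEFGHIJ'   ⍝ rows: EFGHIJ, ABCDEF, EFGHIJ
      'EFGHIJ'∧.=⍉Y                ⍝ vector on the left, matrix transposed
1 0 1
      Y∧.='EFGHIJ'                 ⍝ matrix on the left, no transpose
1 0 1
```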

Page 26: APL Optimization Techniques

Array Comparisons


Page 27: APL Optimization Techniques

Comparing Array Contents with a scalar

      M←1000 1000⍴⎕AV

      ^/M^.=' '
or
      ^/^/M=' '
or
      M≡(⍴M)⍴' '
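All three expressions answer the same question: is every item of M a blank? A small sketch in a Dyalog APL session, using a 3 by 4 matrix chosen here for illustration, shows them agreeing on an all-blank matrix and again after one item is changed:

```apl
      M←3 4⍴' '
      (∧/M∧.=' ')(∧/∧/M=' ')(M≡(⍴M)⍴' ')
1 1 1
      M[2;3]←'x'                           ⍝ one non-blank item
      (∧/M∧.=' ')(∧/∧/M=' ')(M≡(⍴M)⍴' ')
0 0 0
```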

Page 28: APL Optimization Techniques

Character Comparison Efficiency

      M←1000 1000⍴⎕AV
      :For I :In ⍳10000
          {}^/M^.=' '      ⍝ 9108 ms
          {}^/^/M=' '      ⍝ 9060 ms
          {}M≡(⍴M)⍴' '     ⍝ 587 ms
      :EndFor

Page 29: APL Optimization Techniques

Numeric Comparison Efficiency

      M←1000 1000⍴⍳10000
      :For I :In ⍳10000
          {}^/M^.=0        ⍝ 12254 ms
          {}^/^/M=0        ⍝ 12201 ms
          {}M≡(⍴M)⍴0       ⍝ 52 ms
      :EndFor

Page 30: APL Optimization Techniques

Comparing Vectors

      A←10000?10000
      B←10000?10000

      C←A^.=B
      C←A≡B

      :For I :In ⍳10000
          {}A^.=B    ⍝ 1244 ms
          {}A≡B      ⍝ 135 ms
      :EndFor

Page 31: APL Optimization Techniques

Comparing Vectors of Unequal Lengths

      A←10000?10000
      B←9999?9999

      C←A^.=B
LENGTH ERROR
      C←A^.=B
         ^

Page 32: APL Optimization Techniques

Comparing Vectors of Unequal Lengths

To avoid LENGTH ERROR:

      L←(⍴A)⌈⍴B
      C←(L↑A)^.=L↑B
or
      :If C←(⍴A)=⍴B
      :AndIf C←A^.=B
      :EndIf
or
      C←A≡B
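One point worth checking when choosing between these alternatives: Take pads the shorter vector with fill items (zeros for numbers), so the first alternative compares padded copies rather than the original vectors. The sketch below, in a Dyalog APL session with illustrative data, shows a case where the padded comparison and Match disagree:

```apl
      A←1 2 0
      B←1 2
      L←(⍴A)⌈⍴B
      (L↑A)∧.=L↑B     ⍝ B is padded to 1 2 0, so the padded vectors are equal
1
      A≡B             ⍝ Match also compares the shapes
0
```

If vectors of different lengths must never be treated as equal, the second and third alternatives are the safer choice.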

Page 33: APL Optimization Techniques

Checking the Return Code of a Function

Nowadays, many functions are written such that a 2-item nested vector is returned where one item contains the result and another item contains the return code.

But there are still many functions written such that the result returned can be either the data or the return code.

E.g. if ¯1 returned by a function means an error has occurred, then we need to be very careful with the use of the ∊ membership function.

      →(¯1∊DATA←FUNCTION_1)/ERR
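A short illustration of why membership needs care here, as a sketch in a Dyalog APL session: the DATA value is hypothetical, standing for a valid result that happens to contain ¯1.

```apl
      DATA←10 ¯1 30     ⍝ hypothetical valid data, not an error code
      ¯1∊DATA           ⍝ membership reports an error that did not happen
1
      ¯1≡DATA           ⍝ match is true only when the whole result is ¯1
0
      ¯1≡¯1             ⍝ the genuine error case
1
```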

Page 34: APL Optimization Techniques

Example of Function Return Code

A popular IBM APL utility function to read a text file is called ∆FM (File Matrix I/O). When ∆FM reads a text file and encounters an error, instead of returning the data, it returns an error code of 28.

Thus many programmers would write the text file I/O code in the following way:

      →(28∊DATA←∆FM 'file.csv')/ERR

Page 35: APL Optimization Techniques

Example of Return Code Inefficiency

      Y←∆FM 'file.csv'
      ⎕SIZE 'Y'
9979076
      ⍴Y
72312 138

      :For I :In ⍳1000
          {}28∊Y     ⍝ 3208521 ms
          {}28≡Y     ⍝ 4 ms
      :EndFor

Page 36: APL Optimization Techniques

CPU Optimization Suggestions

When an elegant outer product generates a sparse matrix that causes LIMIT ERROR, WS FULL, or computational slow down, replace the outer product by a simpler but not so elegant expression.

Example of code elegance: 1≠×/×M∘.-1 12 vs (M≥1)^M≤12

Try to avoid unnecessary transpose of a matrix when you perform an inner product of a matrix with a vector.

Remember that in some cases, the match function can run much faster than the inner product or the membership function.


Page 37: APL Optimization Techniques

The End

Eugene Ying
Fiserv, Inc.