162
© 2014 Apple Inc. All rights reserved. Redistribution or public display not permitted without written permission from Apple. #WWDC14 What’s New in the Accelerate Framework Session 703 Geoff Belter Engineer, Vector and Numerics Group Core OS

What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

© 2014 Apple Inc. All rights reserved. Redistribution or public display not permitted without written permission from Apple.

#WWDC14

What’s New in the Accelerate Framework

Session 703 Geoff Belter Engineer, Vector and Numerics Group

Core OS

Page 2: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

What is Available?

Image processing (vImage)

Digital signal processing (vDSP)

Math functions (vForce, vMathLib, vBigNum)

Linear algebra (LAPACK, BLAS)

Page 3: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

What is the Accelerate Framework?

High performance • Fast

• Energy efficient

OS X and iOS

All generations of hardware

Page 4: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Session Goals

New features in vImage

Introduce

• LinearAlgebra

• <simd/simd.h>

Page 5: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

vImageHigh performance image processing

Page 6: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

vImage Some things you can do

Page 7: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

vImage Some things you can do

Page 8: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Getting Data into vImageCGImageRef

Page 9: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Getting Data into vImageCGImageRef

// CGImageRef —-> vImage_BuffervImage_Buffer buf;vImage_CGImageFormat fmt = { .bitsPerComponent = 8, .bitsPerPixel = 32, … };vImage_Error err = vImageBuffer_initWithCGImage( &buf, &fmt, NULL, cgImage, kvImageNoFlags );

Page 10: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Getting Data into vImageCGImageRef

// CGImageRef —-> vImage_BuffervImage_Buffer buf;vImage_CGImageFormat fmt = { .bitsPerComponent = 8, .bitsPerPixel = 32, … };vImage_Error err = vImageBuffer_initWithCGImage( &buf, &fmt, NULL, cgImage, kvImageNoFlags );

// vImage_Buffer —-> CGImageRefcgImage = vImageCreateCGImageFromBuffer( &buf, &fmt, NULL, NULL, kvImageNoFlags, &err );

Page 11: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Conversion SupportvImageConvert_AnyToAny

// 1) Make converter: srcFormat —-> destFormatvImage_CGImageFormat srcFormat = { .bitsPerComponent = 8, … };vImage_CGImageFormat destFormat = { .bitsPerComponent = 16, … }; vImageConverterRef c = vImageConverter_CreateWithCGImageFormat( &srcFormat, &destFormat, NULL, kvImageNoFlags, &err); !

// 2) ConvertvImage_Buffer srcBuf = {…}, destBuf = {…};err = vImageConvert_AnyToAny(c, &srcBuf, &destBuf, NULL, flags);

Page 12: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Conversion SupportvImageConvert_AnyToAny

// 1) Make converter: srcFormat —-> destFormatvImage_CGImageFormat srcFormat = { .bitsPerComponent = 8, … };vImage_CGImageFormat destFormat = { .bitsPerComponent = 16, … }; vImageConverterRef c = vImageConverter_CreateWithCGImageFormat( &srcFormat, &destFormat, NULL, kvImageNoFlags, &err); !

// 2) ConvertvImage_Buffer srcBuf = {…}, destBuf = {…};err = vImageConvert_AnyToAny(c, &srcBuf, &destBuf, NULL, flags);

Page 13: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Conversion SupportvImageConvert_AnyToAny

// 1) Make converter: srcFormat —-> destFormatvImage_CGImageFormat srcFormat = { .bitsPerComponent = 8, … };vImage_CGImageFormat destFormat = { .bitsPerComponent = 16, … }; vImageConverterRef c = vImageConverter_CreateWithCGImageFormat( &srcFormat, &destFormat, NULL, kvImageNoFlags, &err); !

// 2) ConvertvImage_Buffer srcBuf = {…}, destBuf = {…};err = vImageConvert_AnyToAny(c, &srcBuf, &destBuf, NULL, flags);

Page 14: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Conversion SupportvImageConvert_AnyToAny

// 1) Make converter: srcFormat —-> destFormatvImage_CGImageFormat srcFormat = { .bitsPerComponent = 8, … };vImage_CGImageFormat destFormat = { .bitsPerComponent = 16, … }; vImageConverterRef c = vImageConverter_CreateWithCGImageFormat( &srcFormat, &destFormat, NULL, kvImageNoFlags, &err); !

// 2) ConvertvImage_Buffer srcBuf = {…}, destBuf = {…};err = vImageConvert_AnyToAny(c, &srcBuf, &destBuf, NULL, flags);

Page 15: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

What You Had to Say

“functions that convert vImage_Buffer objects to CGImageRef objects and back 👍👍👍👍👍👍” !

!

Twitter user

Page 16: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

What You Had to Say

“functions that convert vImage_Buffer objects to CGImageRef objects and back 👍👍👍👍👍👍” !

!

Twitter user

“vImageConvert_AnyToAny is magical. Threaded and vectorized conversion between nearly any two pixel formats.”

Twitter user

Page 17: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Video—RGB, Grayscale, and Y’CbCrCVPixelBufferRef (a video frame)

Page 18: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Video—RGB, Grayscale, and Y’CbCrCVPixelBufferRef (a video frame)

// CVPixelBufferRef —-> vImageBuffervImageBuffer_InitWithCVPixelBuffer( &buf, &desiredFormat, cvPixelBuffer, NULL, NULL, kvImageNoFlags );

Page 19: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Video—RGB, Grayscale, and Y’CbCrCVPixelBufferRef (a video frame)

// CVPixelBufferRef —-> vImageBuffervImageBuffer_InitWithCVPixelBuffer( &buf, &desiredFormat, cvPixelBuffer, NULL, NULL, kvImageNoFlags );

// vImageBuffer —-> CVPixelBufferRefvImageBuffer_CopyToCVPixelBuffer( &buf, &bufFormat, cvPixelBuffer, NULL, NULL, kvImageNoFlags );

Page 20: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Getting video into vImage

Lower level interfaces • 41 video conversions

• Manage chroma siting, transfer function, conversion matrix, etc.

• RGB colorspaces for video formats

vImageConvert_AnyToAny() for video - vImageConverter_CreateForCGtoCVImageFormat - vImageConverter_CreateForCVtoCGImageFormat

Page 21: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

VideoToolbox PerformanceHigher is better

MPi

xels

/sec

ond

750

1500

2250

3000

420v yuvs y420 v210

1219

1869 1739

2035

0 460

344 458

OS X 10.9OS X 10.10 (with vImage)

Page 22: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

LinearAlgebra (LA)Simple to use high performance

Page 23: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Solving System of Linear EquationsWith LAPACK

Given the system A, and the right-hand-side B, how do you find X (AX = B)?

Page 24: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Solving System of Linear EquationsWith LAPACK

Given the system A, and the right-hand-side B, how do you find X (AX = B)?

!

__CLPK_integer n = matrix_size; __CLPK_integer nrhs = number_right_hand_sides; __CLPK_integer lda = column_stride_A; __CLPK_integer ldb = column_stride_B; __CLPK_integer *ipiv = malloc(sizeof(__CLPK_integer)*n); __CLPK_integer info; sgesv_(&n, &nrhs, A, &lda, ipiv, B, &ldb, &info); free(ipiv);

Page 25: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Solving System of Linear EquationsWith LA

Given the system A, and the right-hand-side B, how do you find X (AX = B)?

Page 26: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Solving System of Linear EquationsWith LA

Given the system A, and the right-hand-side B, how do you find X (AX = B)?

la_object_t X = la_solve(A,B);

Page 27: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

LinearAlgebra

New in iOS 8.0 and OS X Yosemite

Page 28: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

LinearAlgebra

New in iOS 8.0 and OS X Yosemite

Simple with good performance

Page 29: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

LinearAlgebra

New in iOS 8.0 and OS X Yosemite

Simple with good performance

Single and double precision

Page 30: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

LinearAlgebra

New in iOS 8.0 and OS X Yosemite

Simple with good performance

Single and double precision

Native Objective-C Object

Page 31: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

What’s Available?

Element-wise arithmetic

Matrix product

Transpose

Norms / normalization

Linear systems

Slice

Splat

Page 32: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

LA Objects

Page 33: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

LA Objects

Reference counted opaque objects • Objective-C objects when appropriate

Page 34: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

LA Objects

Reference counted opaque objects • Objective-C objects when appropriate

Managed for you • Data buffer/memory

• Dimension details

• Errors and warnings

• Scalar data type (float/double)

Page 35: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Memory Management

Page 36: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Memory Management

C

la_release() la_retain()

la_object_t A, B, C; // create A and B C = la_sum(A,B); la_release(A); la_release(B); // use C la_release(C);

Page 37: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Memory Management

C

la_release() la_retain()

la_object_t A, B, C; // create A and B C = la_sum(A,B); la_release(A); la_release(B); // use C la_release(C);

Objective-C no ARC

-[release] -[retain]

la_object_t A, B, C; // create A and B C = la_sum(A,B); [A release]; [B release]; // use C [C release];

Page 38: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Memory Management

C

la_release() la_retain()

la_object_t A, B, C; // create A and B C = la_sum(A,B); la_release(A); la_release(B); // use C la_release(C);

Objective-C no ARC

-[release] -[retain]

la_object_t A, B, C; // create A and B C = la_sum(A,B); [A release]; [B release]; // use C [C release];

Objective-C with ARC

la_object_t A, B, C; // create A and B C = la_sum(A,B); !!// use C

Page 39: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Buffer to LA ObjectWith copy

double *A = malloc(sizeof(double) * num_rows * row_stride); // Fill A as row major matrix !

// Data copied into object, user still responsible for A la_object_t Aobj = la_matrix_from_double_buffer(A, num_rows, num_cols, row_stride, LA_NO_HINT, LA_DEFAULT_ATTRIBUTES); !

// User retains all rights to A. User must clean up A free(A);

Page 40: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Buffer to LA ObjectWith copy

double *A = malloc(sizeof(double) * num_rows * row_stride); // Fill A as row major matrix !

// Data copied into object, user still responsible for A la_object_t Aobj = la_matrix_from_double_buffer(A, num_rows, num_cols, row_stride, LA_NO_HINT, LA_DEFAULT_ATTRIBUTES); !

// User retains all rights to A. User must clean up A free(A);

Page 41: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Buffer to LA ObjectWith copy

double *A = malloc(sizeof(double) * num_rows * row_stride); // Fill A as row major matrix !

// Data copied into object, user still responsible for A la_object_t Aobj = la_matrix_from_double_buffer(A, num_rows, num_cols, row_stride, LA_NO_HINT, LA_DEFAULT_ATTRIBUTES); !

// User retains all rights to A. User must clean up A free(A);

Page 42: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Buffer to LA ObjectWith copy

double *A = malloc(sizeof(double) * num_rows * row_stride); // Fill A as row major matrix !

// Data copied into object, user still responsible for A la_object_t Aobj = la_matrix_from_double_buffer(A, num_rows, num_cols, row_stride, LA_NO_HINT, LA_DEFAULT_ATTRIBUTES); !

// User retains all rights to A. User must clean up A free(A);

Page 43: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Hints

la_object_t o = la_matrix_from_double_buffer(A, num_rows, num_cols, row_stride, LA_SHAPE_DIAGONAL, LA_DEFAULT_ATTRIBUTES);

Page 44: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Hints

la_object_t o = la_matrix_from_double_buffer(A, num_rows, num_cols, row_stride, LA_SHAPE_DIAGONAL, LA_DEFAULT_ATTRIBUTES);

Allow for better performance

Page 45: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Hints

la_object_t o = la_matrix_from_double_buffer(A, num_rows, num_cols, row_stride, LA_SHAPE_DIAGONAL, LA_DEFAULT_ATTRIBUTES);

Allow for better performance

Insight about the data buffer • Diagonal

• Triangular

• Symmetric

• Positive Definite

Page 46: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Lazy Evaluation

la_object_t foo(la_object_t A, la_object_t x) { // At = A’ la_object_t At = la_transpose(A); !

// sum odd elements of x to even elements of x la_object_t x2 = la_sum(la_vector_slice(x,0,2,la_vector_length(x)/2), la_vector_slice(x,1,2,la_vector_length(x)/2)); !

// Atx2 = A’ * x2 * 3.2 la_object_t Atx2 = la_scale_with_float(la_matrix_product(At,x2), 3.2f); if (la_status(Atx2) < 0) { // error } return Atx2 }

Page 47: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Lazy Evaluation

la_object_t foo(la_object_t A, la_object_t x) { // At = A’ la_object_t At = la_transpose(A); !

// sum odd elements of x to even elements of x la_object_t x2 = la_sum(la_vector_slice(x,0,2,la_vector_length(x)/2), la_vector_slice(x,1,2,la_vector_length(x)/2)); !

// Atx2 = A’ * x2 * 3.2 la_object_t Atx2 = la_scale_with_float(la_matrix_product(At,x2), 3.2f); if (la_status(Atx2) < 0) { // error } return Atx2 }

A x

Page 48: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Lazy Evaluation

la_object_t foo(la_object_t A, la_object_t x) { // At = A’ la_object_t At = la_transpose(A); !

// sum odd elements of x to even elements of x la_object_t x2 = la_sum(la_vector_slice(x,0,2,la_vector_length(x)/2), la_vector_slice(x,1,2,la_vector_length(x)/2)); !

// Atx2 = A’ * x2 * 3.2 la_object_t Atx2 = la_scale_with_float(la_matrix_product(At,x2), 3.2f); if (la_status(Atx2) < 0) { // error } return Atx2 }

At

A x

Page 49: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Lazy Evaluation

la_object_t foo(la_object_t A, la_object_t x) { // At = A’ la_object_t At = la_transpose(A); !

// sum odd elements of x to even elements of x la_object_t x2 = la_sum(la_vector_slice(x,0,2,la_vector_length(x)/2), la_vector_slice(x,1,2,la_vector_length(x)/2)); !

// Atx2 = A’ * x2 * 3.2 la_object_t Atx2 = la_scale_with_float(la_matrix_product(At,x2), 3.2f); if (la_status(Atx2) < 0) { // error } return Atx2 }

x.odd x.even

x2

At

A x

Page 50: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Atx2

3.2

Lazy Evaluation

la_object_t foo(la_object_t A, la_object_t x) { // At = A’ la_object_t At = la_transpose(A); !

// sum odd elements of x to even elements of x la_object_t x2 = la_sum(la_vector_slice(x,0,2,la_vector_length(x)/2), la_vector_slice(x,1,2,la_vector_length(x)/2)); !

// Atx2 = A’ * x2 * 3.2 la_object_t Atx2 = la_scale_with_float(la_matrix_product(At,x2), 3.2f); if (la_status(Atx2) < 0) { // error } return Atx2 }

x.odd x.even

x2

At

A x

Page 51: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Lazy EvaluationDetails

No computation

No data buffer allocation

Triggered by - la_matrix_to_float_buffer - la_matrix_to_double_buffer - la_vector_to_float_buffer - la_vector_to_double_buffer

Page 52: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Performance Comparison

Netlib BLAS • Open source

Page 53: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Performance ComparisonHigher is better

GFL

OPS

12.5

25

37.5

50

Matrix Size

32 160 288 416 544 672 800 928 1024

LAAccelerate BLASNetlib BLAS

Page 54: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Performance ComparisonHigher is better

GFL

OPS

12.5

25

37.5

50

Matrix Size

32 160 288 416 544 672 800 928 1024

LAAccelerate BLASNetlib BLAS

Page 55: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Performance ComparisonHigher is better

GFL

OPS

12.5

25

37.5

50

Matrix Size

32 160 288 416 544 672 800 928 1024

LAAccelerate BLASNetlib BLAS

Page 56: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Performance ComparisonHigher is better

GFL

OPS

12.5

25

37.5

50

Matrix Size

32 160 288 416 544 672 800 928 1024

LAAccelerate BLASNetlib BLAS

Page 57: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Error Handling

la_object_t AB = la_matrix_product( A, la_transpose(B) ); if (la_status(AB) < 0) { // handle error } !

la_object_t result = la_sum( AB, la_scale_with_float( C, 3.2f ) ); if (la_status(result) < 0) { // handle error } !

la_status_t status = la_matrix_to_float_buffer(buffer, leading_dim, result);

Page 58: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Error Handling

la_object_t AB = la_matrix_product( A, la_transpose(B) ); if (la_status(AB) < 0) { // handle error } !

la_object_t result = la_sum( AB, la_scale_with_float( C, 3.2f ) ); if (la_status(result) < 0) { // handle error } !

la_status_t status = la_matrix_to_float_buffer(buffer, leading_dim, result);

Page 59: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Error Handling

la_object_t AB = la_matrix_product( A, la_transpose(B) ); !

!

la_object_t result = la_sum( AB, la_scale_with_float( C, 3.2f ) ); !

!

la_status_t status = la_matrix_to_float_buffer(buffer, leading_dim, result); if (status == LA_SUCCESS) { // No errors, buffer is filled with good data. } else if (status > 0) { // No errors occurred, but result does not have full accuracy. } else { // An error occurred. assert(0); }

Page 60: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Debugging

Enable logging • LA_ATTRIBUTE_ENABLE_LOGGING

Page 61: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Debugging

Enable logging • LA_ATTRIBUTE_ENABLE_LOGGING

Error log la_object_t la_sum(la_object_t, la_object_t): LA_DIMENSION_MISMATCH_ERROR: Encountered a dimension mismatch obj_left rows must be equal to obj_right rows; failed comparison: 8 == 9

Page 62: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Solve

la_object_t x = la_solve(A,b);

• If A is square and non-singular, compute the solution x to Ax = b

• If A is square and singular, produce an error

Page 63: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

SlicingWhat is slicing

Light weight access to partial object • No buffer allocations

• No buffer copies

Page 64: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

SlicingWhat is slicing

Light weight access to partial object • No buffer allocations

• No buffer copies

Three pieces of information • Offset

• Stride

• Dimension

Page 65: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

SlicingWhat is slicing

la_vector_slice (vector, 7, // offset -2, // stride

3); // dimension

[ 0 1 2 3 4 5 6 7 8 9 ]

Page 66: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

SlicingWhat is slicing

la_vector_slice (vector, 7, // offset -2, // stride

3); // dimension [ 7 ]

[ 0 1 2 3 4 5 6 7 8 9 ]

Page 67: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

SlicingWhat is slicing

la_vector_slice (vector, 7, // offset -2, // stride

3); // dimension [ 7 ]5

[ 0 1 2 3 4 5 6 7 8 9 ]

Page 68: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

SlicingWhat is slicing

la_vector_slice (vector, 7, // offset -2, // stride

3); // dimension [ 7 ]5 3

[ 0 1 2 3 4 5 6 7 8 9 ]

Page 69: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Slice ExampleTiling

la_object_t A,B,C; // A and B are matrices of dimension MxN !

for (int i = 0; i < 2; ++i) { for (int j = 0; j < 2; ++j) { la_object_t Atile = la_matrix_slice(A,i*M/2,j*N/2,1,1,M/2,N/2); la_object_t Btile = la_matrix_slice(B,i*M/2,j*N/2,1,1,M/2,N/2); C = la_sum(Atile, Btile); // use of C tile } }

Page 70: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Slice ExampleTiling

la_object_t A,B,C; // A and B are matrices of dimension MxN !

for (int i = 0; i < 2; ++i) { for (int j = 0; j < 2; ++j) { la_object_t Atile = la_matrix_slice(A,i*M/2,j*N/2,1,1,M/2,N/2); la_object_t Btile = la_matrix_slice(B,i*M/2,j*N/2,1,1,M/2,N/2); C = la_sum(Atile, Btile); // use of C tile } }

= +A B

C

Page 71: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Slice ExampleTiling

la_object_t A,B,C; // A and B are matrices of dimension MxN !

for (int i = 0; i < 2; ++i) { for (int j = 0; j < 2; ++j) { la_object_t Atile = la_matrix_slice(A,i*M/2,j*N/2,1,1,M/2,N/2); la_object_t Btile = la_matrix_slice(B,i*M/2,j*N/2,1,1,M/2,N/2); C = la_sum(Atile, Btile); // use of C tile } }

= +A B

C

Page 72: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Slice ExampleTiling

la_object_t A,B,C; // A and B are matrices of dimension MxN !

la_object_t sum = la_sum(A,B); !

for (int i = 0; i < 2; ++i) { for (int j = 0; j < 2; ++j) { C = la_matrix_slice(sum,i*M/2,j*N/2,1,1,M/2,N/2); // use of C tile } }

Page 73: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Slice ExampleTiling

la_object_t A,B,C; // A and B are matrices of dimension MxN !

la_object_t sum = la_sum(A,B); !

for (int i = 0; i < 2; ++i) { for (int j = 0; j < 2; ++j) { C = la_matrix_slice(sum,i*M/2,j*N/2,1,1,M/2,N/2); // use of C tile } }

Page 74: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Slice ExampleTiling

la_object_t A,B,C; // A and B are matrices of dimension MxN !

la_object_t sum = la_sum(A,B); !

for (int i = 0; i < 2; ++i) { for (int j = 0; j < 2; ++j) { C = la_matrix_slice(sum,i*M/2,j*N/2,1,1,M/2,N/2); // use of C tile } }

sum=C

Page 75: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Slice ExampleTiling

la_object_t A,B,C; // A and B are matrices of dimension MxN !

la_object_t sum = la_sum(A,B); !

for (int i = 0; i < 2; ++i) { for (int j = 0; j < 2; ++j) { C = la_matrix_slice(sum,i*M/2,j*N/2,1,1,M/2,N/2); // use of C tile } }

sum=C = +

A B

Page 76: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

SplatWhat is splat

la_splat_from_float(5.0f );

Page 77: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

SplatWhat is splat

la_splat_from_float(5.0f );

Page 78: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

SplatWhat is splat

la_splat_from_float(5.0f );

Add 2 to every element of a vector • la_sum(vector, la_splat_from_double(2.0));

Page 79: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

LinearAlgebraSummary

Simple API

Modern language and run-time features

Good performance

Page 80: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

LINPACKThe joy of benchmarking

Stephen Canon Engineer, Vector and Numerics Group

Page 81: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

How fast can you solve a system of linear equations?

Page 82: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

LINPACK tests both hardware and software

Page 83: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Accelerate vs. “Brand A”

Page 84: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

2013LINPACK performance in GFLOPS (bigger is better)

Accelerate on iPhone 5

“Brand A”

1 2 3 4

Page 85: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

2014LINPACK performance in GFLOPS (bigger is better)

Accelerate on iPhone 5

“Brand A”

1 2 3 4

Page 86: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Let’s find some new competition

Ourselves

Page 87: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Let’s find some new competition

Ourselves

Page 88: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

2014LINPACK performance in GFLOPS (bigger is better)

iPhone 5

2010 MacBook Air

87654321

10.4 GFLOPS

Page 89: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

11

2014LINPACK performance in GFLOPS (bigger is better)

iPhone 5s

2010 MacBook Air

10987654321

10.4 GFLOPS

Page 90: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

10987654321 131211

2014LINPACK performance in GFLOPS (bigger is better)

iPhone 5s

2010 MacBook Air

10.4 GFLOPS

Page 91: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

10987654321 1514131211

2014LINPACK performance in GFLOPS (bigger is better)

iPad Air

2010 MacBook Air

14.6 GFLOPS

Page 92: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

<simd/simd.h>Short vector and matrix math

Page 93: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

<simd/simd.h>

New (iOS 8 and OS X Yosemite) library with three purposes:

Page 94: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

<simd/simd.h>

New (iOS 8 and OS X Yosemite) library with three purposes:• 2D, 3D, and 4D vector math and geometry

Page 95: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

<simd/simd.h>

New (iOS 8 and OS X Yosemite) library with three purposes:• 2D, 3D, and 4D vector math and geometry

• Features of Metal in C, Objective-C, and C++ on the CPU

Page 96: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

<simd/simd.h>

New (iOS 8 and OS X Yosemite) library with three purposes:• 2D, 3D, and 4D vector math and geometry

• Features of Metal in C, Objective-C, and C++ on the CPU

• Abstraction over architecture-specific SIMD types and intrinsics

Page 97: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryWish list

Page 98: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryWish list

Inline implementations

Page 99: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryWish list

Inline implementations

Concise functions without extra parameters

Page 100: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryWish list

Inline implementations

Concise functions without extra parameters

!

!

!

float a = cblas_sdot(3, 1.0f, x, 1, y, 1);

Page 101: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryWish list

Inline implementations

Concise functions without extra parameters

!

!

!

float a = GLKVector3DotProduct(x, y);

Page 102: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryWish list

Inline implementations

Concise functions without extra parameters

!

!

!

float a = vector_dot(x, y);

Page 103: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryWish list

Inline implementations

Concise functions without extra parameters

!

!

using namespace simd; float a = dot(x, y);

Page 104: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryWish list

Inline implementations

Concise functions without extra parameters

Page 105: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryWish list

Inline implementations

Concise functions without extra parameters

Arithmetic should use operators

Page 106: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryWish list

Inline implementations

Concise functions without extra parameters

Arithmetic should use operators

!

!

z = GLKVector4MultiplyScalar(GLKVector4Add(x,y),0.5);

Page 107: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryWish list

Inline implementations

Concise functions without extra parameters

Arithmetic should use operators

!

!

z = 0.5*(x + y);

Page 108: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryWish list

Inline implementations

Concise functions without extra parameters

Arithmetic should use operators

!

!

z = 0.5*(x + y);

Page 109: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryTypes

In C and Objective-C, the primary type is vector_floatN, where N is 2, 3, or 4

In C++ you can use simd::floatN

Based on clang “extended vectors”

Page 110: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryArithmetic

Your favorite arithmetic operators (+,–,*,/) work with both vectors and scalars

Page 111: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryArithmetic

Your favorite arithmetic operators (+,–,*,/) work with both vectors and scalars vector_float3 vector_reflect(vector_float3 x, vector_float3 n) { !

}

Page 112: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryArithmetic

Your favorite arithmetic operators (+,–,*,/) work with both vectors and scalars vector_float3 vector_reflect(vector_float3 x, vector_float3 n) { !

}

x

Page 113: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryArithmetic

Your favorite arithmetic operators (+,–,*,/) work with both vectors and scalars vector_float3 vector_reflect(vector_float3 x, vector_float3 n) { !

}

xn

Page 114: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryArithmetic

Your favorite arithmetic operators (+,–,*,/) work with both vectors and scalars vector_float3 vector_reflect(vector_float3 x, vector_float3 n) { !

}

xn

Page 115: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryArithmetic

Your favorite arithmetic operators (+,–,*,/) work with both vectors and scalars vector_float3 vector_reflect(vector_float3 x, vector_float3 n) { !

}

xn

vector_reflect(x,n)

Page 116: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryArithmetic

Your favorite arithmetic operators (+,–,*,/) work with both vectors and scalars vector_float3 vector_reflect(vector_float3 x, vector_float3 n) { return x - 2*vector_dot(x,n)*n; }

xn

vector_reflect(x,n)

Page 117: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryElements and subvectors

Array subscripting vector_float4 a = { 0, 1, 2, 3 }; float z = x[2]; // z = 2

Page 118: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and GeometryElements and subvectors

Named subvectors vector_float4 a = { 0, 1, 2, 3 }; vector_float2 b = a.lo; // b = { 0, 1 } a.even = -b; // a = { 0, 1,-1, 3 }

Page 119: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and Geometry

<simd/math.h>

<simd/common.h>

<simd/geometry.h>

Page 120: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and Geometry

<simd/math.h>

<simd/common.h>

<simd/geometry.h>

C/Objective-C

fabs(x) sqrt(x) floor(x) sin(x)…vector_clamp(x,min,max) vector_mix(x,y,t) vector_recip(x) vector_step(x,edge)…vector_dot(x,y) vector_length(x) vector_normalize(x) vector_reflect(x,n)…

Page 121: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and Geometry

<simd/math.h>

<simd/common.h>

<simd/geometry.h>

C/Objective-C

fabs(x) sqrt(x) floor(x) sin(x)…vector_clamp(x,min,max) vector_mix(x,y,t) vector_recip(x) vector_step(x,edge)…vector_dot(x,y) vector_length(x) vector_normalize(x) vector_reflect(x,n)…

C++ (and Metal)

fabs(x) sqrt(x) floor(x) sin(x)…clamp(x,min,max) mix(x,y,t) recip(x) step(x,edge)…dot(x,y) length(x) normalize(x) reflect(x,n)…

Page 122: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and Geometry

Some functions have two flavors: “precise” and “fast”

Page 123: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and Geometry

Some functions have two flavors: “precise” and “fast”• “precise” is the default…

Page 124: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and Geometry

Some functions have two flavors: “precise” and “fast”• “precise” is the default…

• … but if you compile with -ffast-math, you get the “fast” versions

Page 125: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and Geometry

Even with -ffast-math, you can call the precise functions by name:

!

float len = vector_precise_length(x);

Page 126: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Vector Math and Geometry

You can call the fast versions by name too:

!

x = fast::normalize(x);

Page 127: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

MatricesC and Objective-C

“matrix_floatNxM”, where N and M are 2, 3, or 4 • N is number of columns, M is number of rows.

Page 128: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

MatricesC and Objective-C

Create matrices matrix_from_diagonal(vector) matrix_from_columns(vector, vector, …) matrix_from_rows(vector, vector, …)

Arithmetic matrix_scale(matrix, scalar) matrix_linear_combination(matrix, scalar, matrix, scalar) matrix_transpose(matrix) matrix_invert(matrix) matrix_multiply(matrix/vector, matrix/vector)

Page 129: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

MatricesC++ (and Metal)

Create matrices float4x4() float2x2(diagonal) float3x4(column0, column1, …)

Arithmetic float3x4 A, B; A -= 2.f * B; float4x3 C = transpose(A) vector3 x; vector4 y = A*x;

Page 130: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDAdditional types

Doubles, signed and unsigned integers

Longer vectors (8, 16, and 32 elements)

Unaligned vectors

Page 131: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDInteger operators and conversions

Page 132: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDInteger operators and conversions

Arithmetic operators: +, -, *, /, %

Page 133: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDInteger operators and conversions

Arithmetic operators: +, -, *, /, %

Bitwise operators: >>, <<, &, |, ^, ~

Page 134: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDInteger operators and conversions

Arithmetic operators: +, -, *, /, %

Bitwise operators: >>, <<, &, |, ^, ~

Conversions: vector_float x; vector_ushort y = vector_ushort(x); vector_char z = vector_char_sat(x);

Page 135: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <= • Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

Page 136: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <= • Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

0.0 1.0 2.0 3.0x

Page 137: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <= • Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

0.0 1.0 2.0 3.0x

0.0 3.14159 –infinity 42.0y

Page 138: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <= • Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

0.0 1.0 2.0 3.0x

0.0 3.14159 –infinity 42.0y

x < y

Page 139: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <= • Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

0x00000000

0.0 1.0 2.0 3.0x

0.0 3.14159 –infinity 42.0y

x < y

Page 140: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <= • Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

0x00000000 0xffffffff

0.0 1.0 2.0 3.0x

0.0 3.14159 –infinity 42.0y

x < y

Page 141: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <= • Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

0x00000000 0xffffffff 0x00000000

0.0 1.0 2.0 3.0x

0.0 3.14159 –infinity 42.0y

x < y

Page 142: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <= • Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

0x00000000 0xffffffff 0x00000000 0xffffffff

0.0 1.0 2.0 3.0x

0.0 3.14159 –infinity 42.0y

x < y

Page 143: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <= • Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

• Type of result usually isn’t important, because you’ll use one of the following: if (vector_any(x < 0)) { /* executed if any lane of x is negative */ } if (vector_all(y != 0)) { /* executed if every lane of y is non-zero */ } z = vector_bitselect(x, y, x > y); /* minimum of x and y */

Page 144: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

String CopyScalar implementation

Page 145: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

String CopyScalar implementation

void string_copy(char *dst, const char *src) { while ((*dst++ = *src++)); }

Page 146: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

String CopySSE intrinsic implementation

void vector_string_copy(char *dst, const char *src) { while ((uintptr_t)src % 16) if ((*dst++ = *src++) == 0) return; while (1) { __m128i data = _mm_load_si128((const __m128i *)src); __m128i contains_zero = _mm_cmpeq_epi8(data, _mm_set1_epi8(0)); if (_mm_movemask_epi8(contains_zero)) break; _mm_storeu_si128((__m128i *)dst, data); src += 16; dst += 16; } string_copy((char *)vec_dst, (const char *)vec_src); }

Page 147: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

String Copy<simd/simd.h> implementation

void vector_string_copy(char *dst, const char *src) { while ((uintptr_t)src % 16) if ((*dst++ = *src++) == 0) return; const vector_char16 *vec_src = (const vector_char16 *)src; packed_char16 *vec_dst = (packed_char16 *)dst; while (!vector_any(*vec_src == 0)) *vec_dst++ = *vec_src++; string_copy((char *)vec_dst, (const char *)vec_src); }

Page 148: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

String Copy<simd/simd.h> implementation

void vector_string_copy(char *dst, const char *src) { while ((uintptr_t)src % 16) if ((*dst++ = *src++) == 0) return; const vector_char16 *vec_src = (const vector_char16 *)src; packed_char16 *vec_dst = (packed_char16 *)dst; while (!vector_any(*vec_src == 0)) *vec_dst++ = *vec_src++; string_copy((char *)vec_dst, (const char *)vec_src); }

Page 149: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

String Copy<simd/simd.h> implementation

void vector_string_copy(char *dst, const char *src) { while ((uintptr_t)src % 16) if ((*dst++ = *src++) == 0) return; const vector_char16 *vec_src = (const vector_char16 *)src; packed_char16 *vec_dst = (packed_char16 *)dst; while (!vector_any(*vec_src == 0)) *vec_dst++ = *vec_src++; string_copy((char *)vec_dst, (const char *)vec_src); }

Page 150: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

String Copy<simd/simd.h> implementation

void vector_string_copy(char *dst, const char *src) { while ((uintptr_t)src % 16) if ((*dst++ = *src++) == 0) return; const vector_char16 *vec_src = (const vector_char16 *)src; packed_char16 *vec_dst = (packed_char16 *)dst; while (!vector_any(*vec_src == 0)) *vec_dst++ = *vec_src++; string_copy((char *)vec_dst, (const char *)vec_src); }

Page 151: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

String Copy<simd/simd.h> implementation

void vector_string_copy(char *dst, const char *src) { while ((uintptr_t)src % 16) if ((*dst++ = *src++) == 0) return; const vector_char16 *vec_src = (const vector_char16 *)src; packed_char16 *vec_dst = (packed_char16 *)dst; while (!vector_any(*vec_src == 0)) *vec_dst++ = *vec_src++; string_copy((char *)vec_dst, (const char *)vec_src); }

Page 152: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

PerformanceHigher is better

Byte

s co

pied

per

nan

osec

ond

1.5

3

4.5

6

String length in bytes

1 2 4 8 16 32 64 128 256 512 1024

Scalar<simd/simd.h>Libc

Page 153: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

PerformanceHigher is better

Byte

s co

pied

per

nan

osec

ond

1.5

3

4.5

6

String length in bytes

1 2 4 8 16 32 64 128 256 512 1024

Scalar<simd/simd.h>Libc

Page 154: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

PerformanceHigher is better

Byte

s co

pied

per

nan

osec

ond

1.5

3

4.5

6

String length in bytes

1 2 4 8 16 32 64 128 256 512 1024

Scalar<simd/simd.h>Libc

Page 155: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

PerformanceHigher is better

Byte

s co

pied

per

nan

osec

ond

1.5

3

4.5

6

String length in bytes

1 2 4 8 16 32 64 128 256 512 1024

Scalar<simd/simd.h>Libc

Page 156: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Summary

Simple interfaces for complex operations

New libraries still have a few rough edges

Let us know what use cases matter to you, and what additional features you need

Page 157: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

More Information

Paul Danbold Core OS Technology Evangelist [email protected]

George Warner DTS Sr. Support Scientist [email protected]

Documentation vImage Programming Guide http://developer.apple.com/library/mac/#documentation/Performance/Conceptual/vImage/Introduction/Introduction.html

Page 158: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

More Information

Documentation vDSP Programming Guide http://developer.apple.com/library/mac/#documentation/Performance/Conceptual/vDSP_Programming_Guide/Introduction/Introduction.html

vImage Headers /System/Library/Frameworks/Accelerate.framework/Frameworks/vImage.framework/Headers/vImage.h

vDSP Headers /System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vDSP.h

Page 159: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

More Information

Documentation LinearAlgebra Headers /System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/LinearAlgebra/LinearAlgebra.h

<simd/simd.h> /usr/include/simd/simd.h

Apple Developer Forums http://devforums.apple.com

Bug Report http://bugreport.apple.com

Page 160: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Related Sessions

• Working with Metal: Overview Pacific Heights Wednesday 9:00AM

• Working with Metal: Fundamentals Pacific Heights Wednesday 10:15AM

• Working with Metal: Advanced Pacific Heights Wednesday 11:30AM

Page 161: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •

Labs

• Accelerate Lab Core OS Lab B Tuesday 11:30AM

• Metal Lab Graphics and Games Lab A Wednesday 2:00PM

Page 162: What’s New in the Accelerate Framework€¦ · OS X and iOS All generations of hardware. Session Goals New features in vImage Introduce • LinearAlgebra •