Hands-On with Google’s Machine Learning APIs, 1/18/2017

Stephen Wylie

@SWebCEO

+StephenWylie

mrcity

About me

WEBSITES: www.stev-o.us, goshtastic.blogspot.com, www.ledgoes.com, www.openbrite.com

[email protected]@openbrite.com

G+: https://plus.google.com/u/1/+StephenWylie

TWITTER: @SWebCEO

GITHUB: mrcity

Senior Software Engineer at Capital One

Test/QA Lead for Auto Finance Innovation

Successful Kickstarter (1000% funded - BriteBlox)

Intel Innovator, DMS Member, Vintage computer collector/restorer/homebrewer, hackathoner

Civic Hacking Cmte Chair @ DMS

@SWebCEO +StephenWylie #MachineLearning


Tonight’s Mission

Touch on ML API offerings

Explore Google’s RESTful ML APIs

Cloud Vision & Natural Language API

Prediction API

TensorFlow (Not a RESTful API but still cool)

Allocate time to play, develop ideas

Have good conversations, network

Find learning partner or group


Before We Start…

Hopefully you followed instructions on

https://github.com/mrcity/mlworkshop/

Get access to the APIs

Install TensorFlow



What is Machine Learning?

Programming computers to deduce things from data…

Conclusions

Patterns

Objects in images

…using generic mathematical methods

No advance knowledge of trends in data

Lots of algorithms available

The process can create beautiful constructs


Who’s using ML?

Chat bots

Self-driving cars

Pepper the robot

MarI/O

Document recognition & field extraction


https://www.youtube.com/watch?v=qv6UVOQ0F44

Machine Learning Tools Ecosystem

APIs you interface with

HP, Amazon, Microsoft, IBM, Google, Facebook’s Caffe on mobile & Web

Software you use

Orange (U of Ljubljana, Slovenia)

Weka (U of Waikato, New Zealand)

Hardware you compile programs to run on

nVidia GPUs with CUDA, DGX-1 supercomputer

Can BTC hardware be used for ML?


Google’s ML APIs In Particular

Google Play Services

Mobile Vision API

RESTful ML Services

Cloud Vision API

Cloud Natural Language API

Prediction API

Local ML Services

TensorFlow

SyntaxNet


^ Pre-defined models

v User-defined models

Cloud Vision API

What does it look like to you?


Detect Faces, Parse Barcodes, Segment Text


Availability

Native Android

Native iOS

RESTful API

FACE API, BARCODE API, TEXT API

What do you see in that cloud?

Breaks down into more features than just FACE, BARCODE, and TEXT:

From https://cloud.google.com/vision/docs/requests-and-responses


Feature Type            Description

LABEL_DETECTION         Execute Image Content Analysis on the entire image and return

TEXT_DETECTION          Perform Optical Character Recognition (OCR) on text within the image

FACE_DETECTION          Detect faces within the image

LANDMARK_DETECTION      Detect geographic landmarks within the image

LOGO_DETECTION          Detect company logos within the image

SAFE_SEARCH_DETECTION   Determine image safe search properties on the image

IMAGE_PROPERTIES        Compute a set of properties about the image (such as the image's dominant colors)


Cloud Vision APIs

Can simultaneously detect multiple features

Features billed individually per use on image

No Barcode feature as yet

Simple JSON request/response format

Submit image from Cloud Storage or in Base64

Returns 0 or more annotations by confidence
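As a rough sketch of the JSON request format described on this slide, the snippet below builds a request body for one Base64-encoded image with several features at once. The helper name `build_annotate_request`, the `max_results` default, and the placeholder image bytes are assumptions made for illustration; nothing is sent over the network.

```python
import base64
import json


def build_annotate_request(image_bytes, feature_types, max_results=5):
    """Build a request body for one image, with one entry per requested feature."""
    return {
        "requests": [
            {
                # The image travels inline as Base64 text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                # Each feature listed here is billed individually.
                "features": [
                    {"type": t, "maxResults": max_results} for t in feature_types
                ],
            }
        ]
    }


# Hypothetical stand-in for bytes read from a real image file.
fake_image = b"\x89PNG placeholder bytes"
body = build_annotate_request(fake_image, ["LABEL_DETECTION", "FACE_DETECTION"])
print(json.dumps(body, indent=2))
```

For an image already in Cloud Storage, the `image` entry would carry a `source` with a `gcsImageUri` instead of inline `content`.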


For Your Eyes Only

No OAuth required for Cloud Vision

Make requests using API Key

POST https://vision.googleapis.com/v1/images:annotate?key={YOUR_API_KEY}

Easy to script using Service Account
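A minimal sketch of attaching the API key to the endpoint shown above, using only the standard library. The function name `build_vision_request` is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request

ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_vision_request(api_key, body):
    """Prepare (but do not send) a POST request authenticated by API key."""
    url = ANNOTATE_URL + "?" + urllib.parse.urlencode({"key": api_key})
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_vision_request("YOUR_API_KEY", {"requests": []})
print(req.get_method(), req.full_url)
# With a real key and body, sending would look like:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```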


Response types


Feature            Returns

Label              Description of the picture’s contents
                   Confidence score

Text, Logo         Text contents or logo owner name
                   Bounding polygon containing the text or logo
                   [Logo only] Confidence score

Face               Bounding polygon and rotational characteristics of the face
                   Positions of various characteristics such as eyes, ears, lips, chin, forehead
                   Confidence score of exhibiting joy, sorrow, anger, or surprise

Landmark           Description of the landmark and confidence score
                   Bounding polygon of the recognized landmark in the picture

Safe Search        Likelihood of the image containing adult or violent content, that it was a spoof, or contains graphic medical imagery

Image properties   Dominant RGB colors within the image, ordered by fraction of pixels
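To make the label response concrete, here is a small sketch that pulls descriptions and confidence scores out of a response and keeps only the confident ones. The sample response is invented for illustration, and the `min_score` cutoff and the helper name `labels_above` are assumptions.

```python
def labels_above(response, min_score=0.5):
    """Return (description, score) pairs at or above min_score, best first."""
    results = []
    for image_result in response.get("responses", []):
        # An image with no matches simply has no labelAnnotations key.
        for annotation in image_result.get("labelAnnotations", []):
            if annotation.get("score", 0.0) >= min_score:
                results.append((annotation["description"], annotation["score"]))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Invented sample data shaped like a label detection response.
sample_response = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "cloud", "score": 0.97},
                {"description": "sky", "score": 0.93},
                {"description": "kite", "score": 0.41},
            ]
        }
    ]
}
print(labels_above(sample_response))
```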

Demo


Mobile Vision vs. Cloud Vision

Mobile Vision is for Native Android

Free; no usage quotas

Handles more data processing

Can utilize camera video

Takes advantage of hardware


Cloud Natural Language API

Making computers speak human


Natural Language API: Analyze Any ASCII

Parses text for parts of speech

Discovers entities like organizations, people, locations

Analyzes text sentiment

Use Speech, Vision, Translate APIs upstream

Works with English, Spanish, or Japanese

Sentiment analysis only available for English


Sample NL API Request


From https://cloud.google.com/natural-language/docs/basics

Callouts on the sample request (the request body itself was an image on the slide):

<- Optional, can be guessed automatically

<- Not required for Sentiment Analysis queries

<- Optional, defaults to Entities
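Since the request body itself is not reproduced here, the following is only a sketch of what such a body might look like. It assumes the first callout refers to the document's `language` field and the second to `encodingType`; the third callout is left unmodeled. The helper name `build_nl_request` and its `analysis` parameter are hypothetical.

```python
import json


def build_nl_request(text, analysis="sentiment", language=None):
    """Build a Natural Language API request body for plain text."""
    document = {"type": "PLAIN_TEXT", "content": text}
    if language is not None:
        # Optional: the service can guess the language when it is omitted.
        document["language"] = language
    body = {"document": document}
    if analysis != "sentiment":
        # Assumed here: offsets only matter for entity and syntax queries.
        body["encodingType"] = "UTF8"
    return body


sentiment_body = build_nl_request("Four score and seven years ago")
entity_body = build_nl_request("Lincoln spoke at Gettysburg", "entities", "en")
print(json.dumps(sentiment_body))
print(json.dumps(entity_body))
```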


Interpreting NL API Sentiment Responses


POLARITY: ranges from -1 to 1

MAGNITUDE: ranges from 0 to ∞ (scale marked at 1, 10, 10^2, 10^3)

Sample analyzeSentiment response for the Gettysburg Address:

{"polarity": 0.4, "magnitude": 3.8}
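One way to read the two numbers together is sketched below: polarity gives the direction of the sentiment and magnitude gives how much emotional content there is overall. The cutoff values 0.25 and 1.0 and the category names are assumptions chosen for illustration, not thresholds published by Google.

```python
def describe_sentiment(polarity, magnitude, polarity_cutoff=0.25, magnitude_cutoff=1.0):
    """Turn a (polarity, magnitude) pair into a rough (tone, strength) reading."""
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must lie in [-1, 1]")
    if magnitude < 0:
        raise ValueError("magnitude must be non-negative")
    if polarity >= polarity_cutoff:
        tone = "positive"
    elif polarity <= -polarity_cutoff:
        tone = "negative"
    elif magnitude >= magnitude_cutoff:
        # Near-zero polarity with lots of emotion suggests offsetting sentiments.
        tone = "mixed"
    else:
        tone = "neutral"
    strength = "strong" if magnitude >= magnitude_cutoff else "weak"
    return tone, strength


gettysburg = {"polarity": 0.4, "magnitude": 3.8}
print(describe_sentiment(**gettysburg))
```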

Demo


Google Prediction API

To further their conquest for all knowledge past, present, and future


Making Predictions With Google

Build “trained” model or use “hosted” model

Hosted models (all demos):

Language identifier

Tag categorizer (as android, appengine, chrome, youtube)

Sentiment predictor

Trained models:

Submit attributes and labels for each example

Need at least six examples

Store examples in Cloud Storage
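A minimal sketch of preparing a training file before uploading it to Cloud Storage. It assumes a CSV layout with the label in the first column followed by the attributes, with no header row; check the Prediction API documentation before relying on that layout. The helper name `make_training_csv` and the sample rows are made up.

```python
import csv
import io

MIN_EXAMPLES = 6  # the slide's stated minimum


def make_training_csv(examples):
    """Serialize (label, attributes) pairs as CSV text, label first."""
    if len(examples) < MIN_EXAMPLES:
        raise ValueError("need at least %d examples" % MIN_EXAMPLES)
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for label, attributes in examples:
        writer.writerow([label] + list(attributes))
    return buffer.getvalue()


# Invented examples for a tiny language identifier.
examples = [
    ("English", ["hello world"]),
    ("Spanish", ["hola mundo"]),
    ("English", ["good morning"]),
    ("Spanish", ["buenos dias"]),
    ("English", ["thank you"]),
    ("Spanish", ["muchas gracias"]),
]
csv_text = make_training_csv(examples)
print(csv_text)
```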


Don’t Model Trains; Train Your Model

Train API against data: prediction.trainedmodels.insert

Send prediction query: prediction.trainedmodels.predict

Update the model: prediction.trainedmodels.update

Other CRUD operations: list, get, delete


Don’t Model Trains; Train Your Model

Insert query requires:

id

modelType

storageDataLocation

Don’t forget: poll for status updates
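The insert-then-poll flow can be sketched as follows. The three body fields come from the slide; the status strings "RUNNING" and "DONE", the polling defaults, and both helper names are assumptions. The status lookup is injected as a plain function so the sketch runs without credentials; in practice it would wrap a trainedmodels.get call.

```python
import time


def build_insert_body(model_id, storage_data_location, model_type="classification"):
    """Assemble the three fields the insert query requires."""
    return {
        "id": model_id,
        "modelType": model_type,
        "storageDataLocation": storage_data_location,
    }


def wait_for_training(get_status, interval_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Call get_status() until it stops reporting RUNNING, then return the status."""
    for _ in range(max_polls):
        status = get_status()
        if status != "RUNNING":
            return status
        sleep(interval_seconds)
    raise TimeoutError("model still training after polling limit")


# Hypothetical bucket path and a fake status source standing in for the API.
insert_body = build_insert_body("language-id", "my-bucket/training.csv")
fake_statuses = iter(["RUNNING", "RUNNING", "DONE"])
final_status = wait_for_training(lambda: next(fake_statuses), sleep=lambda s: None)
print(insert_body, final_status)
```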


Permissions To Make Predictions

OAuth is required for Predictions

Easy to script using Service Account

Or, get Web app credentials: https://console.developers.google.com/apis/credentials



Demo


TensorFlow

All that Linear Algebra you slept through in college


About TensorFlow

Offline library for large-scale numerical computation

Think of a graph:

Nodes represent mathematical operations

Edges represent tensors flowing between them

Excellent at building deep neural networks


$$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}$$

$$\mathrm{ReLU}: f(x) = \max(0, x)$$
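The two functions above, softmax and ReLU, can be written in a few lines of plain Python. This is a sketch for intuition only; subtracting the maximum before exponentiating is a common numerical-stability trick added here and does not change the result.

```python
import math


def softmax(xs):
    """Map a list of scores to probabilities that sum to 1."""
    shift = max(xs)  # avoids overflow in exp for large scores
    exps = [math.exp(x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def relu(x):
    """Pass positive values through and clamp negatives to zero."""
    return max(0.0, x)


probs = softmax([1.0, 2.0, 3.0])
print(probs, sum(probs))
print(relu(-2.0), relu(3.5))
```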

Tense About Tensors?

Think about MNIST handwritten digits

Each digit image is 28 x 28 pixels (784 values when flattened)

There are 10 numbers, 0-9


Tense About Tensors?

Define an input tensor of shape (any batch size, 784)

x = tf.placeholder(tf.float32, shape=[None, 784])

Define a target output tensor of shape (any batch size, 10)

y_ = tf.placeholder(tf.float32, shape=[None, 10])

Define weights matrix (784x10) and biases vector (10-D)

One-Hot: Cool To the Touch

Load the input data

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

One-hot?!

Think about encoding categorical features:

US = 0, UK = 1, India = 2, Canada = 3, …

This implies ordinal properties and confuses learners

Break encoding into Booleans:

US = [1, 0, 0, 0]

UK = [0, 1, 0, 0]

Etc…

This is where the 10-D target output tensor comes from
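A short sketch of one-hot encoding in plain Python, using the country list from this slide. The helper names `one_hot` and `encode_country` are illustrative only.

```python
def one_hot(index, num_classes):
    """Return a list of zeros with a single 1 at position index."""
    if not 0 <= index < num_classes:
        raise ValueError("index out of range")
    return [1 if i == index else 0 for i in range(num_classes)]


COUNTRIES = ["US", "UK", "India", "Canada"]


def encode_country(name):
    """Encode a country name without implying any ordering between countries."""
    return one_hot(COUNTRIES.index(name), len(COUNTRIES))


print(encode_country("US"))
print(encode_country("UK"))
# The same idea gives a 10-D target vector for a handwritten digit such as 4.
print(one_hot(4, 10))
```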

TensorFlow Data Structures - Placeholders

Come from inputs prior to computation

x (input picture as vector), y_ (one-hot 10-D classification vector)


Input x:

[0, 0, 0, 0, 0, 0, 0, 0, …
 0, 0, 0, 0, 0, 1, 1, 0, …
 0, 0, 0, 0, 1, 1, 1, 0, …
 0, 0, 0, 0, 1, 1, 0, 0, …
 0, 0, 0, 1, 1, 1, 0, 0, …
 ...

y_:

[0, 0, 0, 0, 1, 0, 0, 0, 0, 0]

TensorFlow Data Structures - Variables

Values (i.e. model parameters) inside nodes

Used and modified by learning process

Need to be initialized with

W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
init = tf.global_variables_initializer()

W (weights to scale inputs by), b (bias to add to scaled value)
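To see what "scale by W, then add b" means without TensorFlow, here is a pure-Python sketch of the same computation on a tiny example (3 inputs and 2 outputs instead of 784 and 10). The function name `affine` and the numbers are made up for illustration.

```python
def affine(x, W, b):
    """Compute x @ W + b for one input vector x, an n-by-k matrix W, and a length-k bias b."""
    n, k = len(W), len(b)
    if len(x) != n or any(len(row) != k for row in W):
        raise ValueError("shape mismatch")
    return [sum(x[i] * W[i][j] for i in range(n)) + b[j] for j in range(k)]


x = [1.0, 2.0, 3.0]
W = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
b = [0.1, -0.1]
scores = affine(x, W, b)
print(scores)

# With all-zero parameters, as at initialization, every score starts at zero.
zero_scores = affine(x, [[0.0, 0.0]] * 3, [0.0, 0.0])
print(zero_scores)
```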


Training a Dragon, if the Dragon is a Model

Your Simple Model:

y = tf.matmul(x, W) + b

Cross-entropy: distance between guess & correct answer

cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))

Gradient descent: minimize cross-entropy

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

Learning rate: 0.5


$$H_{y'}(y) = -\sum_i y'_i \log(y_i)$$
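The cross-entropy loss and a gradient descent step can be sketched in plain Python. As a simplification, this example updates the raw scores (logits) directly rather than a weights matrix and bias, using the fact that the gradient of the loss with respect to the logits is the softmax output minus the one-hot target. The learning rate of 0.5 matches the slide; everything else is illustrative.

```python
import math


def softmax(logits):
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted):
    """H_{y'}(y) = -sum_i y'_i * log(y_i), skipping terms where y'_i is 0."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


def gradient_step(logits, target, learning_rate=0.5):
    """Move each logit against its gradient, which is softmax(logits) - target."""
    probs = softmax(logits)
    return [z - learning_rate * (p - t) for z, p, t in zip(logits, probs, target)]


target = [0, 0, 1]          # one-hot: the correct class is index 2
logits = [0.0, 0.0, 0.0]    # start with no preference between classes
losses = []
for _ in range(5):
    losses.append(cross_entropy(target, softmax(logits)))
    logits = gradient_step(logits, target)
print(losses)
```

The printed losses shrink at every step, which is the behavior the optimizer is trying to produce across the whole training set.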

Dragon Get

Wiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiings! Start a Session

Run global_variables_initializer

Run training for 1000 steps

sess = tf.Session()
sess.run(init)
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # A plain tf.Session() is not the default session, so run the op through sess
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

Expensive to use all training data at once!

Pick 100 random samples each step
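The idea of training on a small random batch can be sketched without TensorFlow. This `next_batch` is a simplified stand-in written for illustration, not the MNIST helper's actual implementation; the toy data is invented.

```python
import random


def next_batch(inputs, labels, batch_size, rng=random):
    """Pick batch_size distinct examples at random, keeping inputs and labels paired."""
    if len(inputs) != len(labels):
        raise ValueError("inputs and labels must have the same length")
    indices = rng.sample(range(len(inputs)), batch_size)
    return [inputs[i] for i in indices], [labels[i] for i in indices]


# Toy dataset: 1000 one-element inputs whose label is the input modulo 10.
rng = random.Random(0)
inputs = [[float(i)] for i in range(1000)]
labels = [i % 10 for i in range(1000)]
batch_xs, batch_ys = next_batch(inputs, labels, 100, rng)
print(len(batch_xs), len(batch_ys))
```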


Test Flight Evaluation

Compare labels between guess y and correct y_

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

Cast each Boolean result into either a 0 or 1, then average it

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

Print the final figure

print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
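The same evaluation logic (argmax, compare, cast to 0 or 1, average) can be sketched in plain Python. The predictions and targets below are invented to keep the example small.

```python
def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, targets):
    """Fraction of rows where the predicted class matches the one-hot target."""
    matches = [
        1.0 if argmax(p) == argmax(t) else 0.0
        for p, t in zip(predictions, targets)
    ]
    return sum(matches) / len(matches)


predictions = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
targets = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
]
print(accuracy(predictions, targets))
```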


Demo


Future Talks, Other Talks

Follow me if you want to hear these!

Build a Neural Network in Python with NumPy

Build a Neural Network with nVidia CUDA

Elsewhere,

Mapmaking with Google Maps API, Polymer, and Firebase

The Process Of Arcade Game ROM Hacking


More Resources

Google’s “Googly Eyes” Android app [Mobile Vision API]: https://github.com/googlesamples/android-vision/tree/master/visionSamples/googly-eyes

Quick, Draw! Google classification API for sketches: https://quickdraw.withgoogle.com/

Making Android Apps With Intelligence, by Margaret Maynard-Reid (Video + slides): https://realm.io/news/360andev-margaret-maynard-reid-making-android-apps-with-intelligence/



Thank You