H2O 3 REST API Overview
Raymond Peck, Director of Product Engineering, H2O.ai
© H2O.ai, 2015


Long version of this content is here:

https://github.com/h2oai/h2o-3/blob/master/h2o-docs/src/api/REST/h2o_3_rest_api_overview.md

Why?

• Use the REST API to drive H2O from an external script or program in any language.
• Use the REST API when you want API stability.
• Use the Java API if you want to call the internal APIs from Java, Scala, etc.

Who?

• Software developers proficient in a scripting or programming language.
• Those familiar with nested data representations like JSON.
• Those familiar with the functionality of H2O, at least well enough to convert a Flow, R or Python script from a data scientist.

What?

Any H2O functionality in Flow, R or Python can be accessed via the REST API:

• data import
• model building
• model comparison
• generating predictions
• admin functions

How?

You can call the REST API:

• from your browser
• using browser tools such as Postman in Chrome
• using curl
• using the language of your choice

Bindings

For Python and R, simply use the supplied packages.

For JVM clients:

• H2O currently ships with REST API payload POJOs.
• We're working on endpoint proxies.
• These are generated as part of the build using a Python script.

We'll work with you to generate bindings for other languages. One user easily generated C# bindings.

Versioning and Stability, Part 1

• Current version is 3.
• Non-breaking changes are allowed; examples:
  • adding output fields
  • adding parameters with defaults that maintain old behavior
• Well-written clients should not break as functionality is added to version 3.
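A minimal sketch of what "well-written client" can mean in practice: read only the fields you need and ignore the rest, so an added output field changes nothing. The function name and the `new_field` key are hypothetical; `rows` and `total_column_count` are field names that appear in the prediction result later in this deck.

```python
def summarize_frame(payload):
    """Read only the fields this client needs and ignore any others."""
    return {
        "rows": payload.get("rows"),
        "columns": payload.get("total_column_count"),
    }


old_response = {"rows": 452, "total_column_count": 1}
# A later release may add output fields; the hypothetical "new_field" stands in for one.
new_response = dict(old_response, new_field="added later")

# Both responses produce the same summary, so the client keeps working.
print(summarize_frame(old_response) == summarize_frame(new_response))
```

A client that instead rejected unknown keys, or unpacked the payload positionally, would break on the same change.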

Versioning and Stability, Part 2

• Backward compatibility is tested with each release, including nightlies.
• Functionality under development is version 99.
• /99 endpoints can be called via /EXPERIMENTAL.

URLs

    http://your_server:54321/version/Resource{/...}

Examples:

• /3/Frames
• /3/Frames/my_frame
• /3/Frames/my_frame/summary
• /3/Models
• /3/Models/my_model
• /3/Cloud
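The URL pattern above can be sketched as a small helper. The function name and its defaults are assumptions for illustration; only the server/version/Resource layout comes from the slide. Names are joined as-is, so a client handling names with special characters would need to add encoding.

```python
def h2o_url(resource, *parts, version=3, server="http://your_server:54321"):
    """Build server/version/Resource{/...} from its pieces."""
    return "/".join([server.rstrip("/"), str(version), resource, *parts])


print(h2o_url("Frames"))
print(h2o_url("Frames", "my_frame", "summary"))
# Endpoints under development live under version 99.
print(h2o_url("Models.bin", "my_model", version=99))
```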

HTTP Verbs

• GET requests fetch data and do not cause side effects.

    GET /3/Frames/my_frame_name?row_offset=10000&row_count=1000

• POST requests create a new object. Their input uses the application/x-www-form-urlencoded format.
• DELETE requests delete an object.
• HEAD requests return just the HTTP status.
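As a rough illustration of the GET and POST conventions, the sketch below builds (but does not send) requests with Python's standard library. `BASE`, `get_request` and `post_request` are hypothetical names, and the server address is an assumption.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://localhost:54321"  # assumed server address


def get_request(path, **params):
    """Build a GET; its parameters travel in the query string."""
    query = "?" + urlencode(params) if params else ""
    return Request(BASE + path + query, method="GET")


def post_request(path, **fields):
    """Build a POST whose body is form-encoded."""
    body = urlencode(fields).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(BASE + path, data=body, headers=headers, method="POST")


req = get_request("/3/Frames/my_frame_name", row_offset=10000, row_count=1000)
print(req.get_method(), req.full_url)
```

Passing either object to `urllib.request.urlopen` would send it to a running server.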

HTTP Status Codes

• 200 OK (all is well)
• 400 Bad Request (the request URL is bad)
• 404 Not Found (a specified object was not found)
• 412 Precondition Failed (bad parameters or other problem handling the request)
• 500 Internal Server Error (unanticipated failure)

Schemas, Part 1

Schemas define input and output formats.

Schema fields can be simple values or nested schemas, or arrays or dictionaries (maps) of these.

Schemas, Part 2

Each schema field has metadata:

• type
• default value
• help string
• direction (in, out or inout)
• required
• importance
• allowed values for enumerated fields

    {
      "__meta": {
        "schema_name": "ModelParameterSchemaV3",
        "schema_type": "Iced",
        "schema_version": 3
      },
      "actual_value": {
        "URL": "/3/Models/prostate_glm",
        "__meta": {
          "schema_name": "ModelKeyV3",
          "schema_type": "Key<Model>",
          "schema_version": 3
        },
        "name": "prostate_glm",
        "type": "Key<Model>"
      },
      "default_value": null,
      "help": "Destination id for this model; auto-generated if not specified",
      "label": "model_id",
      "level": "critical",
      "name": "model_id",
      "required": false,
      "type": "Key<Model>",
      "values": []
    },
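Because field metadata arrives as plain JSON, a client can turn it into help text or input validation. The sketch below does the former for a trimmed copy of the `model_id` description shown above; `summarize_field` is a hypothetical helper, not part of any H2O package.

```python
import json

# Trimmed copy of the field description from the slide.
FIELD_JSON = """
{
  "name": "model_id",
  "type": "Key<Model>",
  "required": false,
  "level": "critical",
  "default_value": null,
  "help": "Destination id for this model; auto-generated if not specified",
  "values": []
}
"""


def summarize_field(field):
    """Render one field description as a single line of help text."""
    flag = "required" if field["required"] else "optional"
    line = f'{field["name"]} ({field["type"]}, {flag}, {field["level"]}): {field["help"]}'
    if field["values"]:
        # Enumerated fields list their allowed values.
        line += " Allowed: " + ", ".join(field["values"])
    return line


print(summarize_field(json.loads(FIELD_JSON)))
```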

Error Condition Payloads

• return a non-2xx HTTP status code
• return standardized error payloads:
  • end-user message
  • developer message
  • HTTP status
  • optional dictionary of relevant values
  • exception information if applicable

Example Error

    {
      "__meta": { "schema_type": "H2OError", ... },
      "timestamp": 1438634936808,
      "error_url": "/3/Frames/missing_frame",
      "msg": "Object 'missing_frame' not found for argument: key",
      "dev_msg": "Object 'missing_frame' not found for argument: key",
      "http_status": 404,
      "values": { "argument": "key", "name": "missing_frame" },
      "exception_type": "water.exceptions.H2OKeyNotFoundArgumentException",
      "exception_msg": "Object 'missing_frame' not found for argument: key",
      "stacktrace": [ ... ]
    }
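One way a client might use the standardized payload is to convert it into an exception. This is a sketch under assumptions: `RestApiError` and `check_response` are hypothetical names, and only the payload keys (`msg`, `dev_msg`, `http_status`, `values`, `exception_type`) come from the example above.

```python
import json


class RestApiError(Exception):
    """Hypothetical client-side exception; not part of any H2O package."""

    def __init__(self, payload):
        self.http_status = payload.get("http_status")
        self.dev_msg = payload.get("dev_msg")
        self.values = payload.get("values") or {}
        self.exception_type = payload.get("exception_type")
        # The end-user message becomes the exception text.
        super().__init__(payload.get("msg", "unknown error"))


def check_response(status, body_text):
    """Return the decoded body for 2xx statuses, raise RestApiError otherwise."""
    payload = json.loads(body_text)
    if 200 <= status < 300:
        return payload
    raise RestApiError(payload)
```

Callers can then show `str(error)` to end users and log `error.dev_msg` and `error.values` for developers.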

Example Endpoints

For the complete list, check the reference docs or /Metadata/endpoints. As of August 6, 2015 there are 105 endpoints. The examples that follow cover:

• Loading and parsing data files
• Frames and Models
• Administrative and utility
• Job management and polling
• Persistence

Loading and parsing data files

GET /3/ImportFiles - Import raw data files into a single-column H2O Frame.

POST /3/ParseSetup - Guess the parameters for parsing raw byte-oriented data into an H2O Frame.

POST /3/Parse - Parse a raw byte-oriented Frame into a useful columnar data Frame.

Frames

GET /3/Frames - Return all Frames in the H2O distributed K/V store.

GET /3/Frames/(?.*) - Return the specified Frame.

GET /3/Frames/(?.*)/summary - Return a Frame, including the histograms, after forcing computation of rollups.

GET /3/Frames/(?.*)/columns/(?.*)/summary - Return the summary metrics for a column, e.g. mins, maxes, mean, sigma and percentiles.

DELETE /3/Frames/(?.*)

DELETE /3/Frames

Building models

GET /3/ModelBuilders - Return the Model Builder metadata for all available algorithms.

GET /3/ModelBuilders/(?.*) - Return the Model Builder metadata for the specified algorithm.

POST /3/ModelBuilders/deeplearning/parameters - Validate a set of Deep Learning model builder parameters.

POST /3/ModelBuilders/deeplearning - Train a Deep Learning model on the specified Frame.

Accessing and using models

GET /3/Models - Return all Models from the H2O distributed K/V store.

GET /3/Models/(?.*?)(\.java)? - Return the specified Model. Use the .java extension to get the Java POJO.

POST /3/Predictions/models/(?.*)/frames/(?.*) - Generate predictions for the specified Frame and Model.

DELETE /3/Models/(?.*)

DELETE /3/Models

Administrative and utility

GET /3/About - Return information about this H2O cluster.

GET /3/Cloud - Determine the status of the nodes in the H2O cloud.

HEAD /3/Cloud - Determine the status of the nodes in the H2O cloud.

Job management and polling

GET /3/Jobs - Get a list of all the H2O Jobs (long-running actions).

GET /3/Jobs/(?.*) - Get the status of the given H2O Job (long-running action).

POST /3/Jobs/(?.*)/cancel - Cancel a running job.

Persistence

POST /3/Frames/(?.*)/export - Export a Frame to the given path with optional overwrite.

POST /99/Models.bin/(?.*) - Import the given binary model into H2O.

GET /99/Models.bin/(?.*) - Export the given model.

Example workflows using curl

Some fields have been omitted for brevity.

When using curl you can pipe (|) the output through python -m json.tool to pretty-print the JSON:

    curl -X GET http://localhost:54321/3/Frames | python -m json.tool
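When the response is already in a Python script, similar pretty-printing is available from the `json` module directly. A minimal sketch; `pretty` is a hypothetical helper name.

```python
import json


def pretty(body_text):
    """Re-serialize a JSON document with four-space indentation."""
    return json.dumps(json.loads(body_text), indent=4)


print(pretty('{"frames": [], "row_count": 0}'))
```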

GBM_Example.flow, Step 1: Import

In Flow:

    importFiles ["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]

In curl (the URL is quoted so the shell leaves the ? alone):

    curl -X GET 'http://127.0.0.1:54321/3/ImportFiles?path=http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz'
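The same import step can be sketched in Python. The helper names are hypothetical, and percent-encoding the `path` value is a precaution added here rather than something the slide shows. `import_files` needs a running H2O server at the assumed address, so only the URL builder is exercised.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

DATA_URL = (
    "http://s3.amazonaws.com/h2o-public-test-data/"
    "smalldata/flow_examples/arrhythmia.csv.gz"
)


def import_files_url(path, server="http://127.0.0.1:54321"):
    """Percent-encode the path so its own ':' and '/' cannot be misread."""
    return server + "/3/ImportFiles?" + urlencode({"path": path})


def import_files(path):
    """Send the request; needs a running H2O server, so it is not called here."""
    with urlopen(import_files_url(path)) as response:
        return json.load(response)["destination_frames"]


print(import_files_url(DATA_URL))
```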

GBM_Example.flow, Step 1 Result

    {
      "__meta": {
        "schema_name": "ImportFilesV3",
        "schema_type": "Iced",
        "schema_version": 3
      },
      "destination_frames": [
        "http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"
      ],
      "fails": [],
      "files": [
        "http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"
      ],
      "path": "http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"
    }

GBM_Example.flow, Step 2: ParseSetup

In Flow:

    setupParse paths: ["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]

In curl:

    curl -X POST http://127.0.0.1:54321/3/ParseSetup --data \
        'source_frames=["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]'

GBM_Example.flow, Step 2 Result

    {
      "source_frames": [
        {
          "URL": "\/3\/Frames\/http:\/\/s3.amazonaws.com\/h2o-public-test-data\/smalldata\/flow_examples\/arrhythmia.csv.gz"
        }
      ],
      "parse_type": "CSV",
      "separator": 44,
      "column_names": null,
      "column_types": [ "Numeric", "Numeric", ... ],
      "destination_frame": "arrhythmia.hex",
      "header_lines": 0,
      "number_columns": 280,
      "data": [
        [ "75", "0", "190", ... ],
        ...
      ]

GBM_Example.flow, Step 3: Parse

In Flow:

    parseFiles
      paths: ["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]
      destination_frame: "arrhythmia.hex"
      parse_type: "CSV"
      separator: 44
      number_columns: 280
      single_quotes: false
      column_names: null
      column_types: ["Numeric","Numeric",...,"Numeric"]
      delete_on_done: true
      check_header: -1
      chunk_size: 4194304

GBM_Example.flow, Step 3: Parse

In curl (repeated --data options are joined with & by curl):

    curl -X POST http://127.0.0.1:54321/3/Parse \
        --data 'destination_frame=arrhythmia.hex' \
        --data 'source_frames=["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]' \
        --data 'parse_type=CSV' \
        --data 'separator=44' \
        --data 'number_columns=280' \
        --data 'single_quotes=false' \
        --data 'column_names=' \
        --data 'column_types=["Numeric"...,"Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric"]' \
        --data 'check_header=-1' \
        --data 'delete_on_done=true' \
        --data 'chunk_size=4194304'
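Most of the Parse parameters repeat values that ParseSetup returned, so a script can carry them over instead of retyping them. The sketch below is one way to do that; `parse_form_body`, the `COPIED_FIELDS` selection and the tiny two-column `setup` dictionary are assumptions for illustration, not the full field list.

```python
import json
from urllib.parse import parse_qs, urlencode

# ParseSetup result fields reused in the Parse request (assumed subset).
COPIED_FIELDS = (
    "destination_frame",
    "parse_type",
    "separator",
    "number_columns",
    "column_types",
)


def parse_form_body(setup, source_frames, **extra):
    """Build the form-encoded body for POST /3/Parse from a ParseSetup result."""
    fields = {"source_frames": json.dumps(source_frames, separators=(",", ":"))}
    for name in COPIED_FIELDS:
        value = setup[name]
        if isinstance(value, list):
            # Arrays are sent as compact JSON text, as in the curl example.
            value = json.dumps(value, separators=(",", ":"))
        fields[name] = value
    fields.update(extra)
    return urlencode(fields)


setup = {
    "destination_frame": "arrhythmia.hex",
    "parse_type": "CSV",
    "separator": 44,
    "number_columns": 2,
    "column_types": ["Numeric", "Numeric"],
}
body = parse_form_body(setup, ["a.csv"], check_header=-1, delete_on_done="true")
print(parse_qs(body)["column_types"])
```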

GBM_Example.flow, Step 3 Result

    {
      "job": {
        "key": {
          "URL": "\/3\/Jobs\/$03010a010a7f32d4ffffffff$_b98fc5bba38d21ea53da2a0834c44f7a"
        },
        "description": "Parse",
        "status": "RUNNING",
        "progress_msg": "Ingesting files.",
        "dest": { "URL": "\/3\/Frames\/arrhythmia.hex" },
        "exception": null,
        "messages": [ ],
        "error_count": 0
      },
      ...
    }

GBM_Example.flow, Step 4: Poll for job completion

Flow polls for Job completion automagically:

GBM_Example.flow, Step 4: Result

    "jobs": [
      {
        "key": {
          "URL": "\/3\/Jobs\/$03010a010a7f32d4ffffffff$_b98fc5bba38d21ea53da2a0834c44f7a"
        },
        "description": "Parse",
        "status": "RUNNING",
        "progress_msg": "Ingesting files.",
        "dest": {
          "name": "arrhythmia.hex",
          "URL": "\/3\/Frames\/arrhythmia.hex"
        },
        "error_count": 0,
        "exception": null,
        "messages": [],
      }
    ]
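A script has to do the polling that Flow does for you. The loop below is a sketch under assumptions: `fetch_job` is a caller-supplied function (hypothetical) that GETs the job URL and returns one job dictionary like the entry above, and a job counts as unfinished while its status is in `active`. Only the `RUNNING` value is taken from the slides; a real client may need to add other in-progress states.

```python
import time


def wait_for_job(fetch_job, interval=0.5, timeout=60.0,
                 active=("RUNNING",), sleep=time.sleep):
    """Call fetch_job() until the job leaves the active states or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        if job["status"] not in active:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still active after {timeout} seconds")
        sleep(interval)


# Stand-in for real HTTP calls: two RUNNING answers, then DONE.
answers = iter([{"status": "RUNNING"}, {"status": "RUNNING"}, {"status": "DONE"}])
print(wait_for_job(lambda: next(answers), sleep=lambda seconds: None)["status"])
```

Injecting `sleep` keeps the loop testable without real delays.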

GBM_Example.flow, Step 5: Train the Model

In Flow:

    buildModel 'gbm', {"model_id":"gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1","training_frame":"arrhythmia.hex","score_each_iteration":false,"response_column":"C1","ntrees":"20","max_depth":5,"min_rows":"25","nbins":20,"learn_rate":"0.3","distribution":"AUTO","balance_classes":false,"max_confusion_matrix_size":20,"max_hit_ratio_k":10,"class_sampling_factors":[],"max_after_balance_size":5,"seed":0}

GBM_Example.flow, Step 5: Train the Model

In curl (repeated --data options are joined with & by curl):

    curl -X POST http://127.0.0.1:54321/3/ModelBuilders/gbm \
        --data 'model_id=gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1' \
        --data 'training_frame=arrhythmia.hex&response_column=C1' \
        --data 'score_each_iteration=false&ntrees=20&max_depth=5' \
        --data 'min_rows=25&nbins=20&learn_rate=0.3&distribution=AUTO' \
        --data 'balance_classes=false&max_confusion_matrix_size=20' \
        --data 'max_hit_ratio_k=10&class_sampling_factors=' \
        --data 'max_after_balance_size=5&seed=0'
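The endpoint list earlier pairs each training endpoint with a `/parameters` endpoint that validates the same parameters. The sketch below builds either call from one parameter dictionary. `builder_call` and `has_errors` are hypothetical helpers, and using `/parameters` with `gbm` is an assumption by analogy with the Deep Learning endpoints listed; `error_count` is the field shown in the training response.

```python
from urllib.parse import urlencode

GBM_PARAMS = {
    "training_frame": "arrhythmia.hex",
    "response_column": "C1",
    "ntrees": 20,
    "max_depth": 5,
    "learn_rate": 0.3,
    "seed": 0,
}


def builder_call(algo, params, validate_only=False):
    """Return (path, form_body) for a model-builder POST."""
    path = f"/3/ModelBuilders/{algo}"
    if validate_only:
        path += "/parameters"
    return path, urlencode(params)


def has_errors(response):
    """True when the builder response reports parameter errors."""
    return response.get("error_count", 0) > 0


print(builder_call("gbm", GBM_PARAMS, validate_only=True)[0])
print(builder_call("gbm", GBM_PARAMS)[1])
```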

GBM_Example.flow, Step 5: Result

    {
      "job": {
        "key": {
          "URL": "\/3\/Jobs\/$03010a010a7f32d4ffffffff$_881e60f52af792b71d20540604b742dd"
        },
        "description": "GBM",
        "status": "RUNNING",
        "progress_msg": "Running...",
        "dest": {
          "URL": "\/3\/Models\/gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1",
          ...
        },
        ...
      },
      "algo": "gbm",
      "algo_full_name": "Gradient Boosting Machine",
      "messages": [],
      "error_count": 0,
      "parameters": [ ... ]
    }

GBM_Example.flow, Step 6: Poll for job completion

Same as for Parse.

GBM_Example.flow, Step 7: View the Model

In Flow:

    getModel "gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1"

In curl:

    curl -X GET 'http://127.0.0.1:54321/3/Models/gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1'

GBM_Example.flow, Step 7: Result

    {
      "model_id": {
        "URL": "\/3\/Models\/gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1"
      },
      "algo": "gbm",
      "parameters": [...],
      "output": {
        "__meta": {
          "schema_name": "GBMModelOutputV3",
        },
        "model_category": "Regression",
        "scoring_history": { ... },
        "training_metrics": {
          "model_category": "Regression",
          "MSE": 31.32188458883,
          "r2": 0.88422887487626,
          "mean_residual_deviance": 31.32188458883
        },
        "status": "DONE",
        "run_time": 3211,
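Once the model JSON is decoded, the training metrics are an ordinary nested lookup. The sketch below works on the model object as excerpted above; a full response may wrap it in an outer structure. `training_metrics` and the chosen key list are assumptions for illustration.

```python
def training_metrics(model, keys=("MSE", "r2", "mean_residual_deviance")):
    """Pick the named metrics out of a decoded model object."""
    metrics = model["output"]["training_metrics"]
    return {key: metrics[key] for key in keys if key in metrics}


# Values copied from the excerpt above.
model = {
    "algo": "gbm",
    "output": {
        "model_category": "Regression",
        "training_metrics": {
            "model_category": "Regression",
            "MSE": 31.32188458883,
            "r2": 0.88422887487626,
            "mean_residual_deviance": 31.32188458883,
        },
    },
}
print(training_metrics(model))
```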

GBM_Example.flow, Step 8: Predictions

In Flow:

    predict model: "gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1", frame: "arrhythmia.hex", predictions_frame: "prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db"

In curl:

    curl -X GET 'http://127.0.0.1:54321/3/Frames/prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db?column_offset=0&column_count=20'
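The curl command above reads a predictions Frame by name. Generating predictions goes through the POST /3/Predictions/models/.../frames/... endpoint listed earlier. The sketch below builds that call; `predict_call` is a hypothetical helper, and sending `predictions_frame` as a form field is an assumption based on the parameter name in the Flow command.

```python
from urllib.parse import urlencode


def predict_call(model_id, frame_id, predictions_frame=None):
    """Return (path, form_body) for a prediction POST."""
    path = f"/3/Predictions/models/{model_id}/frames/{frame_id}"
    # Assumed: the optional destination name travels as a form field.
    body = urlencode({"predictions_frame": predictions_frame}) if predictions_frame else ""
    return path, body


path, body = predict_call(
    "gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1",
    "arrhythmia.hex",
    predictions_frame="prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db",
)
print(path)
print(body)
```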

GBM_Example.flow, Step 8: Result

    "model_metrics": [
      {
        "predictions": {
          "frame_id": {
            "URL": "\/3\/Frames\/prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db"
          },
          "total_column_count": 1,
          "rows": 452,
          "columns": [
            {
              "label": "predict",
              "data": [ 35.275735166748, 53.253980894466, 41.531820529033 ],
            }
          ],
          "MSE": 31.321880321916,
          "r2": 0.88422889064751,
          "mean_residual_deviance": 31.321880321916

Documentation

• long version of this content is here: https://github.com/h2oai/h2o-3/blob/master/h2o-docs/src/api/REST/h2o_3_rest_api_overview.md
• reference in the Help sidebar in Flow
• reference on the H2O.ai website, http://docs.h2o.ai/
• reference doc is generated via the /Metadata endpoints, so it's always current

THANKS!

Questions?