An OpenSource MLIB- PredictionIO Muhammet ARSLAN

Introduction to PredictionIO


Page 1: Introduction to PredictionIO

An OpenSource MLIB- PredictionIO

Muhammet ARSLAN

Page 2: Introduction to PredictionIO

PredictionIO

PredictionIO is an open source Machine Learning Server built on top of a state-of-the-art open source stack. Developers and data scientists use it to create predictive engines for any machine learning task.

Page 3: Introduction to PredictionIO

User-Based Recommending

Page 4: Introduction to PredictionIO

Requirements

• Apache Spark
• MLlib
• HBase
• Spray
• ElasticSearch

Page 5: Introduction to PredictionIO

Components

PredictionIO platform - open source machine learning stack for building, evaluating and deploying engines with machine learning algorithms.

Event Server - open source machine learning analytics layer for unifying events from multiple platforms

Template Gallery - the place for you to download engine templates for different types of machine learning applications

Page 6: Introduction to PredictionIO

DASE - MVC of MLib

D (Data source/data preparator)

A (algorithms)

S (serving)

E (evaluator)
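The four DASE roles can be illustrated with a toy pipeline. This is only a sketch in plain Python: real PredictionIO engines implement these components as Scala classes, and every function name and the sample data below are hypothetical, chosen just to show how the pieces hand data to each other.

```python
from collections import Counter

# Hypothetical toy pipeline; real PredictionIO DASE components are Scala classes.

def data_source():
    """D: read raw events (here a hard-coded list of (user, item) views)."""
    return [("u1", "i1"), ("u2", "i1"), ("u2", "i2"),
            ("u3", "i1"), ("u3", "i3"), ("u1", "i2")]

def preparator(events):
    """D: turn raw events into the structure the algorithm consumes."""
    return Counter(item for _user, item in events)

def train(prepared):
    """A: 'train' a trivial popularity model (items ranked by view count)."""
    return [item for item, _count in prepared.most_common()]

def serve(model, num):
    """S: shape the model output into the response for a query."""
    return {"items": model[:num]}

def evaluate(model, held_out, num):
    """E: fraction of held-out views whose item is in the top-`num` list."""
    top = set(model[:num])
    hits = sum(1 for _user, item in held_out if item in top)
    return hits / len(held_out)

model = train(preparator(data_source()))
result = serve(model, 2)
score = evaluate(model, [("u4", "i1"), ("u4", "i3")], 2)
```

Keeping each role in its own function mirrors the point of DASE: the algorithm can be swapped without touching how data is read or how answers are served.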

Page 7: Introduction to PredictionIO
Page 8: Introduction to PredictionIO

Event-Server

In a common scenario, PredictionIO's Event Server continuously collects data from your application. A PredictionIO engine then builds predictive model(s) with one or more algorithms using the data. After it is deployed as a web service, it listens to queries from your application and responds with predicted results in real-time.

Page 9: Introduction to PredictionIO

Event Server collects data from your application, in real-time or in batch. It can also unify data that are related to your application from multiple platforms. After data is collected, it mainly serves two purposes:

• Provide data to Engine(s) for model training and evaluation
• Offer a unified view for data analysis

Page 10: Introduction to PredictionIO

Engine

The Engine is responsible for making predictions. It contains one or more machine learning algorithms. An engine reads training data and builds predictive model(s). It is then deployed as a web service. A deployed engine responds to prediction queries from your application through a REST API in real-time.

Page 11: Introduction to PredictionIO

Tech-Stack

• The engine is written in Scala and APIs are built using Spray
• The engine is built on top of Apache Spark
• Event Server uses Apache HBase as the data store
• Trained model is stored in HDFS (part of Apache Hadoop)
• Model metadata is stored in ElasticSearch

Page 12: Introduction to PredictionIO

Templates

PredictionIO's template gallery offers Engine Templates for all kinds of machine learning tasks. You can easily create one or more engines from these templates.

The components of a template, namely Data Source, Data Preparator, Algorithm(s), and Serving, are all customizable for your specific needs.

http://templates.prediction.io/

Page 13: Introduction to PredictionIO

Example

Create a new Engine based on a template

pio template get PredictionIO/template-scala-parallel-ecommercerecommendation /srv/bbs

Create new APP

pio app new sony

root@ip-172-31-9-89:/srv/bbs# pio app new sony
[INFO] [App$] Initialized Event Store for this app ID: 1.
[INFO] [App$] Created new app:
[INFO] [App$] Name: sony
[INFO] [App$] ID: 1
[INFO] [App$] Access Key: iQ1A_BvI2qAI7bKqkEVsr5stGCrBy_N98BxpIxF_-e7TsIF_xWbD05vdPwe6g8ww

Collect Some Data

http://52.211.34.94:7070/events.json?accessKey=iQ1A_BvI2qAI7bKqkEVsr5stGCrBy_N98BxpIxF_-e7TsIF_xWbD05vdPwe6g8ww
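As a rough sketch of what "collecting data" looks like from the application side, the snippet below builds a user-views-item event and posts it to the events.json endpoint shown above. The helper names, the localhost address and the placeholder access key are assumptions for illustration; the JSON field names (event, entityType, entityId, targetEntityType, targetEntityId) follow the Event Server's documented event format, but check them against the template you actually use.

```python
import json
import urllib.parse
import urllib.request

EVENT_SERVER = "http://localhost:7070"  # assumption: replace with your Event Server host
ACCESS_KEY = "YOUR_ACCESS_KEY"          # placeholder: use the key printed by `pio app new`

def make_event(event, user_id, item_id):
    """Build a user-to-item event such as "view" or "buy" (hypothetical helper)."""
    return {
        "event": event,
        "entityType": "user",
        "entityId": user_id,
        "targetEntityType": "item",
        "targetEntityId": item_id,
    }

def events_url(server, access_key):
    """Compose the events.json URL with the access key as a query parameter."""
    return f"{server}/events.json?" + urllib.parse.urlencode({"accessKey": access_key})

def send_event(event, server=EVENT_SERVER, access_key=ACCESS_KEY):
    """POST one event as JSON; needs a running Event Server, so it is not called here."""
    request = urllib.request.Request(
        events_url(server, access_key),
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))

view = make_event("view", "u1", "i1")
```

With an Event Server running, send_event(view) would submit the event; a loop over your application's view and buy logs is the usual way to import data in batch.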

Page 14: Introduction to PredictionIO

Build PIO Server

pio build --verbose
[INFO] [Console$] Your engine is ready for training.

Train the DATA

pio train
[INFO] [CoreWorkflow$] Training completed successfully.

Deploy the Engine

pio deploy
Browse the IP:8000
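Once the engine is deployed, an application can ask it for recommendations over REST. The sketch below assumes the engine listens on localhost:8000 and accepts a query with "user" and "num" fields at queries.json, which is the query shape the e-commerce recommendation template documents; the helper names are hypothetical, and the exact fields depend on the template.

```python
import json
import urllib.request

ENGINE_URL = "http://localhost:8000"  # assumption: replace with your deployed engine's host

def make_query(user_id, num):
    """Ask for `num` recommended items for one user (hypothetical helper)."""
    return {"user": user_id, "num": num}

def build_request(query, engine_url=ENGINE_URL):
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        f"{engine_url}/queries.json",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send_query(query, engine_url=ENGINE_URL):
    """Send the query; needs a deployed engine, so it is not called here."""
    with urllib.request.urlopen(build_request(query, engine_url)) as response:
        return json.loads(response.read().decode("utf-8"))

request = build_request(make_query("u1", 4))
```

Calling send_query(make_query("u1", 4)) against a running engine should return the predicted result as parsed JSON.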

Page 15: Introduction to PredictionIO

Thank U!