An Open Source ML Library: PredictionIO
Muhammet ARSLAN
PredictionIO
PredictionIO is an open source machine learning server built on top of a state-of-the-art open source stack. It lets developers and data scientists create predictive engines for any machine learning task.
User-Based Recommending
Requirements
• Apache Spark
• MLlib
• HBase
• Spray
• ElasticSearch
Components
PredictionIO platform - open source machine learning stack for building, evaluating and deploying engines with machine learning algorithms.
Event Server - open source machine learning analytics layer for unifying events from multiple platforms
Template Gallery - the place to download engine templates for different types of machine learning applications
DASE - the "MVC" of machine learning
• D (Data Source / Data Preparator)
• A (Algorithms)
• S (Serving)
• E (Evaluator)
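As a rough illustration of how the four DASE roles fit together, the sketch below wires them up in plain Python. It is not PredictionIO's API (real engines implement these components in Scala on Spark). Every name here (`data_source`, `preparator`, `PopularityAlgorithm`, `serving`, `evaluator`) and the sample events are hypothetical, and the algorithm is a simple popularity count chosen only to keep the example short.

```python
from collections import Counter

# Hypothetical sample events: (user, item) view pairs.
EVENTS = [
    ("u1", "i1"), ("u2", "i1"), ("u2", "i2"),
    ("u3", "i3"), ("u3", "i1"), ("u1", "i2"),
]


def data_source():
    """D: read raw training data (here, a hard-coded list)."""
    return list(EVENTS)


def preparator(events):
    """D: reshape raw events into a user -> set-of-items mapping."""
    seen = {}
    for user, item in events:
        seen.setdefault(user, set()).add(item)
    return seen


class PopularityAlgorithm:
    """A: train a model, then answer queries with it."""

    def train(self, seen):
        self.seen = seen
        self.counts = Counter(item for items in seen.values() for item in items)
        return self

    def predict(self, user, num):
        # Recommend the most popular items the user has not seen yet.
        already = self.seen.get(user, set())
        ranked = sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [item for item, _ in ranked if item not in already][:num]


def serving(predictions):
    """S: combine results from one or more algorithms (here, take the first)."""
    return predictions[0]


def evaluator(model, held_out):
    """E: fraction of held-out (user, item) pairs the model recommends first."""
    hits = sum(1 for user, item in held_out if item in model.predict(user, num=1))
    return hits / len(held_out)


model = PopularityAlgorithm().train(preparator(data_source()))
result = serving([model.predict("u1", num=2)])
```

Keeping the four roles as separate pieces is what lets a template swap one of them (for example, a different algorithm) without touching the others.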
Event Server
In a common scenario, PredictionIO's Event Server continuously collects data from your application. A PredictionIO engine then builds predictive model(s) with one or more algorithms using the data. After it is deployed as a web service, it listens to queries from your application and responds with predicted results in real time.
Event Server collects data from your application, in real-time or in batch. It can also unify data that are related to your application from multiple platforms. After data is collected, it mainly serves two purposes:
• Provide data to Engine(s) for model training and evaluation
• Offer a unified view for data analysis
Engine
The Engine is responsible for making predictions. It contains one or more machine learning algorithms. An engine reads training data and builds predictive model(s). It is then deployed as a web service. A deployed engine responds to prediction queries from your application through a REST API in real time.
Tech-Stack
• The engine is written in Scala and APIs are built using Spray
• The engine is built on top of Apache Spark
• Event Server uses Apache HBase as the data store
• Trained model is stored in HDFS (part of Apache Hadoop)
• Model metadata is stored in ElasticSearch
Templates
PredictionIO's template gallery offers Engine Templates for all kinds of machine learning tasks. You can easily create one or more engines from these templates.
The components of a template, namely Data Source, Data Preparator, Algorithm(s), and Serving, are all customizable for your specific needs.
http://templates.prediction.io/
Example
Create a new Engine based on a template
pio template get PredictionIO/template-scala-parallel-ecommercerecommendation /srv/bbs
Create a new app
pio app new sony
root@ip-172-31-9-89:/srv/bbs# pio app new sony
[INFO] [App$] Initialized Event Store for this app ID: 1.
[INFO] [App$] Created new app:
[INFO] [App$] Name: sony
[INFO] [App$] ID: 1
[INFO] [App$] Access Key: iQ1A_BvI2qAI7bKqkEVsr5stGCrBy_N98BxpIxF_-e7TsIF_xWbD05vdPwe6g8ww
Collect Some Data
http://52.211.34.94:7070/events.json?accessKey=iQ1A_BvI2qAI7bKqkEVsr5stGCrBy_N98BxpIxF_-e7TsIF_xWbD05vdPwe6g8ww
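A minimal sketch of how an application might send one event to the URL above, using only Python's standard library. The event fields (`event`, `entityType`, `entityId`, `targetEntityType`, `targetEntityId`) follow the Event Server's JSON event format. The helper name `build_event_request`, the sample user and item IDs, and the `YOUR_ACCESS_KEY` placeholder are assumptions for illustration. The request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_event_request(base_url, access_key, user_id, item_id, event="view"):
    """Build (but do not send) a POST request recording a user-item event."""
    query = urllib.parse.urlencode({"accessKey": access_key})
    url = f"{base_url}/events.json?{query}"
    payload = {
        "event": event,              # e.g. "view" or "buy"
        "entityType": "user",
        "entityId": user_id,
        "targetEntityType": "item",
        "targetEntityId": item_id,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder access key; substitute the one printed by "pio app new".
request = build_event_request("http://52.211.34.94:7070", "YOUR_ACCESS_KEY", "u1", "i1")
# To actually send it: urllib.request.urlopen(request)
```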
Build PIO Server
pio build --verbose
[INFO] [Console$] Your engine is ready for training.
Train the Data
pio train
[INFO] [CoreWorkflow$] Training completed successfully.
Deploy the Engine
pio deploy
Browse to IP:8000, where IP is the address of the server running the engine.
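Once deployed, the engine accepts prediction queries as JSON. A minimal sketch, assuming the default query endpoint `/queries.json` on port 8000 and the `user` and `num` query fields used by the e-commerce recommendation template. The helper name `build_query_request`, the `localhost` host, and the sample user ID are placeholders. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_query_request(host, user_id, num, port=8000):
    """Build (but do not send) a POST request asking for `num` recommendations."""
    payload = {"user": user_id, "num": num}
    return urllib.request.Request(
        f"http://{host}:{port}/queries.json",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


query = build_query_request("localhost", "u1", 4)
# To send it and read the engine's JSON reply:
# with urllib.request.urlopen(query) as response:
#     print(json.loads(response.read()))
```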
Thank U!