CMS software R&D
ML as a Service (MLaaS) for HEP
Valentin Kuznetsov, Cornell University



Page 1: ML as a Service (MLaaS) for HEP

CMS software R&D

ML as a Service (MLaaS) for HEP
Valentin Kuznetsov, Cornell University

Page 2: ML as a Service (MLaaS) for HEP

Machine Learning as a Service

✤ MLaaS is a set of tools and services that service providers offer to clients to perform various ML tasks, e.g. classification, regression, deep learning, etc.

✤ Tools and services include: data visualization, pre-processing, model training and evaluation, serving predictions, etc.

✤ Major tech companies (Google, Amazon, Microsoft, IBM, etc.) and a plethora of start-ups provide different types of MLaaS services.

Can we use existing MLaaS in HEP?

Page 3: ML as a Service (MLaaS) for HEP

V. Kuznetsov, CMS R&D

Traditional ML workflow

✤ A traditional ML workflow consists of the following components:

✤ obtain dataset(s) in a tabular (row-wise) data-format; in most cases ML deals with either CSV files or NumPy arrays representing tabular data; perform data pre-processing if necessary

✤ train the ML model and use it for inference (this separation leads towards the MLaaS concept)

✤ Input datasets are usually small, < O(GB), and should fit into the RAM of the training node

✤ HEP datasets may be quite large, at PB scale; they are stored in ROOT data-format and can be distributed across GRID nodes

✤ Traditional ML frameworks can’t read HEP data in ROOT data-format

✤ this creates a gap between the CS/ML and HEP communities

Diagram: Train/Test datasets (CSV or NumPy) → ML framework → ML model → Predictions
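The traditional workflow above (tabular data → ML framework → model → predictions) can be sketched in a few lines of Python; here a NumPy least-squares fit stands in for an ML framework's fit call, and the in-memory toy data replaces a CSV file (all names and values are illustrative):

```python
import numpy as np

# Toy tabular dataset held in memory; in practice it would be loaded
# from a CSV file, e.g. via np.loadtxt("data.csv", delimiter=",").
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [2.0, 1.0]])
y = np.array([1.0, 2.0, 3.0, 4.0])

# "Train": fit linear weights w minimizing ||X w - y|| as a stand-in
# for an ML framework's fit() step.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# "Inference": predictions for the input rows.
preds = X @ w
```

The point of the sketch is the shape of the pipeline, not the model: everything assumes small tabular data that fits in RAM, which is exactly the assumption that breaks for PB-scale ROOT datasets.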

Page 4: ML as a Service (MLaaS) for HEP

ML in HEP (ROOT based example)

✤ Training phase:

✤ we transform our data from ROOT data-format to CSV/NumPy for training purposes

✤ other pre-processing steps can be done at this phase

✤ we train our ML models using available ML frameworks, e.g. Python+Keras, TF, PyTorch

✤ we don’t use ROOT data directly in ML frameworks

✤ Inference phase:

✤ we access trained ML models via external libraries integrated into our frameworks, e.g. CMS CMSSW-DNN, ATLAS LTNN libraries, etc.

✤ R&D for specialized solutions to speed-up inference on FPGAs, e.g. HLS4ML, SonicCMS, etc.

✤ Resource utilization is constrained by the framework run-time; models can't be used outside the framework language (C++)

✤ Are these approaches sufficient?

Page 5: ML as a Service (MLaaS) for HEP

Steps towards MLaaS for HEP

✤ MLaaS for HEP should provide the following:

✤ natively read HEP data, e.g. be able to read ROOT and other data-formats from local or remote distributed data-sources

✤ utilize heterogeneous resources, local CPU, GPUs, farms, cloud resources, etc.

✤ use different ML frameworks (TF, PyTorch, etc.)

✤ minimize infrastructure changes; it should be usable in different frameworks, inside or outside framework boundaries

✤ serve pre-trained HEP models, à la model repository, and access them easily from any place, any code, any framework

Diagram: the Experiment interacts with MLaaS train and MLaaS inference services, which rely on cloud/resource providers and data repositories

Page 6: ML as a Service (MLaaS) for HEP

ML as a Service for HEP R&D

✤ Data streaming and training tools: github.com/vkuznet/MLaaS4HEP

✤ Data inference tools: github.com/vkuznet/TFaaS

Goal:
• use a variety of data-formats available in HEP (JSON, Avro, Parquet, ROOT)
• read arbitrary size dataset(s) from local or remote storage
• use supported data-formats directly in existing ML frameworks
• access pre-trained models anywhere

Page 7: ML as a Service (MLaaS) for HEP

MLaaS for HEP

✤ Data Streaming Layer is responsible for local or remote data access of HEP files

✤ Data Training Layer is responsible for feeding HEP data into existing ML frameworks

✤ Data Inference Layer provides access to pre-trained HEP models for HEP users

✤ All three layers are independent from each other and allow independent resource allocation


Diagram: the Data Streaming Layer reads HEP files from HDFS, the local filesystem or remote storage via uproot (ROOT) and pyarrow, and the Data Reader delivers batches as NumPy arrays (1D array, shape (3,); 2D array, shape (2,3); N-dim array, shape (X,Y,Z,…)). The Data Training Layer applies preprocessing and feeds the input NumPy arrays into an ML model, e.g. a Neural Network. The Data Inference Layer hosts a repository of ML models.

Page 8: ML as a Service (MLaaS) for HEP

Data Streaming Layer

✤ The proposed architecture allows reading a variety of data formats used in HEP, including ROOT, JSON, CSV, Avro and Parquet

✤ we rely on uproot (for ROOT I/O) and the pyarrow module to read data from HDFS

✤ The Data Streaming Layer abstracts data access via custom data readers

✤ {RootData,Json,Csv,Parquet}Reader are supported and able to read data from local file system, HDFS and remote sites (ROOT files via xrootd)

✤ Each reader satisfies a common, minimal set of APIs, e.g. reading the next available chunk of data and representing it as a NumPy array suitable for ML framework(s)

✤ data pre-processing can be done via a custom user function applied to all records

✤ This allows using MLaaS4HEP for a variety of use-cases

✤ perform ML modeling over distributed ROOT files

✤ be able to read and train on a variety of (computing) meta-data stored on HDFS
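The reader idea described above (a common API that yields fixed-size chunks as NumPy arrays, with an optional user preprocessing function applied to each record) can be sketched as follows; the function names and signature here are illustrative, not the real MLaaS4HEP reader API:

```python
import numpy as np

def chunked_reader(records, chunk_size=2, preproc=None):
    # Yield fixed-size chunks of records as NumPy arrays, applying an
    # optional user preprocessing function to every record
    # (names here are illustrative, not the real MLaaS4HEP API).
    buf = []
    for rec in records:
        if preproc is not None:
            rec = preproc(rec)
        buf.append(rec)
        if len(buf) == chunk_size:
            yield np.array(buf)
            buf = []
    if buf:  # final partial chunk
        yield np.array(buf)

chunks = list(chunked_reader([[1, 2], [3, 4], [5, 6]], chunk_size=2))
```

In the real framework the records would come from {RootData,Json,Csv,Parquet}Reader over local files, HDFS or XrootD; the chunking pattern is what lets datasets larger than RAM be streamed into an ML framework.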

Page 9: ML as a Service (MLaaS) for HEP

Data Training Layer

✤ The MLaaS4HEP framework is not tied to a specific ML framework

✤ its Python implementation allows integration with a variety of ML frameworks

✤ custom wrappers provide common fit/predict methods

✤ The user model as well as the pre-processing function can be loaded dynamically via code inspection

✤ the user provides a python file with model and preprocessing functions, which implement the ML model and the pre-processing functionality, respectively

✤ Various tests were performed with Keras, TF and PyTorch models using local and remote ROOT files as well as gzipped JSON files, located on grid sites and HDFS, respectively

✤ The MLaaS4HEP repository provides examples of how to train models using custom user code
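The dynamic-loading step described above can be sketched with the standard importlib machinery; the expectation that the user file exposes `model` and `preprocessing` functions follows the slides, while the helper name and the temporary-file setup are purely illustrative:

```python
import importlib.util
import os
import tempfile

# Hypothetical user-supplied python file with model() and preprocessing().
USER_CODE = """
def model(input_shape):
    return ("dummy-model", input_shape)

def preprocessing(rec):
    return rec
"""

def load_user_code(path):
    # Import the user's python file and return its callables
    # (a sketch of the idea, not the real MLaaS4HEP loader).
    spec = importlib.util.spec_from_file_location("usermodel", path)
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    return mod.model, mod.preprocessing

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as fobj:
    fobj.write(USER_CODE)
    path = fobj.name

model, preprocessing = load_user_code(path)
os.unlink(path)
```

Loading the user code this way is what keeps the framework agnostic: the same driver can call fit/predict on a Keras, TF or PyTorch model as long as the user file exposes the agreed functions.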


Page 10: ML as a Service (MLaaS) for HEP

Data Inference Layer

✤ Data Inference Layer is implemented as TensorFlow as a Service (TFaaS)

✤ It is a web server with APIs to upload and access any TensorFlow models

✤ Can be used as global repository of pre-trained HEP models

✤ Can be deployed everywhere

✤ Docker image and Kubernetes files are provided

✤ TFaaS is available as part of DODAS (Dynamic On Demand Analysis Service)


Page 11: ML as a Service (MLaaS) for HEP

TFaaS features

✤ HTTP server written in Go to serve arbitrary TF model(s) using Google Go TF APIs

✤ users may upload any number of TF models, models are stored on local filesystem and cached within TFaaS server

✤ TFaaS repository provides instructions/tools to convert Keras models to TF

✤ TFaaS supports JSON and ProtoBuffer data-formats

✤ Any client supporting HTTP protocol, e.g. curl, C++ (via curl library or TFaaS C++ client), Python, can talk to TFaaS via HTTP end-points

✤ see back-up slides for specific client’s examples (Python and C++)

✤ Benchmarks: 200 concurrent calls, throughput of 500 req/sec for a TF model with 1024x1024 hidden layers

✤ performance is similar for JSON and ProtoBuffer clients

Page 12: ML as a Service (MLaaS) for HEP

Model inspection

Integrated Netron model viewer (screenshot)

Page 13: ML as a Service (MLaaS) for HEP

MLaaS for HEP proof-of-concept

✤ Train toy models in PyTorch and TF (via Keras) using CMS NANOAOD data

✤ run code locally on laptop, lxplus and GPU node

✤ access data from local or remote ROOT files

✤ Serve model in TFaaS server


Page 14: ML as a Service (MLaaS) for HEP

MLaaS workflow w/ user models

./workflow.py --files=files.txt --model=<model.py> --params=params.json

Diagram: the MLaaS workflow takes input ROOT files, a user model (PyTorch and Keras/TF examples shown) and MLaaS parameters as input.

Page 15: ML as a Service (MLaaS) for HEP

Screenshots: read a remote ROOT file → init the ML model → perform a train cycle → read another ROOT file → perform another train cycle
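The control flow shown in these screenshots (loop over files, stream chunks from each, run a training step per chunk) can be sketched as follows; `read_chunks`, `partial_fit` and the file names are illustrative stand-ins, not the real MLaaS4HEP API:

```python
def train_cycle(files, read_chunks, model):
    # For each input file, stream its chunks and run one partial
    # training step per chunk (a sketch of the workflow's loop,
    # not the real MLaaS4HEP implementation).
    for fname in files:
        for chunk in read_chunks(fname):
            model.partial_fit(chunk)
    return model

class CountingModel:
    # Toy "model" that only counts the chunks it has seen.
    def __init__(self):
        self.chunks_seen = 0

    def partial_fit(self, chunk):
        self.chunks_seen += 1

fake_reader = lambda fname: [[1, 2], [3, 4]]  # two chunks per "file"
model = train_cycle(["a.root", "b.root"], fake_reader, CountingModel())
```

The incremental shape of the loop is what allows training over distributed ROOT files without ever materializing the full dataset.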

Page 16: ML as a Service (MLaaS) for HEP

MLaaS4HEP R&D studies

✤ NanoAOD based analysis (Luca Giommi): perform feasibility studies to read distributed NANOAOD files and perform basic analysis to separate signal vs background

✤ no intermediate ROOT tuples are required

✤ data can be streamed and read directly from remote sites (MLaaS4HEP with uproot)

✤ models will be trained within the MLaaS4HEP framework on attributes available in the NANOAOD format

✤ Operational Intelligence analysis: read computing meta-data stored on HDFS and perform ML/analytics tasks

✤ looking for volunteers

Page 17: ML as a Service (MLaaS) for HEP

Summary: streaming & training

✤ The MLaaS4HEP data streaming layer is capable of reading local and remote data in commonly used HEP data-formats: ROOT, CSV, JSON, Parquet, and Avro (in the future)

✤ ROOT data can be read via XrootD redirector while CSV and JSON data can be read from HDFS as well as local file system

✤ For ROOT data we perform a JaggedArray → NumPy transformation; for other data-formats the user can provide a custom preprocessing function

✤ Random reads from multiple files (data shuffle mode)

✤ Customization: total number of events to read; data read chunk size; branches to select or exclude; choice of XrootD redirector

✤ Dynamically load user-based models from your choice of ML framework (PyTorch, TF, Keras)

✤ MLaaS4HEP docker image is available

✤ Planning: visual inspection of ROOT file content; possible graphical UI to build full workflow pipeline; I/O optimization; real analysis use cases

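The JaggedArray → NumPy transformation mentioned above can be illustrated with simple padding: variable-length per-event lists become a fixed-width 2D array that an ML framework can consume. This is a sketch of the idea only (the real framework's transformation and its handling of padding may differ):

```python
import numpy as np

def jagged_to_numpy(jagged, pad_value=0.0):
    # Pad variable-length per-event lists to a fixed-width 2D array;
    # pad_value marks the missing entries (an illustrative choice).
    width = max(len(row) for row in jagged)
    out = np.full((len(jagged), width), pad_value)
    for i, row in enumerate(jagged):
        out[i, :len(row)] = row
    return out

# three events with 1, 3 and 2 values respectively
arr = jagged_to_numpy([[1.0], [2.0, 3.0, 4.0], [5.0, 6.0]])
```

Padding to the widest event is the simplest policy; truncating to a fixed branch length is a common alternative when a few events are much longer than the rest.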

Page 18: ML as a Service (MLaaS) for HEP

Summary: inference

✤ The TFaaS server provides the ability to serve TF models via the HTTP protocol; it natively supports concurrency, organizes TF models in a hierarchical structure on the local file system, and uses caching for model access

✤ no integration is required to include TFaaS into your infrastructure, i.e. clients can talk to TFaaS server via HTTP protocol (python and C++ clients are available)

✤ separation of TF models from experiment’s framework, scale TFaaS with dedicated resources

✤ can be used as a model repository; the TFaaS architecture allows implementing model versioning, tagging, etc.

✤ the TFaaS server was tested with concurrent clients; we obtained a throughput of 500 req/sec for mid-size model inference (subject to TF model complexity)

✤ TFaaS docker image and k8s deployment files are available

✤ Planning: build a repository of HEP (cross-experiment) pre-trained ML models

Page 19: ML as a Service (MLaaS) for HEP

Summary

✤ MLaaS for HEP is feasible; fully functional proof-of-concept workflows are available

✤ MLaaS4HEP repository:

✤ Data Streaming Layer for access to local and remote distributed HEP files

✤ ROOT data are available/streamed via uproot ROOT I/O

✤ CSV, JSON and Parquet (and soon Avro) data are available via the pyarrow library, with support for reading them from HDFS

✤ Data Training Layer provides necessary data transformation and batch streaming to existing ML frameworks, including dynamic loading of user-based models and preprocessing functions

✤ TFaaS repository:

✤ Data Inference Layer: serves TF models via Go HTTP server and Google TF Go APIs

✤ ready to use (docker or k8s); provides basic TF model repository functionality

Page 20: ML as a Service (MLaaS) for HEP

R&D topics

✤ Model conversion: PyTorch, fast.ai, etc. to TensorFlow

✤ Model repository: implement persistent model storage, look-up, versioning, tagging, etc.

✤ MLaaS/TFaaS scalability: explore kubernetes, auto-scaling, resource provisioning (FPGAs, GPUs, TPUs, etc.)

✤ Real model training with distributed data (MLaaS with DODAS)

✤ Adaptation of the MLaaS4HEP framework for Operational Intelligence tasks

Collaboration is welcome

https://arxiv.org/abs/1811.04492


Page 21: ML as a Service (MLaaS) for HEP

Back-up slides

Page 22: ML as a Service (MLaaS) for HEP

TFaaS HTTP end-points

✤ /json: handles data sent to TFaaS in JSON data-format; e.g. you'll use this API to fetch predictions for your vector of parameters presented in JSON data-format (used by the Python client)

✤ /proto: handles data sent to TFaaS in ProtoBuffer data-format (used by the C++ client)

✤ /image: handles images (png and jpg) and yields predictions for given image and model name

✤ /upload: upload your TF model to TFaaS

✤ /delete: delete TF model on TFaaS server

✤ /models: return list of existing TF models on TFaaS

✤ /params: return list of parameters of TF model

✤ /status: return status of the TFaaS server
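A call to the /json end-point can be sketched with the Python standard library. The payload layout follows the input.json example from these slides; the server URL is illustrative, and the `predict` helper requires a running TFaaS server, so only the payload construction is exercised here:

```python
import json
import urllib.request

TFAAS_URL = "http://localhost:8083"  # illustrative server location

def json_payload(keys, values):
    # Payload layout follows the input.json example from the slides:
    # {"keys": [...], "values": [...]}
    return json.dumps({"keys": keys, "values": values}).encode()

def predict(keys, values):
    # POST the payload to the /json end-point and return the raw
    # response; needs a running TFaaS server, so it is not called here.
    req = urllib.request.Request(
        TFAAS_URL + "/json",
        data=json_payload(keys, values),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

body = json_payload(["attr1", "attr2"], [1, 2])
```

Because the interface is plain HTTP+JSON, the same call works from curl or any language with an HTTP client, which is the point of the TFaaS design.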

Page 23: ML as a Service (MLaaS) for HEP

Python client

✤ upload API lets you upload your model and model parameter files to TFaaS

tfaas_client.py --url=url --upload=upload.json # upload.json contains model parameters

✤ models API lets you list existing models on the TFaaS server

tfaas_client.py --url=url --models

✤ delete API lets you delete a given model on the TFaaS server

tfaas_client.py --url=url --delete=ModelName

✤ predict API lets you get predictions from your model for a given set of input parameters

# input.json: {"keys":["attr1", "attr2", …], "values": [1,2,…]}

tfaas_client.py --url=url --predict=input.json

✤ image API provides predictions for image classification (supports jpg or png data-formats)

tfaas_client.py --url=url --image=/path/file.png --model=HEPImageModel

Page 24: ML as a Service (MLaaS) for HEP

C++ client

#include <iostream>
#include <vector>
#include <sstream>
#include "TFClient.h" // part of TFaaS repository

int main() {
    std::vector<std::string> attrs;  // vector of attribute names
    std::vector<float> values;       // vector of values
    auto url = "http://localhost:8083/proto"; // your TFaaS server URL
    auto model = "MyModel";          // name of your model in TFaaS

    // fill out your data
    for(int i = 0; i < 42; i++) {    // the model tested here had 42 parameters
        values.push_back(i);         // fill the vector of values
        std::ostringstream oss;
        oss << i;
        attrs.push_back(oss.str());  // fill the vector of attribute names
    }

    // make the prediction call
    auto res = predict(url, model, attrs, values); // get predictions from TFaaS server
    for(int i = 0; i < res.prediction_size(); i++) {
        auto p = res.prediction(i);  // fetch and print model predictions
        std::cout << "class: " << p.label()
                  << " probability: " << p.probability() << std::endl;
    }
}

CMS BuildFile.xml:

<use name="protobuf"/>
<lib name="curl"/>
<lib name="protobuf"/>
