18
Peter Reutemann Big Data with ADAMS Big Data with ADAMS What the heck is ADAMS?

Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

Peter Reutemann

Big Data with ADAMSBig Data with ADAMS

What the heck is ADAMS?

Page 2: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 2 of 18

What is ADAMS?

● Java, GPLv3● Data mining: MOA, WEKA, MEKA, R● Spreadsheets and databases● Image and video processing● Visualizations (plots, GIS)● Scripting via Jython and Groovy● ...

Page 3: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 3 of 18

Flow

● Operators are called “actors”● Actors arranged in tree, no connections● Actor “handlers” nest other actors

● e.g., sequence of actors

● Control actors control data flow● e.g., branch, tee, if-then-else, switch

● Input/output defines● standalone , source , transformer , sink

Page 4: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 4 of 18

Flow (2)

● Tree only supports 1-to-n connections● Simulating n-to-m semantics

● Containers● Variables● Internal storage● Callable actors

Page 5: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 5 of 18

Examples

Output file to read

Read file

Set class attribute

Apply filter

Display data

Load dataset, apply filter anddisplay dataset

Execute nested actors one after the other

Page 6: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 6 of 18

Examples (2)Generate data stream

Feed data into branches

1st sequence of steps

2nd sequence of steps

Apply stream filter

Evaluate classifier

Filter measurement of interest

Generate data for plot

Apply different stream filter

Plot

Filter data stream in two separate branches with different filters, evaluate classifier and plot metric

Page 7: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 7 of 18

Examples (3)

...

...

groups actors accessible via their name (“callable actors”)

combined plot

1st evaluation: create plotting data

Pump data into referenced plot

2nd evaluation: create plotting data

Pump data into referenced plot

Generate combined plot of two evaluations by using “callable actors” functionality

Page 8: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 8 of 18

Research (demos)

● Compare two MOA classifiers (drift)● Compare MOA classifier on different streams● MOA cluster visualization● Track mouse in video

Page 9: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 9 of 18

MOA - Drift

Page 10: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 10 of 18

MOA - Drift

Page 11: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 11 of 18

MOA - different streams

Page 12: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 12 of 18

MOA - different streams

Page 13: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 13 of 18

MOA - Cluster visualization

Page 14: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 14 of 18

MOA - Cluster visualization

Stream 1

Stream 2

Page 15: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 15 of 18

Track mouse

Page 16: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 16 of 18

Track mouse

Page 17: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 17 of 18

Industry

● BLGG - environmental lab in NL● Spectral analysis

● XRF: 10,000, MIR: 2,000, NIR: 1,500

● In operation since 2006● Predictive modelling: soil, plant (~250 models)● 1,000 to 3,000 samples per day● Savings due to less wet chemistry

● USD 18 million to USD 33 million per year

Page 18: Big Data with ADAMS€¦ · 10/08/2015 Peter Reutemann 3 of 18 Flow Operators are called “actors” Actors arranged in tree, no connections Actor “handlers” nest other actors

10/08/2015 Peter Reutemann 18 of 18

Interested?

https://adams.cms.waikato.ac.nz/

@TheAdamsFlow