Big Data Integrator Platform
Platform Architecture and Features
Dr. Hajira Jabeen, Technical Team Leader, BDE, University of Bonn
BDE Presentation, EBDVF, 17
Platform Architecture
◎ Support Layer
o Init Daemon
o GUIs
o Monitor
◎ App Layer
o Traffic Forecast
o Satellite Image Analysis
o ...
◎ Platform Layer
o Spark
o Flink
o Semantic Layer (Ontario, SANSA, Semagrow)
o Kafka
o Real-time Stream Monitoring
o ...
◎ Resource Management Layer (Swarm)
◎ Hardware Layer
o Premises
o Cloud (AWS, GCE, MS Azure, …)
◎ Data Layer
o Hadoop
o NoSQL Store
o Cassandra
o Elasticsearch
o RDF Store
o ...
BDE Supported Frameworks
◎ Search/indexing
o Apache Solr
◎ Data acquisition
o Apache Flume
◎ Message passing
o Apache Kafka
◎ Data storage
o Hue
o Apache Cassandra
o ScyllaDB
o Apache Hive
o PostGIS
◎ Data processing
o Apache Spark
o Apache Flink
◎ Semantic Components
o Strabon
o Sextant
o GeoTriples
o Silk
o SEMAGROW
o LIMES
o 4Store
o OpenLink Virtuoso
Platform features
◎ BDE Development Environment
o Stack builder
o Workflow builder
o Instructions to add custom components
◎ Administrative Interface
o SwarmUI
o Logger Interface
◎ UI Integrator
o Workflow monitor
o Integrated web interface
BDE Integrator UI Workflow
◎ Stackbuilder: select components => (Push Create-Flow)
◎ WorkFlow builder: arrange components => (Push Monitor)
◎ SwarmUI: see the scaling and scale up/down
◎ BDE Logger: navigate the component UI and deploy jobs
◎ Git-clone
◎ New Stack
◎ Integrator UI
◎ WorkMonitor: deployment status of components => (Push OK)
◎ BDE-IDE
Maintenance and Uptake
◎ Open-Source, Community Driven
o Commitment from core BDE consortium team
o Independent BD components maintenance
o Platform maintenance driven by BDI users
◎ Adopters
o Feuga, Eurostat, ILVO, I2cat, Vicomtech, IoF, ...
◎ Follow Up Projects
o HOBBIT, Special, BigDataOcean, Qrowd, BETTER, …
BDE vs Hadoop distributions

Feature | Hortonworks | Cloudera | MapR | Bigtop | BDE
File system | HDFS | HDFS | NFS | HDFS | HDFS
Installation | Native | Native | Native | Native | Lightweight virtualization
Flexible modular architecture | No | No | No | No | Yes
High availability | Single failure recovery (YARN) | Single failure recovery (YARN) | Self healing, multiple failure recovery | Single failure recovery (YARN) | Failure recovery
Cost | Commercial | Commercial | Commercial | Free | Free
Scaling | Freemium | Freemium | Freemium | Free | Free
Addition of custom components | Not easy | No | No | No | Yes
Integration testing | Yes | Yes | Yes | Yes | --
Operating systems | Linux | Linux | Linux | Linux | Windows/Mac/Linux
Management tool | Ambari | Cloudera Manager | MapR Control System | - | Docker Swarm + custom UI
Semantic Data Lake
◎ Data Lake
o Repository of data collected in its original formats
o Structured, semi-structured, unstructured
o Schema-less
◎ Semantic Data Lake
o Add a Semantic Layer on top of source datasets
❖ The data is semantically lifted using ontologies
❖ Provide a uniform view over nonuniform data
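The semantic lifting idea above can be illustrated with a small sketch. It is not the BDE implementation: the `lift` function, the two mapping dictionaries, the sample records and the `ex:` vocabulary terms are all hypothetical names chosen for illustration. Two records with different native field names are mapped onto the same ontology properties, which gives a uniform triple view over nonuniform data.

```python
# Hypothetical sketch of semantic lifting: source-specific field names are
# mapped to shared ontology properties so that heterogeneous records expose
# one uniform (subject, predicate, object) view.

# Assumed mappings from native field names to ontology terms (illustrative).
CSV_MAPPING = {"ctry": "ex:country", "dis": "ex:disease"}
JSON_MAPPING = {"countryName": "ex:country", "diseaseName": "ex:disease"}


def lift(record, mapping, subject):
    """Return triples for the fields of `record` that have a mapping."""
    return [
        (subject, mapping[field], value)
        for field, value in record.items()
        if field in mapping
    ]


# Sample records (made up); "internal_id" has no mapping and is skipped.
csv_row = {"ctry": "Greece", "dis": "Influenza", "internal_id": 7}
json_doc = {"countryName": "France", "diseaseName": "Measles"}

triples = lift(csv_row, CSV_MAPPING, "ex:obs1") + lift(json_doc, JSON_MAPPING, "ex:obs2")

# Both sources can now be filtered through the same property.
countries = [o for (_, p, o) in triples if p == "ex:country"]
print(countries)
```

The source data stays in its original format; only the mapping layer knows how each field relates to the shared vocabulary.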
Semantic Data Lake
◎ Metadata: property -> data source (type)
◎ Decomposing User Query
o SPARQL query:
?item gho:Country ?country .
?item gho:Disease ?disease .
...
◎ Finding Relevant Data Sources + Queries Translation
o Data sources: Database (SQL), XML File, MongoDB
o Target query languages: SQL, XPath, JSONPath, MongoDB
o Translated SQL query:
SELECT country, disease, ...
FROM Observations
◎ Execution Plan
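A minimal sketch of the source-selection step, assuming the metadata is a simple lookup from property to (data source, type) pairs. The source names `observations_db` and `reports_xml`, the contents of `METADATA`, and the `TARGET_LANGUAGE` table are hypothetical; the slides do not specify how the platform stores or evaluates this metadata.

```python
# Sketch of source selection for a decomposed query. The metadata maps each
# property to the data sources (and their types) that can answer it.
from collections import defaultdict

# Assumed metadata: property -> list of (data source, type). Illustrative only.
METADATA = {
    "gho:Country": [("observations_db", "SQL")],
    "gho:Disease": [("observations_db", "SQL"), ("reports_xml", "XML")],
}

# Assumed association between source type and target query language.
TARGET_LANGUAGE = {"SQL": "SQL", "XML": "XPath", "MongoDB": "MongoDB query"}


def decompose(triple_patterns):
    """Group triple patterns by the data sources able to answer them."""
    plan = defaultdict(list)
    for subject, predicate, obj in triple_patterns:
        for source, source_type in METADATA.get(predicate, []):
            plan[(source, source_type)].append((subject, predicate, obj))
    return dict(plan)


# The two triple patterns shown on the slide.
query = [
    ("?item", "gho:Country", "?country"),
    ("?item", "gho:Disease", "?disease"),
]

plan = decompose(query)
for (source, source_type), patterns in plan.items():
    predicates = [p for (_, p, _) in patterns]
    print(source, "->", TARGET_LANGUAGE[source_type], predicates)
```

Each group of patterns would then be translated into the query language of its source, and the partial results joined according to the execution plan.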
Thank you!
BDI on GitHub: https://github.com/big-data-europe
Technical Questions: platform@big-data-europe.eu
Project Website: www.big-data-europe.eu
SANSA Stack: https://github.com/SANSA-Stack
jabeen@cs.uni-bonn.de
The mobility use case in Thessaloniki
◎ Multisource datasets (speed, traffic flow, travel time) are used in Thessaloniki to provide short-term traffic status predictions based on the recognition of mobility and traffic patterns.
◎ Machine learning techniques use the travel times, traffic counts and speeds, together with the correlations of traffic speed, to train a neural network model for efficient and robust traffic speed prediction.
The datasets
◎ Floating Car Data
o 500 to 2,500 speed measurements per minute
o Location, speed, orientation, status
o Hundreds of GB (historical dataset)
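To put the stated measurement rate in perspective, a back-of-the-envelope estimate can be sketched as follows. The `FloatingCarRecord` field names, types and units are assumptions (the slides list only location, speed, orientation and status), and the daily totals assume the per-minute rate holds around the clock, which the source does not state.

```python
# Rough daily volume estimate from the stated rate of 500 to 2,500 speed
# measurements per minute. Assumes a constant rate over 24 hours.
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60


@dataclass
class FloatingCarRecord:
    """One measurement; field names, types and units are assumptions."""
    latitude: float
    longitude: float
    speed_kmh: float
    orientation_deg: float
    status: str


def measurements_per_day(per_minute):
    """Scale a per-minute rate to a full day."""
    return per_minute * MINUTES_PER_DAY


low = measurements_per_day(500)
high = measurements_per_day(2500)
print(low, high)

# Example record with made-up values, only to show the assumed shape.
sample = FloatingCarRecord(40.64, 22.94, 37.5, 180.0, "occupied")
print(sample)
```

Under these assumptions the feed carries between 720,000 and 3.6 million measurements per day.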
Mobility services in Thessaloniki
◎ TrafficThess (http://www.trafficthess.imet.gr)
o Visual representation of the current as well as past speeds in Thessaloniki, Greece
o Email notifications
o Historical raw data export per link in open format
◎ TrafficPaths (http://www.trafficpaths.imet.gr)
o Descriptive information of the current travel times wherever available (Thessaloniki, Patra, Irakleon, Serres, Kavala)
o Mobile friendly web page
◎ TrafficThess Reports (http://www.trafficthessreports.imet.gr)
o Visual and descriptive representation of the current traffic conditions (speeds & travel times) on the main roads of Thessaloniki, Greece
o Highly customizable email notifications
o Normalized historical data export per road in open format
o Traffic calendar (powered by Google)
◎ BDE (http://trafficstatusprediction.imet.gr/#)
Mobility services in Thessaloniki: TrafficThess (http://www.trafficthess.imet.gr)
Reliable traffic conditions monitoring on a 24/7/365 basis
Traffic conditions in the city of Thessaloniki, Greece during snowfall on 10 & 11 Jan 2017:
https://youtu.be/2z12tUkuwaM (credits to anmpout for helping out with the video)
Keep calm! It’s just another congestion on the ring road…
Mobility services in Thessaloniki: TrafficPaths (http://www.trafficpaths.imet.gr)
Calculation of travel times on a 24/7/365 basis
Mobility services in Thessaloniki: TrafficThess Reports (http://www.trafficthessreports.imet.gr)
A personalized single point of access
Mobility services in Thessaloniki
• Datatank (back office + REST APIs)
• CKAN (front end)
http://opendata.imet.gr/dataset
Paris, EBDVF – 22nd November 2017
DR. JOSEP MARIA SALANOVA GRAU
JOSE@CERTH.GR
+30 2310 498 433