Business Intelligence Workshop (BIWO 2014) / Santiago, Chile / 14th August 2014
John Breslin - NUI Galway - @johnbreslin linkd.in/johnbreslin
Data Analytics and Industry-Academic Partnerships:
An Irish Perspective
First: Ireland and Chile!
• John/Juan Garland, governor of Valdivia
• Ambrosio O’Higgins (from Sligo), Chile governor and founder of first transcontinental postal service
• His son Bernardo O’Higgins, first president of independent Chile
• John/Juan McKenna and Thomond O’Brien, Chilean-Irish independence fighters; grandson Benjamin McKenna, writer and liberal politician
• George O’Brien, founder of the Chilean navy
• Patricio Lynch, Chilean-Irish naval hero (grandfather from Galway) and ancestor of Che Guevara
• Drill head used in rescue of Atacama miners was made in Ireland (County Clare)
• http://brenspeedie.blogspot.com/2010/11/what-did-irish-ever-do-for-chile.html (a colleague of mine from NUI Galway wrote this)
• http://www.irlandeses.org/0610griffin1.htm
• http://en.wikipedia.org/wiki/Irish_Chilean
Galway Tech Map: European Microcity of the Future
1. Menlo Castle (origin of Menlo Park in Silicon Valley, California)
2. Computer Museum of Ireland (at DERI)
3. NUI Galway (where Stoney, namer of the “electron”, was a prof.)
4. Java’s (Zachary Quinto, AKA Spock, waited on tables here)
5. Claddagh (birthplace of the Claddagh Ring, and Angel from Buffy!)
6. Ignite TTO Business Innovation Centre
7. Galway Technology Centre
8. Innovation in Business Centre (at GMIT)
9. Marine Institute
Also on the map: OnePageCRM, Insight, Ex Ordo
@johnbreslin @technologyvoice @startupgalway #upgalway #gaillimhabu v1.201405271500 bit.ly/galwaytech
NUI Galway introduction
NUI Galway in brief
• Established in 1845:
• One of Ireland’s seven universities
• 105 hectare campus (260 acres)
• 120 links with universities around the world
• 17,300 students:
• 12,500 undergraduates, 3,600 postgraduates, 1,200 other
• 2,541 staff:
• 1,078 academics, 1,015 admin and support, 448 research
• 90,000 alumni in over a hundred countries
Famous alumni
• Alice Perry, first female graduate engineer in the world, 1906
• Michael O’Shaughnessy, a Civil Engineering graduate of the University in the 1880s, was San Francisco Chief Engineer and commissioned the Golden Gate Bridge
• Honorary degrees to Nelson Mandela, Hillary Clinton
• TV and movie star Martin Sheen (The West Wing’s President Bartlet) studied here in 2006/2007
The College of Engineering and Informatics
• Includes the largest School of Engineering in Ireland (finished 2011, 14,000 square metres, €43 million)
• Information Technology, Electrical and Electronic Engineering, Biomedical Engineering, Mechanical Engineering, Civil Engineering, DERI (now Insight)
Insight Centre for Data Analytics
Incorporating DERI (Digital Enterprise Research Institute) at NUI Galway
200 RESEARCHERS · 8 INSTITUTIONS · 30 PARTNERS · €88M FUNDING
1 exabyte = 10^18 bytes = 1,000,000,000,000,000,000 bytes
1 exabyte = 20,000 x all of the printed material in the US Library of Congress. Or all of the words spoken by humans. Ever!
But we now create this much information every 6 hours!
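As a rough sanity check of that scale, the sketch below converts "one exabyte every 6 hours" into a per-second rate. It uses only the two figures from the slide (10^18 bytes and 6 hours); the variable names are illustrative.

```python
# Figures taken from the slide: 1 exabyte = 10^18 bytes, created every 6 hours.
EXABYTE_BYTES = 10**18
INTERVAL_HOURS = 6

seconds_per_interval = INTERVAL_HOURS * 60 * 60        # 21,600 seconds
bytes_per_second = EXABYTE_BYTES / seconds_per_interval
terabytes_per_second = bytes_per_second / 10**12       # decimal terabytes
exabytes_per_day = 24 / INTERVAL_HOURS

print(f"{terabytes_per_second:.1f} TB created per second")
print(f"{exabytes_per_day:.0f} exabytes created per day")
```

In other words, the slide's figure works out to roughly 46 terabytes of new data every second, or 4 exabytes per day.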
Volume. Velocity. Variation. +Veracity.
A PARADIGM SHIFT
[Figure: algorithm and data]
Politics
Epidemiology
Sport
FROM PATTERNS TO PREDICTIONS
From simple inertial sensing…
… to longitudinal gait analysis.
Detecting gait-cycle deviations ⇒ falls prediction
Predicting functional decline.
PARTICIPATORY SENSING
TURNING DATA INTO DECISIONS
Using sensor data to optimise forestry resources during harvesting.
Stem volume prediction, yield management etc.
Remote sensing + autonomous cutter control.
TURNING DATA INTO KNOWLEDGE
Creating a network of knowledge.
Data ⇒ Semantics
Semantics ⇒ Discovery
e.g. discovering links between drugs, genes, and diseases.
MINING PATTERNS FROM REAL-TIME SOURCES
Physical shockwaves travel at 4.8km/sec but knowledge of the earthquake traveled at 70km/sec to Galway.
14:42 Earthquake strikes.
14:43 First tweet from @Bacanalnica in nearby Managua.
14:44 120 secs later the first tweets are posted.
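The two speeds can be related with simple arithmetic. The sketch below assumes that the 120 seconds is the time the news took to reach Galway, which is an interpretation of the slide rather than something it states; the distance is derived from the slide's own figures, not measured.

```python
# Figures from the slide.
SHOCKWAVE_KM_PER_S = 4.8   # speed of the physical shockwave
NEWS_KM_PER_S = 70.0       # quoted speed of "knowledge of the earthquake"
NEWS_DELAY_S = 120         # assumption: seconds until the news reached Galway

# Distance implied by the slide's own numbers (speed x time).
implied_distance_km = NEWS_KM_PER_S * NEWS_DELAY_S

# Time a shockwave would need to cover the same distance.
shockwave_time_s = implied_distance_km / SHOCKWAVE_KM_PER_S
head_start_min = (shockwave_time_s - NEWS_DELAY_S) / 60

print(f"Implied distance: {implied_distance_km:.0f} km")
print(f"News arrives about {head_start_min:.0f} minutes ahead of a shockwave")
```

Under that assumption the implied distance is 8,400 km, and the tweets beat a hypothetical shockwave over the same distance by about 27 minutes.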
Insight Centre
Future Projects Legacy Projects
Peracton (DERI spin out)
SindiceTech (spin out)
Case studies: finding insights in business data
1. Finding expertise and content
2. Holistic energy management
3. AYLIEN text analytics
4. Social analytics for recommendation and communities
1. Expertise and content
• Saffron (saffron.deri.ie) extracts knowledge from text, with business applications in expert finding, community detection, recommender systems, and enterprise search, e.g.:
  1. Ecommerce system with Kennys Bookshop to analyse book descriptions and reviews to extract a fine-grained book topic categorisation for use in book recommendation to customers
  2. EnRG entity relatedness for applications in semantic search (EnRG is built over a large matrix on Wikipedia and using the DBpedia ontology)
Kennys Bookshop
2. Holistic energy management
Managing energy related to:
• Office IT
• Data centres
• Facilities
• Business travel
• Daily commutes
Keep in mind business context:
• Energy expended
• Finances required
• Resource allocation
• Human resources
• Asset management
Making smart buildings smarter (have air conditioning at CO2 peaks)
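One way to picture "air conditioning at CO2 peaks" is a simple rule with hysteresis over a stream of sensor readings. This is only a minimal sketch: the function name and the 1000/800 ppm thresholds are assumptions chosen for illustration, not values from the talk.

```python
from typing import Iterable, List


def air_conditioning_schedule(
    co2_ppm: Iterable[float],
    on_ppm: float = 1000.0,   # hypothetical switch-on threshold
    off_ppm: float = 800.0,   # hypothetical switch-off threshold
) -> List[bool]:
    """Return, for each reading, whether the air conditioning should run."""
    running = False
    schedule = []
    for reading in co2_ppm:
        if reading >= on_ppm:
            running = True        # CO2 peak: switch on
        elif reading <= off_ppm:
            running = False       # air has recovered: switch off
        schedule.append(running)  # between thresholds: keep the current state
    return schedule


readings = [600, 850, 1050, 950, 790, 700]
print(air_conditioning_schedule(readings))
```

Using two thresholds instead of one avoids rapid on/off switching when readings hover around a single limit.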
Energy management software can be unintuitive/difficult to use
We can do better!
More challenges
• Technology and data interoperability: data scattered among different systems, multiple incompatible technologies make it difficult to use
• Interpreting dynamic and static data: sensors, ERP, BMS, assets databases
• Need to proactively identify efficiency opportunities
• Empowering actions and including users in the loop
• Understanding of direct and indirect impacts of activities
• Embedding impacts within business processes
• Engaging users
[Architecture diagram, layers from top to bottom]
Applications: Energy Analysis Model, Complex Events, Situation Awareness Apps, Energy and Sustainability Dashboards, Decision Support Systems
Linked Data
Support Services: Entity Management Service, Data Catalog, Complex Event Processing Engine, Provenance, Search & Query
Sources: Adapter, Adapter, Adapter, Adapter, Adapter
Energy saving applications
Energy awareness
Semantic event processing
Collaborative data management
Cloud of energy data
Linked sensor middleware
Resource Description Framework (RDF)
Semantic sensor networks
Constrained application protocol (CoAP)
Linked energy intelligence
You…
…are part of a team…
…sharing a ‘shell’
3. AYLIEN text analytics
• AYLIEN is based in Dublin, backed by SOS Ventures
• 7 employees, started as B2C, switched to B2B in 2014
• Vision to “extract reality from data” (information retrieval domain)
• Research collaboration with NUI Galway through John (sentiment analysis on large-scale social media)
• http://bresl.in/aylientechcrunch
Text Analysis API (TAA)
● A package of easy-to-use tools for extracting information and insights from any text
● Language detection
● Supported Languages (EN, DE, PT, ES, IT, FR)
● 168 customers
● Academic, ad intelligence and brand protection, sentiment analysis/opinion mining, PR and media, CRMs, education, psychology/interest graph
● Endpoints:
o Extraction
o Classification
o Summarisation
o Concept/entity extraction
o Hashtag suggestion
o Sentiment analysis
How is TAA used?
o Three major methods for deploying text analytics services: API (via the “cloud”), on-premises deployments, other integrations
o TAA is mainly provided using an API/subscription model (monthly) via Mashape or 3scale
o Additional integrations with Google Spreadsheets and other platforms (Telerik, Azuqua)
o In future: on-premises deployment, subscription (yearly), custom solutions (bespoke)
Under the TAA hood…
• Based on Machine Learning (ML) techniques (supervised, unsupervised and semi-supervised)
• Extraction: useful for scraping text, media and metadata from web pages
• Annotate text, media and metadata in a training set
• Extract a set of heuristic rules and use them to extract text, media and metadata
• Summarisation: Extracting n-best key sentences from a document, based on heuristics and a sentence similarity matrix (initially), learning over time
• Classification and document-level sentiment analysis: assigning a label to any piece of text (“sports”, “technology”, “positive”, “negative”)
• Create word vectors from an annotated dataset
• Train a classifier, use it to predict future classes for a new instance
• Similar to a spam filter
• Concept extraction: Find the concepts mentioned in a document and disambiguate them based on contextual clues, e.g. if Apple is mentioned, how do we find out whether it’s the fruit or the company?
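The summarisation step described above (n-best key sentences from a sentence similarity matrix) can be sketched in a few lines. This is not AYLIEN's implementation: the bag-of-words vectors, cosine similarity, the scoring rule, and all function names are assumptions chosen to keep the example small.

```python
import math
import re
from collections import Counter
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter: break after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def summarise(text: str, n: int = 2) -> List[str]:
    """Return the n sentences most similar to the rest of the document."""
    sentences = split_sentences(text)
    vectors = [Counter(re.findall(r"[a-z']+", s.lower())) for s in sentences]
    # Sentence similarity matrix: matrix[i][j] compares sentence i with j.
    matrix = [[cosine(u, v) for v in vectors] for u in vectors]
    # Score each sentence by its total similarity to all other sentences.
    scores = [sum(row) - row[i] for i, row in enumerate(matrix)]
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:n]
    return [sentences[i] for i in sorted(best)]  # keep original order


text = (
    "Galway is a city in Ireland. "
    "Galway has a university in the city. "
    "Bananas are yellow."
)
print(summarise(text, n=2))
```

A production system would replace the fixed scoring heuristic with something learned over time, as the slide notes.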
TAA market (SMEs)
Segments: SMEs, enterprises
Size: “many times the $2bn forecast”
US, UK, Germany, Spain, India
See “Text Analytics 2014: User Perspectives on Solutions and Providers”
Market: natural language processing [related markets: machine learning, text mining]
SME segment: AlchemyAPI, Semantria (Lexalytics), Textalytics, Fluxifi
Enterprise segment: SAS, IBM, Lexalytics
Main target: SMEs
Differentiators: feature-richness, quality, price, progression
http://www.programmableweb.com/news/how-5-natural-language-processing-apis-stack/analysis/2014/07/28
4. Social analytics
• Some applications include cross-domain recommendations, community detection and evolution monitoring
• SemStim (Cisco)
• Whassappi (Volvo Ocean Race)
• SociaLens (ROBUST)
SemStim
A spreading activation based algorithm tuned for RDF-based social semantic networks
How does it work?
• Input: user profile with DBpedia URIs from multiple source domains
• Background knowledge: cross-domain recommendation algorithm using DBpedia as background knowledge
• Cross-domain output: recommendations from target domains
• Example domains: travel destinations, movies
http://advansse.deri.ie/
Advansse
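To illustrate the general idea of spreading activation for cross-domain recommendation, here is a minimal sketch. It is a generic version of the technique, not SemStim itself: the graph, node identifiers, decay factor, and number of steps are all hypothetical, and a real system would traverse DBpedia rather than a hand-written dictionary.

```python
from collections import defaultdict
from typing import Dict, List


def spread_activation(
    graph: Dict[str, List[str]],
    seeds: List[str],
    targets: List[str],
    decay: float = 0.5,   # assumed fraction of energy passed on per hop
    steps: int = 2,       # assumed number of propagation rounds
) -> List[str]:
    """Rank target-domain nodes by activation received from the seed profile."""
    activation: Dict[str, float] = defaultdict(float)
    frontier = {node: 1.0 for node in seeds}
    for _ in range(steps):
        next_frontier: Dict[str, float] = defaultdict(float)
        for node, energy in frontier.items():
            neighbours = graph.get(node, [])
            if not neighbours:
                continue
            share = energy * decay / len(neighbours)  # split energy evenly
            for neighbour in neighbours:
                next_frontier[neighbour] += share
                activation[neighbour] += share
        frontier = next_frontier
    reached = [t for t in targets if t not in seeds and activation[t] > 0]
    return sorted(reached, key=lambda t: activation[t], reverse=True)


# Toy knowledge graph with made-up identifiers (not real DBpedia URIs).
graph = {
    "movie:Amelie": ["place:Paris", "genre:Romance"],
    "movie:Midnight_in_Paris": ["place:Paris"],
    "place:Paris": ["destination:Paris"],
    "genre:Romance": ["destination:Venice"],
}
profile = ["movie:Amelie", "movie:Midnight_in_Paris"]       # source domain
destinations = ["destination:Venice", "destination:Paris"]  # target domain
print(spread_activation(graph, profile, destinations))
```

Here a profile of movies (source domain) activates shared background concepts, which in turn activate travel destinations (target domain); the destination reached by more of the profile's energy ranks first.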
Whassappi: real-time event and topic detection
[Architecture diagram]
Components: Twitter Stream, Filter, Tweets Classifier, Entities Extraction, Analytics, Relational DB, Graph DB, Mobile App
Data: Raw Tweets; Relevant Tweets; Initial Tweets & Communities Data; Tweets, Topics & Communities Relationships; Networks Slices; Ranked Communities, Users and Labels
Signal: Updated Communities Data
Feedback: Users to Follow
Data Flow Types: Streaming Flow, On-demand Flow, System Triggered Flow
SociaLens: insights into enterprise communities
Thanks! Any questions?