Big Data - Where from Where to

Copyright © 2013 Hanmin Jung

Hanmin Jung

Head of the Dept. of Computer Intelligence Research

KISTI

Big Data: Where from? Where to?

Let Me Introduce Myself :-)

• Very Recent Activities on Big Data

  • (National Science and Technology Commission) Member of Big Data Technical Impact Assessment Committee

  • (Korea Communications Commission) Sub-committee Chair of Big Data Forum

  • (Ministry of Knowledge Economy) Technical Secretary of Big Data Program Planning Committee

  • (Ministry of Educational Science and Technology) Member of Big Data Information Strategic Program Expert Committee

  • (National IT Industry Promotion Agency) Lecturer of Big Data Expertise Reinforcement Program

Questions

Where are Big Data from?

Who gathers and consumes the data?

What is the data used for?

Smart Work

http://files.thinkpool.com/files/bbs/2010/07/21/%EC%8A%A4%EB%A7%88%ED%8A%B8%EC%9B%8C%ED%81%AC1.jpg

Cloud Computing

• Service Platform Accelerated by Mobile Devices

http://simpleroot.com/wp-content/uploads/2012/10/Remote-Cloud-Computing.jpg

Cloud Computing – Tatemae (建て前, the stated position) & Honne (本音, the true intent)

• Introducing iCloud

Cloud Computing

• Google Data Center

http://www.youtube.com/watch?v=avP5d16wEp0

Data Sources

Web -> Social -> Thing

“The next Google or Facebook may well be an Internet of Things company.” by R. MacManus (ReadWriteWeb)

Social Data

http://bynoy.files.wordpress.com/2011/08/united-noy-weblife-60-seconds.jpg

Machine Data

T. Baer, “What is Big Data? The Reality for Analytics”, OVUM, 2011.

Call data records

Sensory data

Web log files

Financial Instrument Trade

Internet of Things

K. Escherich, “Internet of Things”, 2011.

Big Data in the World

http://www.ektron.com/billcavablog/Big-Data-Big-Content-Big-Challenges/

Infographics for Big Data

http://thumbnails.visually.netdna-cdn.com/big-data_50291c3b16257.jpg

Google.com Traffic

http://siteanalytics.compete.com/naver.com/

Naver.com Traffic

http://siteanalytics.compete.com/naver.com/

Foreseeable Future

• Google Project Glass

Hype Cycle

Hype Cycle – 2010

Emerging Technologies Hype Cycle 2010

Hype Cycle – 2011

Emerging Technologies Hype Cycle 2011

Hype Cycle – 2012

Emerging Technologies Hype Cycle 2012

Google Insights

http://www.google.com/insights/search/

Bottleneck in Data Ecosystem

http://quizzicaleyebrow.files.wordpress.com/2011/03/pict0044.jpg

Big Data Ecosystem

http://imexresearch.com/Newsletter_HTML/bd2.png

Big Data Ecosystem

• New Approaches Required for

  • Persistence

  • Indexing

  • Caching and query optimization

  • Processing

  • Structure

  • Query language

  • Compression

T. Baer, “What is Big Data? The Reality for Analytics”, OVUM, 2011.

Insights for Search

http://www.google.com/insights/search/

Mobile Phone

• Worldwide Market Share

  • Worldwide mobile device sales to end users in 2008 ~ 2012

Each cell: market share (%), unit sales (M. units). A dash means no figure was listed for that quarter.

Company        4Q2012        3Q2011        3Q2010        3Q2009        3Q2008
Nokia          17.9, 86.3    27.1, 106.6   31.6, 110.4   37.8, 108.5   38.6, 117.9
Samsung        23.0, 111.2   22.3, 87.8    20.5, 71.4    21.0, 60.2    17.0, 52.0
Apple          9.9, 47.8     4.3, 17.1     4.0, 14.1     -             -
LG             -             5.4, 21.1     8.1, 28.4     11.0, 31.6    7.5, 23.0
ZTE            3.6, 17.6     4.9, 19.1     3.5, 12.1     -             -
Huawei         3.3, 15.8     -             -             -             -
Sony Ericsson  -             -             -             4.9, 14.1     8.4, 25.7
Motorola       -             -             -             4.7, 13.6     8.3, 25.4
Others         42.3, 203.8   36.1, 142     32.2, 112.5   20.6, 59.1    20.1, 61.5
Total          482.5         393.7         348.9         287.1         305.4

Gartner, IDC Worldwide Mobile Phone Tracker
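Each share figure in the table is simply a vendor's unit sales divided by the quarterly total. The following is a minimal sketch of that arithmetic in Python, using the Nokia and Samsung 4Q2012 figures listed above; the function name `share_percent` is an illustrative assumption, not something from the original slides.

```python
def share_percent(units_millions: float, total_millions: float) -> float:
    """Return market share in percent, rounded to one decimal place."""
    if total_millions <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * units_millions / total_millions, 1)


# 4Q2012 unit sales (millions of units) as listed on the slide.
TOTAL_4Q2012 = 482.5
units_4q2012 = {"Nokia": 86.3, "Samsung": 111.2}

for company, units in units_4q2012.items():
    print(company, share_percent(units, TOTAL_4Q2012))
# Nokia 17.9
# Samsung 23.0
```

Both results match the percentages printed on the slide. Small differences in other cells can come from rounding in the source data.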

CDC Influenza Summary

http://www.cdc.gov/flu/weekly/usmap.htm

Google Flu Trends

J. Ginsberg, “Detecting influenza epidemics using search engine query data”
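The cited paper relates the fraction of physician visits for influenza-like illness (ILI) to the fraction of flu-related search queries through a linear fit on the log-odds of both quantities. The sketch below illustrates that idea only. The weekly values are made up for demonstration (they are not CDC or Google data), and the helper names `fit_line` and `estimate_ili` are assumptions introduced here.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up weekly values, for illustration only.
query_fraction = [0.002, 0.004, 0.008, 0.012, 0.006]  # flu-related share of queries
ili_fraction = [0.010, 0.018, 0.035, 0.050, 0.027]    # ILI share of physician visits

slope, intercept = fit_line(
    [logit(q) for q in query_fraction],
    [logit(p) for p in ili_fraction],
)


def estimate_ili(q: float) -> float:
    """Estimate the ILI visit fraction from a query fraction."""
    return inv_logit(intercept + slope * logit(q))


print(round(estimate_ili(0.005), 4))
```

Working in log-odds keeps every estimate between 0 and 1, which a plain linear fit on the raw fractions would not guarantee.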

Voice Search Evaluation

http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/40491.pdf

Causes of Death

http://image.guardian.co.uk/sys-files/Guardian/documents/2011/10/28/Factfile_deaths_2_2011.pdf

IBM Watson

http://powet.tv/powetblog/wp-content/uploads/2011/02/watson_the_computer_beats_ken_jennings_and_brad_rutter_at_jeopardy_full.jpg

Value Pyramid

Search

Clustering

Extracting

Decision Support

Forecasting

Scenario Planning

Advising

Modified from D. Bousfield & P. Fooladi, “STM Information: 2009 Final Market Size and Share Report”, 2010.

InSciTe Advanced (2011)

InSciTe Adaptive (2012)

OntoFrame (2005~2009)

InSciTe Advanced (2010)

Big Data & Decision Making

http://lithosphere.lithium.com/t5/Lithium-s-View/Big-Data-Analytics-Reducing-Zettabytes-of-Data-Down-to-a-Few/ba-p/36378

• Reducing Zettabytes of Data Down to a Few Bits

Data help us make better decisions.

The primary function of analytics is to support decision making.

The challenge of big data analytics is to reduce a lot of data down to a few bits.
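One way to read "a few bits" is that a decision among N options carries at most log2(N) bits, however much data went into it. The sketch below makes that arithmetic concrete. It is an illustration under that assumption, and the names `decision_bits` and `reduction_factor` are hypothetical.

```python
ZETTABYTE_IN_BITS = 8 * 10**21  # 1 ZB = 10^21 bytes (decimal SI prefix)


def decision_bits(num_options: int) -> int:
    """Smallest number of bits that can label one choice among num_options."""
    if num_options < 1:
        raise ValueError("need at least one option")
    return (num_options - 1).bit_length()


def reduction_factor(input_bits: int, num_options: int) -> float:
    """Ratio of input size to decision size (decision counted as >= 1 bit)."""
    return input_bits / max(decision_bits(num_options), 1)


print(decision_bits(2))   # 1: a yes/no decision
print(decision_bits(8))   # 3: one choice among eight options
print(f"{reduction_factor(ZETTABYTE_IN_BITS, 2):.1e}")  # 8.0e+21
```

Under this reading, a yes/no decision drawn from one zettabyte of input is a reduction by a factor of about 8 × 10^21.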

Strategic Foresight

R. Rohrbeck, H. Arnold, and J. Heuer, “Strategic Foresight in Multimedia Enterprises”, 2007.

Quantitative Analytics

TI Projects

• FUSE

  • Funded by IARPA (early 2011 ~ early 2016)

  • Kick-off meeting in summer 2011

  • Foresight and Understanding from Scientific Exposition Program

    • Seeks to develop automated methods that aid in the systematic, continuous, and comprehensive assessment of technical emergence using information found in the published scientific, technical, and patent literature

  • Partners

    • BAE Systems, Brandeis Univ., New York Univ., 1790 Analytics, …

TI Projects

• FUSE

TI Projects

• CUBIST

  • Funded by the European Commission (late 2010 ~ late 2013)

  • 1st CUBIST workshop in July 2011

  • Combining and Uniting Business Intelligence with Semantic Technologies Program

    • Aims to develop new ways not only to interrogate the massive volume of data on the Internet, but also to analyze the different formats it exists in, such as blogs, wikis, and video

  • Partners

    • SAP, Ontotext, Sheffield Hallam Univ., …

TI Projects

• CUBIST

TI Projects

• Common Technologies

  • Semantic technologies

    • Ontology, reasoning, URI scheme

  • Analytics model

    • BYOM (e.g. technology opportunity discovery model, technology evolution model, formal concept analysis model)

  • Information extraction (InSciTe, FUSE)

    • Named entities and events/relations in textual documents

Our Vision & Architecture

InSciTe Advanced (2011)

InSciTe Adaptive (2012)

Data Fact Sheet

• InSciTe Adaptive (2012)

  • Articles: 22.6 million (9.8 million for papers, 7.6 million for patents, 5.3 million for Web data)

  • All technical areas (2001~2011)

  • Named entities: 1.9 million

  • Authority dictionary: 1.5 million entries

  • LOD data: 290 GB (being connected)

Supporting Decision Making

http://4.bp.blogspot.com/-Pf1hkccZZh4/TWDJahBpL2I/AAAAAAAAASU/JHLpXi8d9AQ/s640/meetings.jpg

Data Scientist

http://philanthropy.com/blogs/innovation/matching-data-scientists-and-nonprofits/778

Evidence-based Decision Making

• Advantages

  • Ensures that policies are responding to the real needs of the community

  • Highlights the urgency of an issue or problem which requires immediate attention

  • Enables information sharing amongst other members of the public sector

  • Reduces government expenditure which may otherwise be directed into ineffective policies or programs

  • Produces an acceptable return on the financial investment that is allocated toward public programs

  • Ensures that decisions are made in a way that is consistent with our democratic and political processes, which are characterized by transparency and accountability

http://www.abs.gov.au/ausstats/abs@.nsf/lookup/1500.0chapter32010

InSciTe Project

http://semantics.kisti.re.kr

Thank you

jhm@kisti.re.kr

“A lot of times, people don’t know what they want until you show it to them.”

by Steve Jobs

“Many people won’t be convinced until they’ve seen it for themselves.”

by Jakob Nielsen
