Big Data - Beyond the 'Bigness' and the Technology April 26, 2012 Anant Jhingran @jhingran http://blog.apigee.com http://jhingran.typepad.com

Big Data: Beyond the "Bigness" and the Technology (webcast)

Page 1: Big Data: Beyond the "Bigness" and the Technology (webcast)

Big Data - Beyond the 'Bigness' and the Technology

April 26, 2012

Anant Jhingran @jhingran

http://blog.apigee.com

http://jhingran.typepad.com

Page 2: Big Data: Beyond the "Bigness" and the Technology (webcast)

groups.google.com/group/api-craft

Page 3: Big Data: Beyond the "Bigness" and the Technology (webcast)

youtube.com/apigee

Page 4: Big Data: Beyond the "Bigness" and the Technology (webcast)

IRC channel #api-craft on freenode

New!

Page 5: Big Data: Beyond the "Bigness" and the Technology (webcast)

Three themes

The Big Data dialog has focused on the wrong things: bigness and technology. Both emphases are misplaced

Big Data needs to focus on the right new thing: stitching data together from disparate sources

Data APIs need to be front and center of any Big Data dialog, yet they get too little discussion

Page 6: Big Data: Beyond the "Bigness" and the Technology (webcast)

Big Data discussion has focused on the wrong things

Page 7: Big Data: Beyond the "Bigness" and the Technology (webcast)

Wrong thing #1 – focus on technology

Business value

DATA: “THE GOLD”

TECHNOLOGY: “THE MEANS”

Cassandra

HBASE

EC2 . . .

.91

Page 8: Big Data: Beyond the "Bigness" and the Technology (webcast)

Wrong thing #2 – focus on bigness

[Figure: 2 dimensions of complexity. Vertical axis: depth of analysis. Horizontal axis: size of the data (100 TB to 10 PB). Labels on the chart: "Interesting problems" and "Hype". Annotation: Big data nerds, $$$ VC invest, next cool tech, webscale etc.]

Page 9: Big Data: Beyond the "Bigness" and the Technology (webcast)

Big Data needs to focus on the new right thing

Page 10: Big Data: Beyond the "Bigness" and the Technology (webcast)

Circa 2005 – Data controlled within enterprise

YourCompany

Web Page

Store

Data Warehouse

Page 11: Big Data: Beyond the "Bigness" and the Technology (webcast)

2012 – Control shifts to edge of enterprise

YourCompany

Web Page

Store

Data Warehouse

BusinessNetworks

SocialNetworks

Partners

Apps

API

Page 12: Big Data: Beyond the "Bigness" and the Technology (webcast)

Control shifts to edge of enterprise

Big Data needs to become Broad Data

Page 13: Big Data: Beyond the "Bigness" and the Technology (webcast)

[Figure: data volume (vertical axis) in the old world versus the new world. Old world: enterprise data sources. New world: enterprise + complementary sources.]

Page 14: Big Data: Beyond the "Bigness" and the Technology (webcast)

[Figure: signal / noise]

Most of the bigness comes from noise

The noise doesn’t matter

Only the signal matters

Page 15: Big Data: Beyond the "Bigness" and the Technology (webcast)

[Figure: signal / noise]

Increase signal/noise by stitching data sources
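A minimal sketch of what "stitching" could look like in practice, assuming two small made-up datasets that share a zip code field. The records, field names, and the `stitch` helper are all hypothetical illustrations, not anything from the talk:

```python
# Hypothetical enterprise records: purchases tagged with the customer's zip code.
purchases = [
    {"customer": "c1", "zip": "94301", "amount": 120.0},
    {"customer": "c2", "zip": "10001", "amount": 35.5},
    {"customer": "c3", "zip": "94301", "amount": 60.0},
]

# Hypothetical external (complementary) records: demographics keyed by zip code.
demographics = {
    "94301": {"median_income": 150000},
    "10001": {"median_income": 90000},
}


def stitch(records, lookup, key):
    """Attach fields from `lookup` to each record whose `key` value appears there."""
    stitched = []
    for record in records:
        extra = lookup.get(record[key])
        if extra is not None:
            # Merge the enterprise record with the matching external fields.
            stitched.append({**record, **extra})
    return stitched


combined = stitch(purchases, demographics, "zip")
```

Each stitched record now carries context that neither source held on its own, which is the sense in which a join across sources can raise signal without adding much volume.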

Page 16: Big Data: Beyond the "Bigness" and the Technology (webcast)

enterprise

syndicated

✖ Web 1.0 – Crawling . . .

✖ Web 2.0 – AJAX . . .

✔ Web 3.0 – APIs + control of data

enterprise

external

access?

control?

centralized or decentralized process?

Page 17: Big Data: Beyond the "Bigness" and the Technology (webcast)

If we give up the wrong things and take up the right things, what is it that we need to do?

Page 18: Big Data: Beyond the "Bigness" and the Technology (webcast)

Shifting from Big Data to Broad Data

It’s about . . .
• Accessing data that others collect
• Variety
• Striking deals
• Respecting the APIs
• Data stitching and improving S/N ratio
• Depth of analysis

It’s not about . . .
• Crawling
• BIGNESS from any one data source

Page 19: Big Data: Beyond the "Bigness" and the Technology (webcast)

Data APIs are the future

So what kind of Data APIs?

Page 20: Big Data: Beyond the "Bigness" and the Technology (webcast)

Data APIs are the future

Monetizable apps produce & consume data

Data is the lifeblood at edge of enterprise

Need to focus on making data consumption easy

Page 21: Big Data: Beyond the "Bigness" and the Technology (webcast)

The yin and yang of transactions and data

Example APIs

X-APIs
• User management
• Send SMS
• Add movie
• Do trade
• Get credit info

D-APIs
• Browse catalog
• Get weather by Zip code
• Get demographics by region

Page 22: Big Data: Beyond the "Bigness" and the Technology (webcast)

Let’s create an information halo around APIs

http://blog.apigee.com/detail/api_strategy_talk_web_2.0/

See "Amundsen’s Dogs, Information Halos and APIs: The epic story of your API Strategy"

Page 23: Big Data: Beyond the "Bigness" and the Technology (webcast)

Give Data . . .

What are your transactions, and what are your data?

Do you want to be crawled or do you want to control it?

Give Visibility . . .

Analytics and Data go hand in hand…

. . . to both your end developers and your colleagues

Page 24: Big Data: Beyond the "Bigness" and the Technology (webcast)

People are planting “flags” on various data domains by collecting and stitching disparate data together

Weather

Real-estate

Finance

Internet Traffic

Local

Business

Social

Demographic

Purchases

Price

Page 25: Big Data: Beyond the "Bigness" and the Technology (webcast)

To build out a single domain, many data sources have to be accessed and stitched

A natural stitching mechanism could be Linked Data

linkeddata.org
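As a rough illustration of stitching with Linked Data, the sketch below represents each source as (subject, predicate, object) triples and joins them through an `owl:sameAs` link. All `example.org` URIs, vocabulary terms, and values are made-up placeholders, and the `describe` helper is a hypothetical name; it follows only direct sameAs links, not chains of them:

```python
# Triples are (subject, predicate, object). The example.org URIs are placeholders.
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

weather_source = [
    ("http://example.org/weather/city/42", "http://example.org/vocab/forecast", "sunny"),
]
realestate_source = [
    ("http://example.org/homes/region/7", "http://example.org/vocab/medianPrice", "500000"),
]
# A link asserting that both sources describe the same place.
links = [
    ("http://example.org/weather/city/42", SAME_AS, "http://example.org/homes/region/7"),
]


def describe(uri, triples):
    """Collect predicate/object pairs for `uri` and for anything directly linked via sameAs."""
    aliases = {uri}
    for s, p, o in triples:
        if p == SAME_AS and (s == uri or o == uri):
            aliases.update((s, o))
    return {p: o for s, p, o in triples if s in aliases and p != SAME_AS}


merged = describe(
    "http://example.org/weather/city/42",
    weather_source + realestate_source + links,
)
```

The stitched view in `merged` combines the forecast from one source with the median price from the other, without either source needing to know the other's schema.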

Page 26: Big Data: Beyond the "Bigness" and the Technology (webcast)

Once stitched, clean APIs can be provided

[Diagram: several data sources (crawled, bulk loaded, API accessed) → cleansed, stitched → Data API and Analytics]

Page 27: Big Data: Beyond the "Bigness" and the Technology (webcast)

[Diagram: several data sources (crawled, bulk loaded, API accessed) → cleansed, stitched → Data API and Analytics]

Typically Linked Data techniques not used here

Page 28: Big Data: Beyond the "Bigness" and the Technology (webcast)

[Diagram: several data sources (crawled, bulk loaded, API accessed) → cleansed, stitched → Data API and Analytics]

Can Linked Data techniques be used here?

Page 29: Big Data: Beyond the "Bigness" and the Technology (webcast)

Linked Data as the Data API for these domains is not likely to be very common

Why? The interlinking of domains is not as important as the strength of any one domain (at least for now)

Weather

Real-estate

Finance

Internet Traffic

Local

Business

Social

Demographic

Purchases

Price

Page 30: Big Data: Beyond the "Bigness" and the Technology (webcast)

If not linked data APIs, what other Data APIs might become common?

[Diagram: several data sources (crawled, bulk loaded, API accessed) → cleansed, stitched → Data API and Analytics]

Our guess: APIs patterned after relational access
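One way to picture an API "patterned after relational access" is a helper that maps a table, a column list, and equality filters onto a URL. This is only a sketch: the base URL `api.example.com`, the `fields` and `limit` parameter names, and the `relational_url` function are assumptions chosen for illustration, not a convention taken from the talk or from any particular API:

```python
from urllib.parse import urlencode


def relational_url(base, table, columns=None, where=None, limit=None):
    """Build a URL that mimics SELECT columns FROM table WHERE ... LIMIT n."""
    params = {}
    if columns:
        # Column projection, comparable to the SELECT list.
        params["fields"] = ",".join(columns)
    if where:
        # Equality predicates, comparable to a simple WHERE clause.
        params.update(where)
    if limit is not None:
        params["limit"] = limit
    query = urlencode(params)
    return f"{base}/{table}" + (f"?{query}" if query else "")


url = relational_url(
    "http://api.example.com",
    "countries",
    columns=["name", "region"],
    where={"incomeLevel": "LIC"},
    limit=10,
)
```

The resulting URL reads roughly like `SELECT name, region FROM countries WHERE incomeLevel = 'LIC' LIMIT 10`, which is the kind of correspondence a SQL programmer would recognize.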

Page 31: Big Data: Beyond the "Bigness" and the Technology (webcast)

Kinds of Data APIs we are observing

Data

Primary Key Lookup
http://weather.yahooapis.com/forecastrss?w=location

Imposed Hierarchy based traversal over collections
http://api.worldbank.org/incomeLevels/LIC/countries

“Rectangle” {rows, columns} through query parameters
http://api.worldbank.org/countries?per_page=10&incomeLevel=LIC
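The three access patterns above can be sketched against a small in-memory collection. The country records, field names, and function names here are hypothetical stand-ins and do not reproduce the behavior of the Yahoo or World Bank services:

```python
# Hypothetical collection standing in for a cleansed, stitched data set.
COUNTRIES = [
    {"id": "AA", "name": "Country A", "incomeLevel": "LIC"},
    {"id": "BB", "name": "Country B", "incomeLevel": "HIC"},
    {"id": "CC", "name": "Country C", "incomeLevel": "LIC"},
]


def by_key(country_id):
    """Primary key lookup: return the single matching record, or None."""
    return next((c for c in COUNTRIES if c["id"] == country_id), None)


def by_hierarchy(income_level):
    """Hierarchy traversal, in the spirit of /incomeLevels/<level>/countries."""
    return [c for c in COUNTRIES if c["incomeLevel"] == income_level]


def rectangle(columns, per_page=None, **filters):
    """Rows chosen by filters, columns chosen by name, as query parameters would."""
    rows = [c for c in COUNTRIES if all(c.get(k) == v for k, v in filters.items())]
    if per_page is not None:
        rows = rows[:per_page]
    return [{col: row[col] for col in columns} for row in rows]
```

For example, `rectangle(["name"], per_page=1, incomeLevel="LIC")` selects both the rows and the columns in one call, while `by_hierarchy("LIC")` expresses the same filter as a path through a fixed hierarchy.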

Page 32: Big Data: Beyond the "Bigness" and the Technology (webcast)

There are many perspectives on data APIs coming from relational world

http://blog.apigee.com/detail/rest_api_design_for_sql_programmers

http://azgroups.nextslide.com/odata-begins

Page 34: Big Data: Beyond the "Bigness" and the Technology (webcast)

What do we need for Data APIs to take off?

• Practical REST and OData are good starting points

• However, they cannot be available only as vendor-specific implementations

• The Linked Data model cannot be ignored completely

• Let us, as a community, bring together the best of Linked Data and OData thinking

• Let’s continue this dialog: groups.google.com/group/api-craft

Page 35: Big Data: Beyond the "Bigness" and the Technology (webcast)

Wrapping up

The Big Data dialog has focused on the wrong things: bigness and technology. Both emphases are misplaced

Big Data needs to focus on the right new thing: stitching data together from disparate sources

Data APIs need to be front and center of any Big Data dialog, yet they get too little discussion

Page 36: Big Data: Beyond the "Bigness" and the Technology (webcast)

THANK YOU

Questions and ideas to:

@jhingran