www.Objectivity.com Welcome! Webinar: Big Data – NoSQL Technology and Real-time, Accurate Predictive Analytics © Objectivity Inc 2013

NoSQL Technology and Real-time, Accurate Predictive Analytics

DESCRIPTION

Big Data: NoSQL Technology and Real-time, Accurate Predictive Analytics

Enjoy this insightful webinar moderated by Matt Aslett, Research Director at 451 Group. It begins with a brief overview of Objectivity, Inc. and its products: Objectivity/DB, a world-class object database, and InfiniteGraph, the enterprise-proven, scalable and distributed graph database with deployments across multiple major verticals including government, telecom, finance, security, and social networking. Learn how Georgetown University is taking advantage of Objectivity’s products to develop one of the most interconnected databases today, examining information from all types of sources worldwide in real time.

J.C. Smart, Director, Global Insight Laboratory, Georgetown University: Coming Soon

Leon Guzenda, Founder, Objectivity: A founding member of Objectivity, Inc. in 1988, one of the original architects of Objectivity/DB, and Chief Technology Officer. He now consults with the company and works with Objectivity’s Big Data and Analytics customers and partners to deploy Objectivity/DB and InfiniteGraph, a high-performance, scalable graph database.

Matt Aslett, Research Director, 451 Group: As Research Director for data management and analytics within 451 Research’s Information Management practice, Matt has overall responsibility for the coverage of operational and analytic databases, data integration, data quality, and business intelligence. Matt’s own primary area of focus is on relational and non-relational databases, data warehousing, data caching, and Hadoop. Matt is also an expert in open source software and regularly contributes to 451 Research’s open source-related research.

Citation preview

Page 1: NoSQL Technology and Real-time, Accurate Predictive Analytics

www.Objectivity.com

© Objectivity Inc 2013

Welcome!

Webinar: Big Data – NoSQL Technology and Real-time,

Accurate Predictive Analytics

Page 2: NoSQL Technology and Real-time, Accurate Predictive Analytics

Agenda

Market Overview
• Presented by Matt Aslett, Research Director at 451 Group

Big Data Use Case
• Presented by J.C. Smart, Director Global Insight Laboratory at Georgetown University

Q&A
• Presented by:
  • Matt Aslett, Research Director at 451 Group
  • J.C. Smart, Director Global Insight Laboratory at Georgetown University
  • Leon Guzenda, Founder at Objectivity, Inc.

Page 3: NoSQL Technology and Real-time, Accurate Predictive Analytics

© 2013 by The 451 Group. All rights reserved

Matthew Aslett
• Research Director, Data Management and Analytics
• [email protected]
• www.twitter.com/maslett

Responsible for data management and analytics research agenda

Focus on operational and analytic databases, including NoSQL, NewSQL, and Hadoop

With 451 Research since 2007

Page 4: NoSQL Technology and Real-time, Accurate Predictive Analytics

Company Overview

One company with 3 operating divisions

Syndicated research, advisory, professional services, datacenter certification, and events

Global focus

200+ staff

1,300+ client organizations: enterprises, vendors, service providers, and investment firms

Organic growth and growth through acquisition

Page 5: NoSQL Technology and Real-time, Accurate Predictive Analytics

Unique combination of research, analysis & data

Emerging tech market segment focus

Daily qualitative & quantitative insight

Analyst advisory & Go-to-market support

Global events

Page 6: NoSQL Technology and Real-time, Accurate Predictive Analytics

What has driven the development and adoption of NoSQL?

NoSQL, NewSQL and Beyond
• Assessing the drivers behind the development and adoption of NoSQL and NewSQL databases, as well as data grid/caching technologies
• Released April 2011
• Role of open source in driving innovation
• [email protected]

MySQL vs NoSQL and NewSQL
• Released May 2012

Next-generation Operational Databases
• Released July 2013

Page 7: NoSQL Technology and Real-time, Accurate Predictive Analytics

SPRAINED RELATIONAL DATABASES

Photo credit: Foxtongue on Flickr http://www.flickr.com/photos/foxtongue/4844016087/

Page 8: NoSQL Technology and Real-time, Accurate Predictive Analytics

Database SPRAIN

The traditional relational database has been stretched beyond its normal capacity by the needs of high-volume, highly distributed or highly complex applications.

There are workarounds – such as DIY sharding – but manual, homegrown efforts can result in database administrators being stretched beyond their normal capacity in terms of managing complexity.

Scalability
Performance
Relaxed consistency
Agility
Intricacy
Necessity

Increased willingness to look towards emerging alternatives
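To make the "DIY sharding" workaround concrete, here is a minimal sketch of hash-based shard routing of the kind an application team might write by hand. The shard names, key format, and helper functions are hypothetical and are not taken from the webinar. The second helper illustrates one source of the management burden mentioned above: with this naive modulo scheme, adding a shard changes where many existing keys belong.

```python
import hashlib

# Hypothetical shard names; a real deployment would map these to database servers.
SHARDS = ["db-0", "db-1", "db-2"]


def shard_for(key, shards=SHARDS):
    """Route a key to a shard by hashing it and taking the result modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def moved_keys(keys, old_shards, new_shards):
    """List the keys whose shard assignment changes when the shard list changes."""
    return [k for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)]


if __name__ == "__main__":
    keys = [f"customer:{i}" for i in range(1000)]
    moved = moved_keys(keys, SHARDS, SHARDS + ["db-3"])
    print(f"{len(moved)} of {len(keys)} keys would need to be migrated")
```

Every key in the moved list would have to be copied between servers by homegrown tooling, which is the kind of operational complexity the slide attributes to manual sharding.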

Page 9: NoSQL Technology and Real-time, Accurate Predictive Analytics

Necessity is the mother of NoSQL

Hadoop and NoSQL innovation did not come from existing relational database and storage suppliers

It came from Google, Amazon, Facebook, Yahoo, LinkedIn and open source communities…

This has significantly altered the relationship between customer and vendor, and changed the database landscape enormously

And also generated a new breed of database vendors and database products

“We couldn’t bet the company on other companies building the answer for us.”

– Werner Vogels, Amazon CTO

Page 10: NoSQL Technology and Real-time, Accurate Predictive Analytics

The NoSQL database landscape

Wide-column stores

Data is mapped by a row key, column key and time stamp.

Key Value Stores

Store keys and associated values.

Graph databases

Store data and the relationships between data.

Document stores

Store all data related to a specific key as a single document.

DATA MODEL COMPLEXITY

Page 11: NoSQL Technology and Real-time, Accurate Predictive Analytics

The NoSQL database landscape

Wide-column stores

Data is mapped by a row key, column key and time stamp.

Key Value Stores

Store keys and associated values.

Graph databases

Store data and the relationships between data.

Document stores

Store all data related to a specific key as a single document.

Multi-model databases

Support a combination of the various individual NoSQL data models.

DATA MODEL COMPLEXITY
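The data models listed above can be contrasted with a small sketch. The structures below are plain Python stand-ins with made-up keys and values, not the API of any particular NoSQL product. They are only meant to show how the shape of the data grows more complex from key-value to graph.

```python
# Hypothetical in-memory stand-ins for each data model; no real database API is used.

# Key-value store: an opaque value looked up by key.
key_value = {"user:42": b"serialized-profile-bytes"}

# Document store: all data related to a key held as one structured document.
documents = {
    "user:42": {"name": "Ada", "accounts": [{"type": "checking"}, {"type": "loan"}]}
}

# Wide-column store: a cell is addressed by (row key, column key, time stamp).
wide_column = {
    ("user:42", "profile:name", 1375315200): "Ada",
    ("user:42", "profile:name", 1377993600): "Ada L.",
}

# Graph store: nodes with properties, plus edges recording relationships.
nodes = {"user:42": {"name": "Ada"}, "acct:7": {"type": "checking"}}
edges = [("user:42", "HOLDS", "acct:7")]


def latest_cell(table, row_key, column_key):
    """Return the value with the newest time stamp for one (row, column) pair."""
    versions = [
        (ts, value)
        for (row, column, ts), value in table.items()
        if row == row_key and column == column_key
    ]
    return max(versions)[1] if versions else None


def neighbours(edge_list, node_id):
    """Return (relationship, target) pairs for the edges leaving node_id."""
    return [(rel, dst) for src, rel, dst in edge_list if src == node_id]
```

A multi-model database, in these terms, would expose more than one of these shapes over the same stored data.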

Page 12: NoSQL Technology and Real-time, Accurate Predictive Analytics

The NoSQL database landscape

Graph databases not only store data in a collection of key-value pairs, known as nodes and properties, but also store the relationships – or edges – that connect nodes to other nodes, or nodes to properties.

Users can navigate – or traverse – the resulting graph by nodes, properties or edges to identify and analyze relationships between nodes and properties.

This is inherently more flexible than traditional approaches that would require cross-table joins in relational databases.

Graph databases

Store data and the relationships between data.
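As an illustration of navigating a graph by its edges, the sketch below stores nodes with properties and labelled edges, then uses a breadth-first traversal to find how two nodes are connected. The node names and relationship labels are invented for the example, and this is plain Python rather than the query interface of InfiniteGraph or any other product.

```python
from collections import deque

# Hypothetical example data: nodes carry properties, edges carry a relationship label.
nodes = {
    "alice": {"kind": "person"},
    "bob": {"kind": "person"},
    "carol": {"kind": "person"},
    "acme": {"kind": "company"},
}
edges = [
    ("alice", "KNOWS", "bob"),
    ("bob", "KNOWS", "carol"),
    ("carol", "WORKS_AT", "acme"),
]


def build_adjacency(edge_list):
    """Index edges by source node so each hop is a direct lookup."""
    adjacency = {}
    for src, rel, dst in edge_list:
        adjacency.setdefault(src, []).append((rel, dst))
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first traversal returning the node sequence from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _rel, nxt in adjacency.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(shortest_path(build_adjacency(edges), "alice", "acme"))
```

In a relational schema the same question would typically need one self-join per hop, with the number of hops fixed in advance. Here the traversal simply follows edges until it reaches the goal.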

Page 13: NoSQL Technology and Real-time, Accurate Predictive Analytics

The NoSQL database landscape

Graph databases are more than just a new way of storing data

Graph databases enable analysis of not just individual or aggregate data, but also the relationships between data

Graph databases potentially provide new opportunities for generating business intelligence by highlighting new patterns in data

Graph databases

Store data and the relationships between data.

Page 14: NoSQL Technology and Real-time, Accurate Predictive Analytics

Graph analytics

The rise of graph databases is closely linked to the rise of social networking

It could be argued that the most valuable assets that Facebook, Twitter and LinkedIn own are the graphs that represent the relationships between their users and their users’ interests

However, the roots of graph analytics can be traced back much further, all the way to Leonhard Euler’s Seven Bridges of Königsberg, published in 1736

Graph databases

Store data and the relationships between data.

Page 15: NoSQL Technology and Real-time, Accurate Predictive Analytics

Seven Bridges of Königsberg (now Kaliningrad)

Find a route crossing each bridge once, and only once
• Euler proved there was no solution

Source: Wikipedia http://en.wikipedia.org/wiki/File:Konigsberg_bridges.png
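Euler's result can be checked with a few lines of code. The sketch below models the four land masses as nodes and the seven bridges as edges, then applies the degree rule: in a connected graph, a walk that uses every edge exactly once exists only if zero or two nodes have an odd number of edges. The labels A to D are illustrative names chosen for this example (A is the central island, B and C are the river banks, D is the eastern land mass), and the function assumes the graph is connected rather than checking it.

```python
from collections import Counter

# The seven bridges as edges between the four land masses (labels are illustrative).
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_nodes(edge_list):
    """Return the nodes touched by an odd number of edges, in sorted order."""
    degree = Counter()
    for u, v in edge_list:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)


def has_euler_walk(edge_list):
    """For a connected graph: True if a walk using every edge exactly once can exist."""
    return len(odd_degree_nodes(edge_list)) in (0, 2)


if __name__ == "__main__":
    print(odd_degree_nodes(BRIDGES))
    print(has_euler_walk(BRIDGES))
```

All four land masses have odd degree (5, 3, 3 and 3), so the condition fails and no such route exists.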

Page 16: NoSQL Technology and Real-time, Accurate Predictive Analytics

Seven Bridges of Königsberg (now Kaliningrad)

Relevance today:
• Google uses graph theory to find the most efficient routes for Street View cars to capture images for Google Maps

Page 17: NoSQL Technology and Real-time, Accurate Predictive Analytics

Other applications

Less obvious applications include customer management
• E.g. Financial services firm with multiple business units

[Diagram: organization chart with the boxes PARENT CO, BANKING, LOAN, CHECKING, CREDIT CARD, INSURANCE, PENSION, HOUSE INSURANCE, CAR INSURANCE]

Page 18: NoSQL Technology and Real-time, Accurate Predictive Analytics

Other applications

Less obvious applications include customer management
• E.g. Financial services firm with multiple business units
• What happens when an individual has multiple customer relationships?

[Diagram: organization chart with the boxes PARENT CO, BANKING, LOAN, CHECKING, CREDIT CARD, INSURANCE, PENSION, HOUSE INSURANCE, CAR INSURANCE]

Page 19: NoSQL Technology and Real-time, Accurate Predictive Analytics

Other applications

Less obvious applications include customer management
• E.g. Financial services firm with multiple business units
• What happens when an individual has multiple customer relationships?
• Graph analysis to identify multiple services related to an individual

[Diagram: organization chart with the boxes PARENT CO, BANKING, LOAN, CHECKING, CREDIT CARD, INSURANCE, PENSION, HOUSE INSURANCE, CAR INSURANCE]

Page 20: NoSQL Technology and Real-time, Accurate Predictive Analytics

Other applications

Less obvious applications include customer management
• E.g. Financial services firm with multiple business units
• What happens when an individual has multiple customer relationships?
• Graph analysis to identify multiple services related to an individual
• And provide a customer-centric relationship perspective

[Diagram: CUSTOMER linked to PENSION, LOAN, CHECKING, HOUSE INSURANCE]
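The shift from a business-unit view to a customer-centric view can be sketched as follows. Each record is treated as an edge between a customer and a product, and the edges are regrouped around the customer. The record layout and customer names are hypothetical, and the sketch assumes the same individual is already identified consistently across business units, which in practice is a separate matching problem.

```python
from collections import defaultdict

# Hypothetical records as each business unit might hold them: (unit, product, customer).
unit_records = [
    ("BANKING", "CHECKING", "J. Doe"),
    ("BANKING", "LOAN", "J. Doe"),
    ("INSURANCE", "HOUSE INSURANCE", "J. Doe"),
    ("PENSION", "PENSION", "J. Doe"),
    ("BANKING", "CREDIT CARD", "R. Roe"),
]


def customer_view(records):
    """Build customer -> set of products by treating each record as a customer-product edge."""
    view = defaultdict(set)
    for _unit, product, customer in records:
        view[customer].add(product)
    return dict(view)


def multi_relationship_customers(records, minimum=2):
    """Return the customers linked to at least `minimum` distinct products."""
    view = customer_view(records)
    return sorted(c for c, products in view.items() if len(products) >= minimum)


if __name__ == "__main__":
    print(customer_view(unit_records)["J. Doe"])
    print(multi_relationship_customers(unit_records))
```

The regrouped view corresponds to the final diagram: one customer node at the center, with an edge to each service that customer holds, regardless of which business unit sells it.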

Page 21: NoSQL Technology and Real-time, Accurate Predictive Analytics

Exploratory analysis/discovery

While BI involves analyzing data for answers to existing questions, exploratory analytics/discovery involves exploring patterns in data to prompt new questions

This search for patterns requires a platform that offers more flexibility than the schema-on-write approach of the EDW and traditional analytics
• Statistical analytics
• Predictive analytics
• Machine learning

The search for patterns also lends itself to analyzing not just data, but relationships between data
• Graph analysis

Page 22: NoSQL Technology and Real-time, Accurate Predictive Analytics

Conclusion

NoSQL development was driven by the need for new approaches to scalability, performance, consistency, agility and intricacy

Initiated by Web startups, it has generated a new breed of database vendors and database products

Graph databases enable analysis of not just individual or aggregate data, but also the relationships between data

While the rise of graph databases is closely linked to the rise of social networking, use-cases include anything that involves relationships between entities

Graph databases are expanding the market for analytics

Page 23: NoSQL Technology and Real-time, Accurate Predictive Analytics

Questions? [email protected] @maslett

Page 24: NoSQL Technology and Real-time, Accurate Predictive Analytics

Big Data Use Case: Georgetown University

Page 25: NoSQL Technology and Real-time, Accurate Predictive Analytics

J. C. Smart, Ph.D.

Georgetown University

August 2013

Global Insight

Page 26: NoSQL Technology and Real-time, Accurate Predictive Analytics

The world is an important place… and it has a few problems

7 billion people, 40,000 cities, 5 billion cell phones, 800 million vehicles, 12 million miles of paved roads, 50,000 airports, ...

Page 27: NoSQL Technology and Real-time, Accurate Predictive Analytics

The world is a complex system of interdependent complex systems

Climate Population Political Energy

Social Poverty Transportation Trade

Communications Terrorism Crime Health

Page 28: NoSQL Technology and Real-time, Accurate Predictive Analytics

There is an enormous diversity of topics, scales, fidelity, time, duration, …

Geospatial, cyberspatial, real-time, historical, predictive, hypothetical, virtual, on and on….

Page 29: NoSQL Technology and Real-time, Accurate Predictive Analytics

Data exists in many different forms….

Real-time Feeds Applications Databases Spreadsheets

Files Photos Audio Sensors

Websites Models Systems Plans/Maps

Page 30: NoSQL Technology and Real-time, Accurate Predictive Analytics

The “High-Yield” Knowledge Phenomena

[Chart. Axes: Knowledge Density (#Related Facts / Domain); Analytic Potential / Analytic Yield. Labels: Low-Yield Potential, High-Yield Potential, Information Inferiority, Information Superiority, “Anything, Anytime, Anywhere”, “Some things, Some of the time, Somewhere”, Intelligence Starvation, Intelligence Saturation, Knowledge Gap, “Critical Mass”]

Page 31: NoSQL Technology and Real-time, Accurate Predictive Analytics

Page 32: NoSQL Technology and Real-time, Accurate Predictive Analytics

Why is “connecting-the-dots” so hard?

• Plumbing: Massive logistics problem to integrate thousands of government/non-government data systems at scale

Different standards, models, security, infrastructure, procedures, policies, networks, access, compartments, applications, tools, protocols, etc. … all at immense scale!

• Protection: Large-scale integration of data resources increases cyber security risks

Prevention of adversary exploitation of strategic national assets.

• Patterns: Lack of analytic algorithm techniques to automatically detect data patterns and alert

Transition from “analytic dumpster diving” to early-warning indication and real-time notification

• Privacy: Significant tension between security and liberty

Who trusts the “watchers”? Who watches the watchers?

Page 33: NoSQL Technology and Real-time, Accurate Predictive Analytics

The FOUR-Color Framework: Overview

Page 34: NoSQL Technology and Real-time, Accurate Predictive Analytics

[Diagram labels: Black Layer; Analytic Knowledge Space; Analytic (repeated); Analytic Engine (×4); API (×4)]

Page 35: NoSQL Technology and Real-time, Accurate Predictive Analytics

Global insight is now possible!

• Techniques derived from innovations at LLNL, DoD, Raytheon, Georgetown, [many others] – enabled by HPC

• Extremely powerful, very effective, not for the timid

• Represents global systems as trillions of interacting objects

• Scaling, privacy, and protection achieved through a unique data-to-information transformation (overlay) technique

Page 36: NoSQL Technology and Real-time, Accurate Predictive Analytics

Page 37: NoSQL Technology and Real-time, Accurate Predictive Analytics

Q&A

A copy of the webinar, including the Q&A, will be available online at www.Objectivity.com.

A follow-up email with answers to any questions not addressed live will be sent after the webinar.

Thank you for joining us!