Live Webinar - Sep 24, 2014 Industry Experts Examine the State of Databases

Industry experts webinar slides (final v1.0)

DESCRIPTION

Slides from the live webinar: Industry Experts Examine the State of Databases. 451 Research analyst Matt Aslett discusses the state of the next-generation database market, the various categories of DBMS on his database landscape map, and how to get a leg up on competitors with the best use cases for each category. Aslett and NuoDB CTO Seth Proctor examine the growing “back to SQL” trend and what each of them believes lies ahead in the data management marketplace.

Page 1: Industry experts webinar slides (final   v1.0)

Live Webinar - Sep 24, 2014

Industry Experts Examine the State of Databases

Page 2: Industry experts webinar slides (final   v1.0)

Speakers

Matt Aslett, Research Director

The 451 Group

Seth Proctor, CTO

NuoDB, Inc.

Page 3: Industry experts webinar slides (final   v1.0)

© 2014 by The 451 Group. All rights reserved

Company Overview

One company with 3 operating divisions

Syndicated research, advisory, professional services, datacenter certification, and events

Global focus

270+ staff 1,500+ client organizations:

enterprises, vendors, service providers, and investment firms

Organic growth and growth through acquisition

Page 4: Industry experts webinar slides (final   v1.0)

[Figure: 451 Research: Data Platforms Landscape Map – September 2014. The map places vendors and products in a relational zone, a non-relational zone, and a grid/cache zone, with a key distinguishing general purpose from specialist analytic. Labeled categories include key value stores, document, graph, BigTables, Hadoop, MySQL ecosystem, advanced clustering/sharding, NewSQL databases, -as-a-Service, appliances, in-memory, data caching, data grid, search (towards e-discovery, towards enterprise search, towards SIEM), and stream processing.]

© 2014 by 451 Research LLC. All rights reserved

Page 5: Industry experts webinar slides (final   v1.0)

[Figure: 451 Research: Data Platforms Landscape Map – 5ish years ago. The same relational, non-relational, and grid/cache zones, with noticeably fewer entries. Labeled categories include data grid/cache, search (towards e-discovery, towards enterprise search, towards SIEM), in-memory, and stream processing.]

Page 6: Industry experts webinar slides (final   v1.0)

[Figure: 451 Research: Data Platforms Landscape Map – September 2014, shown again with four labels overlaid: NoSQL, Hadoop, NewSQL, DBaaS.]

Page 7: Industry experts webinar slides (final   v1.0)

Drivers for change

Developers: Agile, REST, JSON, Schemaless, Schema-on-read, Flexible

Architecture: Cloud, Scalable, Elastic, Virtual, Distributed, Flexible

Applications: Web, Social, Mobile, Always-on, Interactive, Local, Global

Page 8: Industry experts webinar slides (final   v1.0)

Drivers for change influence each other

Developers: Agile, REST, JSON, Schemaless, Schema-on-read, Flexible

Architecture: Cloud, Scalable, Elastic, Virtual, Distributed, Flexible

Applications: Web, Social, Mobile, Always-on, Interactive, Local, Global

New applications require distributed architecture

Distributed architecture encourages new development approaches

New development approaches demand new architecture

Distributed architecture enables new applications

New app requirements demand new development approaches

New dev approaches enable new lightweight apps

Page 9: Industry experts webinar slides (final   v1.0)

Drivers for change: applications

Applications: Web, Social, Mobile, Always-on, Interactive, Local, Global

Social – increased interactivity generates data

Mobile – different form factors and access methods

Global – applications need to be immediately available everywhere

Local – need to deliver localized content

Social, mobile, global, local all have implications for data connectivity

Page 10: Industry experts webinar slides (final   v1.0)

Drivers for change: development

Developers: Agile, REST, JSON, Schemaless, Schema-on-read, Flexible

Rapid development and continuous delivery are inconsistent with traditional database management processes

Need to unite application development and database management people/processes to achieve common goals

DevOps movement growing apace

Developers increasingly drive data management and database selection

Page 11: Industry experts webinar slides (final   v1.0)

Drivers for change: architecture

Architecture: Cloud, Scalable, Elastic, Virtual, Distributed, Flexible

Transitioning from a traditional database to a distributed database

Interactive applications mean the pace of user growth and the multiplicity of data types are too great for traditional relational databases to absorb efficiently.

Scalability, Performance, Relaxed consistency, Agility, Intricacy, Necessity

Page 12: Industry experts webinar slides (final   v1.0)

Drivers for change: architecture

Transitioning from a traditional database to a distributed database

Transitioning from on-premises computing to the cloud

Architecture: Cloud, Elastic, Virtual, Distributed, Flexible, Scalable

Page 14: Industry experts webinar slides (final   v1.0)

Drivers for change: architecture

Transitioning from a traditional database to a distributed database

Transitioning from on-premises computing to the cloud

Amazon’s top enterprise use cases are (in order of popularity, starting with the most popular):

• Development and test
• New workloads
• Supplement existing workloads with cloud
• Migration of existing workloads to the cloud
• Datacenter migration
• All-in cloud

Top three adoption drivers for public cloud have no impact on the existing database landscape

Architecture: Cloud, Elastic, Virtual, Distributed, Flexible, Scalable

Page 15: Industry experts webinar slides (final   v1.0)

Drivers for change: combined effect

Architecture: Cloud, Scalable, Elastic, Virtual, Distributed, Flexible

Applications: Web, Social, Mobile, Always-on, Interactive, Local, Global

Developers: Agile, REST, JSON, Schemaless, Schema-on-read, Flexible

Page 16: Industry experts webinar slides (final   v1.0)

Drivers for change: combined effect

Developers, Applications, Architecture

Page 17: Industry experts webinar slides (final   v1.0)

Drivers for change: combined effect

Developers, Applications, Architecture

NoSQL, NewSQL, Hadoop, DBaaS

Page 18: Industry experts webinar slides (final   v1.0)

New databases: similarities

NoSQL, NewSQL, Hadoop, DBaaS

• Distributed architecture
• Agility
• Elasticity
• New application development projects

Page 19: Industry experts webinar slides (final   v1.0)

New databases: differences

NoSQL: Non-relational data models. Trade off consistency for availability

NewSQL: Adds availability and flexibility to the familiar relational data model

Hadoop: Batch (and now interactive) analytic processing of unstructured data

DBaaS: Any of the above, or traditional RDBMS, delivered as a service

Page 20: Industry experts webinar slides (final   v1.0)

Use cases

Approach: NoSQL
Details: Non-transactional operational applications, unstructured data, lightweight query
Examples: MongoDB, Couchbase, Cassandra, Redis, Aerospike, Cloudant

Approach: NewSQL
Details: Transactional operational apps, structured data, complex query, operational intelligence
Examples: NuoDB, MemSQL, Translattice, VoltDB, Splice Machine

Approach: Hadoop
Details: Non-transactional analytic applications, multi-structured data, complex query
Examples: Cloudera, MapR, Hortonworks, Pivotal, IBM, Teradata

Approach: DBaaS
Details: Any of the above, or traditional RDBMS, delivered as a service
Examples: ObjectRocket, AWS DynamoDB, AWS RDS, Altiscale, Qubole

Page 21: Industry experts webinar slides (final   v1.0)

A quick word on operational intelligence

It has become an accepted best practice that analytics should be performed on data stored in a separate database from that used to support operational, transactional systems:

• data management benefits
• the need to avoid the performance limitations of traditional systems

[Diagram labels: Applications; Operational database (structured data); Data warehouse (structured data); Ad hoc analytics; Pre-defined reporting]

Page 22: Industry experts webinar slides (final   v1.0)

A quick word on operational intelligence

The emergence of a new breed of relational database vendors taking advantage of hardware, memory and processor performance to support transactional and analytic workloads in the same instance

• A rejection of the concept that it is necessary to wait for data to become available in analytic databases

[Diagram labels: Applications; NewSQL database (structured data); Operational intelligence; Data warehouse (structured data); Ad hoc analytics; Pre-defined reporting]

Page 23: Industry experts webinar slides (final   v1.0)

A quick word on operational intelligence

This is not a matter of making the data warehouse redundant, but rather providing another source of business intelligence to complement that generated by the data warehouse

• Providing users with a ‘live’ view of their operational data for rapid decision making

[Diagram labels: Applications; NewSQL database (structured data); Operational intelligence; Data warehouse (structured data); Ad hoc analytics; Pre-defined reporting]
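The idea of serving transactional writes and a ‘live’ analytic view from the same database instance can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module purely as a single-node stand-in; SQLite is not a NewSQL system, and the `orders` table, `place_order`, and `revenue_by_region` are hypothetical names chosen for illustration.

```python
import sqlite3

# One in-memory database plays both roles: operational store and analytic source.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, region TEXT NOT NULL, amount REAL NOT NULL)"
)


def place_order(conn, region, amount):
    """Transactional write: the context manager commits on success, rolls back on error."""
    with conn:
        conn.execute(
            "INSERT INTO orders (region, amount) VALUES (?, ?)", (region, amount)
        )


def revenue_by_region(conn):
    """Analytic read against the same live data, with no export to a warehouse first."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)


place_order(conn, "EU", 120.0)
place_order(conn, "US", 80.0)
place_order(conn, "EU", 30.0)

live_view = revenue_by_region(conn)
```

The aggregate reflects every committed order immediately, which is the "no waiting for data to reach the analytic database" point the slides make. A data warehouse would still hold history and pre-defined reporting alongside this view.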

Page 24: Industry experts webinar slides (final   v1.0)

Webscale Distributed Database

Page 25: Industry experts webinar slides (final   v1.0)

Convergence

• NoSQL systems adding structure & query expressiveness
• Non-ACID systems running limited transactions
• SQL databases for JSON or RDF
• HDFS as the core for SQL
• Etc.

Page 26: Industry experts webinar slides (final   v1.0)

Why?

Page 27: Industry experts webinar slides (final   v1.0)

Simplicity …

Page 28: Industry experts webinar slides (final   v1.0)

… and because we can.

Page 29: Industry experts webinar slides (final   v1.0)

Multi-Model

How is your data used?
• Relational, graph or document; not SQL, RDF or JSON

This is how to optimize workloads
• Drives access patterns, storage models, caching, distribution, etc.
• Requires thought at the core architecture

Simplifying utilities like transactions or consistency models are general

Page 30: Industry experts webinar slides (final   v1.0)

DBaaS & Automation

DBaaS simplifies cloud models
• Should be a logical unit to operate

Enables auto-pilot operation
• Can be automated to simplify operations

Not a “nice to have”
• A system cannot scale to any massive size without some amount of self-awareness

Page 31: Industry experts webinar slides (final   v1.0)

Some Requirements for a Distributed Database

• Scale in & out on-demand
• Provide resiliency and online upgrade
• Present a logical, single system view
• Support multiple models on mixed infrastructure
• Run in multiple locations
• Be simple to use

Page 32: Industry experts webinar slides (final   v1.0)

Distributed Database Designs

Approach: Shared Disk
Key idea: Sharing a file system.
Example: Oracle RAC, DB2 pureScale

Approach: Shared-Nothing/Sharded
Key idea: Independent databases for disjoint subsets of data.
Example: VoltDB, MySQL Cluster, and most NoSQL/NewSQL solutions*

Approach: Synchronous Replication
Key idea: Committing data transactionally to multiple locations before returning.
Example: Google F1

Approach: Durable Distributed Cache
Key idea: Replicating data in memory on-demand.

[Topology row: one diagram per approach]

*Note: Most major web properties include custom-sharded MySQL or sharded PostgreSQL, including Facebook, Google, Wikipedia, Amazon, Flickr, Box.net, and Heroku.
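Two of the key ideas above, disjoint subsets of data per shard and acknowledging a write only after it reaches multiple locations, can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not how any listed product works: `shard_for`, `ShardedStore`, and `SyncReplicatedStore` are hypothetical names, the stores are in-process dictionaries, and failure handling (for example two-phase commit or rollback) is left out.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (hypothetical routing scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Shared-nothing: each shard independently holds a disjoint subset of keys."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


class SyncReplicatedStore:
    """Synchronous replication: put() returns only after every replica has the value."""

    def __init__(self, num_replicas: int) -> None:
        self.replicas = [dict() for _ in range(num_replicas)]

    def put(self, key: str, value) -> bool:
        for replica in self.replicas:
            replica[key] = value
        return True  # acknowledged only once all replicas were updated

    def get(self, key: str, replica_index: int = 0):
        return self.replicas[replica_index].get(key)
```

In the sharded store a key lives in exactly one place, so capacity grows with the number of shards but cross-shard queries need extra coordination. In the replicated store any replica can serve a read, at the cost of every write waiting on all copies.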

Page 33: Industry experts webinar slides (final   v1.0)

Peer to Peer Architecture

Page 34: Industry experts webinar slides (final   v1.0)

Breakthrough Capabilities

Scale-out Performance
• NuoDB scales to over 100 server machines
• Scalability is instant and elastic
• Scales out and scales in
• TPS numbers exceed 10m TPS on $100k of hardware
• Also scales on AWS, GCE etc. Public demo of 32 nodes with Google
• Now showing linear scalability on TPC-C type workloads (DBT-2)
• Scalability demonstrated with heavier duty customer applications (e.g. Axway, Dassault Systèmes)

Continuous Availability
• Self-healing
• No single point of failure
• Fully distributed control
• Arbitrarily redundant
• Online backup
• Online schema evolution
• Rolling upgrades

Multi-Tenancy
• HP Moonshot Launch – 45 micro servers in a 4U rack mount box
• NuoDB ran 72,000 databases on a single Moonshot box
• Uses proprietary “Database Hibernation” and “Database Bursting” technologies
• Zero admin UI
• Demo showed the potential of “Software Defined Database”
• Moonshot is the foundation of the HP relationship

Geo-Distribution
• Active/Active
• ACID Semantics
• Transactional Consistency
• N-Way Redundant
• Local User Latency
• Asynch WAN Comms

No-knobs Admin
• Auto-admin
• Rules-driven
• Auto-optimizing
• Auto-backup

Page 35: Industry experts webinar slides (final   v1.0)

HTAP on NuoDB

[Diagram: a single logical database made up of six TEs (transaction engines) and two SMs (storage managers), serving a read/write OLTP workload and long-running analytical queries]

• MVCC: workloads operate on live data without lock contention
• Scale-out architecture: workloads can be distributed across, and appropriately matched to, machine resources to ensure consistent throughput for diverse operational and analytical workloads
• Scale-out architecture: burst out analytics to appropriate hardware when needed; upon completion, those resources can be spun down until needed again
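The MVCC point can be made concrete with a small sketch of multi-version reads: a long-running analytical reader keeps a consistent snapshot while writers continue to commit, and neither takes a lock on the other. This is a generic, single-process illustration and not NuoDB's implementation; `MVCCStore` and its methods are hypothetical, every write commits immediately, and write-write conflict detection is omitted.

```python
import itertools


class MVCCStore:
    """Toy multi-version store: writers append versions, readers pick by snapshot."""

    def __init__(self) -> None:
        # key -> list of (commit_timestamp, value), in ascending timestamp order
        self._versions = {}
        self._clock = itertools.count(1)
        self._last_commit = 0

    def write(self, key, value) -> int:
        """Append a new version instead of overwriting; return its commit timestamp."""
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        self._last_commit = ts
        return ts

    def snapshot(self) -> int:
        """A snapshot is the timestamp of the latest commit visible to a reader."""
        return self._last_commit

    def read(self, key, snapshot_ts: int):
        """Return the newest version committed at or before the snapshot, else None."""
        result = None
        for ts, value in self._versions.get(key, []):
            if ts > snapshot_ts:
                break
            result = value
        return result


store = MVCCStore()
store.write("balance", 100)
analytics_snapshot = store.snapshot()  # long-running query starts here
store.write("balance", 250)            # OLTP write proceeds without blocking

seen_by_analytics = store.read("balance", analytics_snapshot)
seen_by_new_reader = store.read("balance", store.snapshot())
```

The analytical reader still sees the value as of its snapshot, while a reader starting later sees the newer commit. A real engine would also garbage-collect versions that no active snapshot can reach.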

Page 36: Industry experts webinar slides (final   v1.0)

“Our investment in NuoDB demonstrates our strong interest and belief in NuoDB’s strategy and technologies for next-generation cloud-based services.”

Dominique Florack, Senior Executive VP, Products-R&D

Page 37: Industry experts webinar slides (final   v1.0)

Thank you!