Microsoft Azure DocumentDB: Overview presentation
James Serra, Big Data Evangelist, Microsoft
[email protected]

Introducing DocumentDB


About Me
• Microsoft, Big Data Evangelist
• In IT for 30 years, worked on many BI and DW projects
• Worked as desktop/web/database developer, DBA, BI and DW architect and developer, MDM architect, PDW/APS developer
• Been perm employee, contractor, consultant, business owner
• Presenter at PASS Business Analytics Conference, PASS Summit, Enterprise Data World conference
• Certifications: MCSE: Data Platform, Business Intelligence; MS: Architecting Microsoft Azure Solutions, Design and Implement Big Data Analytics Solutions, Design and Implement Cloud Data Platform Solutions
• Blog at JamesSerra.com
• Former SQL Server MVP
• Author of book “Reporting with Microsoft SQL Server 2012”

Agenda
• NoSQL Overview
• DocumentDB Overview
• Today’s application environment
• Pricing
• DocumentDB basics
• Service summary
• Development scenarios
• Resources and tools

What is NoSQL?

Choose the store that best fits your needs: SQL or NoSQL

NoSQL: a database solution designed to compensate for the technical limitations of SQL (relational) databases

Traditional approach: relational stores

Data is stored in tables that comprise:
• Schemas
• Columns
• Rows

Chappell & Associates. “Understanding NoSQL on Microsoft Azure.” 2014. http://www.davidchappell.com/writing/white_papers/Azure-NoSQL-Technologies-v2.0--Chappell.pdf
Image based on: PricewaterhouseCoopers. “Data models in NoSQL and NewSQL databases.” 2015. http://www.pwc.com/us/en/technology-forecast/2015/remapping-database-landscape/features/assets/data-models-production.pdf

NoSQL approach: various types of stores
• Key value
• Wide column
• Graph
• Document

Azure DocumentDB
• Uses all but the graph category
• Includes some key-value and columnstore capabilities


NoSQL databases fall into four categories of stores:

Key-value stores

Key-value stores offer high speed through the least-complicated data model—anything can be stored as a value, as long as each value is associated with a key or name.

Key Value
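A minimal sketch of the key-value idea, assuming an in-memory Python dictionary stands in for the store. The class name and the sample keys and values are hypothetical and not tied to any product.

```python
class KeyValueStore:
    """Toy in-memory key-value store: any value, addressed only by its key."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        # The store does not inspect the value; it only associates it with the key.
        self._items[key] = value

    def get(self, key, default=None):
        return self._items.get(key, default)


store = KeyValueStore()
store.put("user:42", {"name": "Eva", "country": "Germany"})  # a dict as the value
store.put("logo.png", b"\x89PNG")                            # raw bytes as the value
print(store.get("user:42"))
print(store.get("missing", "not found"))
```

The only lookup path is the key, which is what keeps the model simple and fast.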


Wide-column stores

Wide-column stores are fast and can be almost as simple as key-value stores. They include a primary key, an optional secondary key, and anything stored as a value.


Figure labels: primary key, secondary key, values. Keys and values can be sparse or numerous.
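The primary key / secondary key / value layout can be sketched as a dictionary of dictionaries. This is an illustrative model only; the function names and sample rows are hypothetical.

```python
from collections import defaultdict

# primary key (row) -> secondary key (column) -> value
table = defaultdict(dict)


def put(primary_key, secondary_key, value):
    table[primary_key][secondary_key] = value


def get(primary_key, secondary_key=None):
    """Return one value, or the whole row when no secondary key is given."""
    row = table.get(primary_key, {})
    return row if secondary_key is None else row.get(secondary_key)


# Hypothetical sample data: rows do not need the same columns (sparse rows).
put("user:1", "name", "Ian")
put("user:1", "city", "Leeds")
put("user:2", "name", "Alan")

print(get("user:1"))
print(get("user:2", "city"))  # this row has no "city" column
```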

Graph databases


Figure: a graph whose nodes and relationships carry properties. Labels shown: Title: Forgotten Bridges; Title: Mythical Bridges; PurchasedDate: 03/02/2011; PurchasedDate: 09/09/2011; PurchasedDate: 05/07/2011; Name: Ian; Name: Alan.

Document stores

Document stores contain data objects that are inherently hierarchical, tree-like structures (most notably JavaScript Object Notation [JSON] or Extensible Markup Language [XML]).

Note that these are not Microsoft Word documents!


NewSQL: another variation

Relational NewSQL stores are designed for web-scale applications, but they still require up-front schemas, joins, and table management that can be labor intensive.


Why NoSQL evolved

Drivers:
• Scalability
• Resiliency
• Operational efficiency
• Speed and variety of cloud-based data

SQL and NoSQL: each has its place

• Fully featured RDBMS
• Transactional processing
• Rich query
• Managed as a service
• Elastic scale
• Internet-accessible HTTP/REST
• Schema-free data model
• Arbitrary data formats

Azure DocumentDB

Application SQL query

Perfect for cloud architects and developers who need an enterprise-ready NoSQL document database

JSON

Document 1:
{ "name": "John", "country": "Canada", "age": 43, "lastUse": "March 4, 2014"}

Document 2:
{ "name": "Eva", "country": "Germany", "age": 25}

Document 3:
{ "name": "Lou", "country": "Australia", "age": 51, "firstUse": "May 8, 2013"}

Document 4:
{ "docCount": 3, "last": "May 1, 2014"}

A NoSQL document database-as-a-service, fully managed by Azure
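The four sample documents do not share one schema, so client code has to tolerate missing properties. A minimal sketch using only Python's standard json module; the helper name names_older_than is hypothetical, and this is plain client-side filtering, not the DocumentDB query engine.

```python
import json

# The four sample documents from the slide, as raw JSON text.
raw_documents = [
    '{ "name": "John", "country": "Canada", "age": 43, "lastUse": "March 4, 2014"}',
    '{ "name": "Eva", "country": "Germany", "age": 25}',
    '{ "name": "Lou", "country": "Australia", "age": 51, "firstUse": "May 8, 2013"}',
    '{ "docCount": 3, "last": "May 1, 2014"}',
]
documents = [json.loads(text) for text in raw_documents]


def names_older_than(docs, min_age):
    """Names of documents with a numeric "age" above min_age; others are skipped."""
    return [
        doc.get("name")
        for doc in docs
        if isinstance(doc.get("age"), (int, float)) and doc["age"] > min_age
    ]


print(names_older_than(documents, 40))
```

Document 4 has neither "name" nor "age" and is simply skipped, with no schema change needed.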

Azure DocumentDB Document store

{
  "name": "SmugMug",
  "permalink": "smugmug",
  "homepage_url": "http://www.smugmug.com",
  "blog_url": "http://blogs.smugmug.com/",
  "category_code": "photo_video",
  "products": [
    { "name": "SmugMug", "permalink": "smugmug" }
  ],
  "offices": [
    {
      "description": "",
      "address1": "67 E. Evelyn Ave",
      "address2": "",
      "zip_code": "94041",
      "city": "Mountain View",
      "state_code": "CA",
      "country_code": "USA",
      "latitude": 37.390056,
      "longitude": -122.067692
    }
  ]
}

Perfect for: schema-agnostic JSON store for hierarchical and denormalized data at scale

What documents?

Not Word documents

Azure DocumentDB details

Native support for JavaScript, SQL query, and transactions over JSON documents

Ideal for apps designed for the cloud when the following are high priorities:

Reliable and predictable performance
• Tunable consistency
• Elastic scale

Rapid development
• Build with familiar tools: REST, JSON, JavaScript

Rich query and transactions over JSON data
• Query JSON data with no secondary indices

Top Features

Auto-scaling/sharding
• Improved scalability and reliability due to distribution of large data sets across multiple machines

Automatic indexing
• All document properties are available for queries
• Frees you from relying on schemas or secondary indexes

SQL query language
• Make use of SQL experience and .NET LINQ

Managed service
• Spin up on demand with no setup
• Availability guarantee of 99.95%
• Linear price curve without virtual-machine step functions
• Integration with Azure HDInsight and Azure Search

Top Features

Greater consistency control
• Four consistency levels provide more options for consistency, availability, and performance requirements

Atomicity, Consistency, Isolation, and Durability (ACID) transaction control
• Simpler programming model (compared to state variables)
• Use JavaScript for insert, update, and delete actions

Standards-based open API with RESTful HTTP
• Uses JSON standard, so no mapping of Binary JSON (BSON) to JSON is needed

Granular access rights
• Allows access to all documents and attachments within collections

Monitor an account
• View performance metrics for a DocumentDB account
• Customize performance metric views for a DocumentDB account
• Create side-by-side performance metric charts
• View usage metrics for a DocumentDB account
• Set up performance metric alerts for a DocumentDB account

Today’s modern apps
• Produce and consume data at a staggering rate
• Require instantaneous response times to match user expectations
• Are developed iteratively with many versions supported concurrently
• Are developed with continuously evolving data models
• Are increasingly complex
• Experience unpredictable and explosive growth

Ideal Scenarios

Well-suited for web and mobile apps
• Catalog data
• Preferences and state
• Event store
• User-generated content
• Data exchange

Azure DocumentDB at Microsoft: user data store
• More than 450 million unique users
• Store 20 TB of JSON document data
• Under 15 millisecond (ms) writes and single-digit ms reads
• Store for 40+ app/device combinations
• Available globally to serve all markets

Standard pricing tier with hourly billing

RU = request unit (represents the processing capacity required to read a single 1 KB JSON document consisting of 10 unique property values)
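A rough sketch of how the RU definition turns into capacity arithmetic. Only the 1 RU cost of a 1 KB read comes from the definition above; the write cost and the provisioned throughput figure are hypothetical placeholders, not published prices or limits.

```python
def max_operations_per_second(provisioned_rus_per_second, ru_cost_per_operation):
    """How many operations of a given RU cost fit into one second of reserved throughput."""
    if ru_cost_per_operation <= 0:
        raise ValueError("RU cost must be positive")
    return int(provisioned_rus_per_second // ru_cost_per_operation)


READ_1KB_COST = 1          # from the definition: reading one 1 KB document costs 1 RU
HYPOTHETICAL_WRITE_COST = 5  # assumption for illustration only
PROVISIONED_RUS = 250      # assumption for illustration only (RU per second)

print(max_operations_per_second(PROVISIONED_RUS, READ_1KB_COST))
print(max_operations_per_second(PROVISIONED_RUS, HYPOTHETICAL_WRITE_COST))
```

Real request charges depend on document size and operation type, so measured charges should replace the placeholder costs.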

Azure DocumentDB basics

Resource model
• Entities addressable by logical Uniform Resource Identifier (URI)
• Partitioned for scale out
• Replicated for high availability
• Entities represented as JSON
• Accounts scale out by moving a slider

Interaction model
• RESTful interaction over HTTPS
• HTTPS and TCP connectivity
• Standard HTTP verbs and semantics

Development
• .NET, Node.js, Python, Java, and JavaScript clients
• SQL for query expression, .NET LINQ
• JavaScript for server-side app logic

Figure: resource hierarchy
Azure DocumentDB account
  Databases
    Users
      Permissions
    Collections
      Documents (your documents here)
        Attachments
      Stored procedures (JavaScript)
      Triggers (JavaScript)
      User-defined functions (JavaScript)
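Because every entity is addressable by a logical URI, a document's address can be composed from the ids along the hierarchy. The sketch below assumes the id-based path layout dbs/{database}/colls/{collection}/docs/{document} and the {account}.documents.azure.com host, as I recall them from the public REST documentation; treat both as assumptions to verify. The account and resource ids are hypothetical.

```python
from urllib.parse import quote


def document_uri(account, database_id, collection_id, document_id):
    """Compose a logical document URI from the ids along the resource hierarchy."""
    path = "/".join([
        "dbs", quote(database_id, safe=""),
        "colls", quote(collection_id, safe=""),
        "docs", quote(document_id, safe=""),
    ])
    # Host name pattern is an assumption; check it against your own account endpoint.
    return f"https://{account}.documents.azure.com/{path}"


print(document_uri("myaccount", "appdb", "users", "user 42"))
```

Percent-encoding each id keeps characters such as spaces or slashes from changing the path structure.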

Azure DocumentDB collections
• Collections != tables
• Unit of partitioning
• Transaction boundary
• No enforced schema, flexible
• Documents that are queried or updated together stay in one collection
• Elasticity to 10 GB
• RUs evenly distributed across partitions


Elastic collections

• Collection != single partition
• Partition count dynamic
• Each partition (key) is 10 GB
• Online splits and merges with full availability
• RUs evenly distributed across partitions
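One way to picture how a partition key spreads documents across partitions is simple hash placement. This is a hypothetical illustration: the slides do not describe the service's actual placement algorithm, and plain modulo hashing does not model online splits and merges. The property name userId and the sample documents are made up.

```python
import hashlib
from collections import defaultdict


def partition_for(partition_key, partition_count):
    """Map a partition key to a partition number with a stable hash (illustrative only)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def group_by_partition(documents, key_property, partition_count):
    """Group documents by the partition their key hashes to."""
    groups = defaultdict(list)
    for doc in documents:
        groups[partition_for(doc[key_property], partition_count)].append(doc)
    return dict(groups)


docs = [
    {"userId": "user-1", "event": "login"},
    {"userId": "user-2", "event": "login"},
    {"userId": "user-1", "event": "purchase"},
]
print(group_by_partition(docs, "userId", 4))
```

Documents that share a partition key always land in the same partition, which is what lets them be queried or updated together.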

Rich query over JSON data

Native JavaScript transactional processing

Familiar SQL-based query language

Query on JSON data without specifying secondary indices or constructing views

Build modern, scalable apps with robust transactional querying and data processing on JSON documents
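Querying without declaring secondary indices implies that every property path is indexed as documents arrive. A toy sketch of that idea, assuming a simple in-memory inverted index; the class AutoIndex, the path notation, and the shortened sample documents are hypothetical and do not reflect the service's real index format.

```python
from collections import defaultdict


def flatten(value, path=""):
    """Yield (path, scalar) pairs for every leaf of a JSON-like value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{path}/{key}")
    elif isinstance(value, list):
        for child in value:
            yield from flatten(child, f"{path}/[]")
    else:
        yield path, value


class AutoIndex:
    """Indexes every property path of every inserted document, with no schema."""

    def __init__(self):
        self._docs = {}
        self._index = defaultdict(set)  # (path, value) -> set of document ids

    def insert(self, doc_id, document):
        self._docs[doc_id] = document
        for path, value in flatten(document):
            self._index[(path, value)].add(doc_id)

    def find(self, path, value):
        ids = self._index.get((path, value), ())
        return [self._docs[doc_id] for doc_id in sorted(ids)]


index = AutoIndex()
index.insert(1, {"name": "John", "country": "Canada", "age": 43})
index.insert(2, {"name": "Eva", "country": "Germany", "age": 25})
index.insert(3, {"name": "SmugMug", "offices": [{"city": "Mountain View", "state_code": "CA"}]})

print(index.find("/country", "Germany"))
print(index.find("/offices/[]/city", "Mountain View"))
```

Nested properties such as an office's city are found the same way as top-level ones, without anyone having defined an index for them.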

JavaScript transactions
Transactionally process multiple documents with application-defined stored procedures and triggers
• JavaScript as the procedural language
• Language integrated
• Execution wrapped in an implicit transaction
• Preregistered and scoped to a collection
• Performed with ACID guarantees
• Triggers invoked as pre- or post-operations
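The "implicit transaction" behavior can be illustrated with a small all-or-nothing simulation. This is a Python model of the semantics only, not the server-side JavaScript API: the Collection class, run_transaction, and the transfer example are hypothetical.

```python
import copy


class Collection:
    """Toy collection whose procedures run all-or-nothing."""

    def __init__(self, documents=None):
        self.documents = documents or {}

    def run_transaction(self, procedure):
        """Run procedure on a working copy; commit only if it does not raise."""
        working_copy = copy.deepcopy(self.documents)
        try:
            result = procedure(working_copy)
        except Exception:
            return False, None           # roll back: the working copy is discarded
        self.documents = working_copy    # commit: all changes become visible together
        return True, result


def transfer(amount):
    """Build a procedure that moves `amount` between two documents."""
    def procedure(docs):
        docs["alice"]["balance"] -= amount          # first change
        if docs["alice"]["balance"] < 0:
            raise ValueError("insufficient funds")  # aborts after a partial change
        docs["bob"]["balance"] += amount            # second change
        return docs["alice"]["balance"]
    return procedure


accounts = Collection({"alice": {"balance": 100}, "bob": {"balance": 0}})
print(accounts.run_transaction(transfer(30)))
print(accounts.run_transaction(transfer(500)))
print(accounts.documents)
```

The failed transfer leaves both documents untouched even though the procedure had already debited one of them.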


Reliable and predictable performance

Tunable consistency
Tune and trade off consistency and performance through well-defined levels to suit application-scenario needs, from eventual to strong

Elastic scale
Enterprise-tested by a large, internal, consumer-facing service

Fast, predictable performance
Defined throughput levels that scale linearly with application needs

Azure DocumentDB was born in the cloud to achieve fast, predictable performance, with reserved resources to deliver on throughput needs. It delivers reliable, tunable consistency so that performance can be balanced against application needs.

Document myDoc = await client.ReadDocumentAsync(documentLink, new RequestOptions { ConsistencyLevel = ConsistencyLevel.Eventual });

Lower consistency level on read operations

Four consistency levels:
• Strong
• Session
• Bounded Staleness
• Eventual

Consistency levels enable guarantees
Choose your consistency level and make predictable trade-offs between consistency, availability, and performance

Choose your level
• Strong: data consistency
• Session: monotonic reads (on explicit read requests) and writes
• Bounded Staleness: total order of propagation of writes
• Eventual: lowest latency for reads and writes
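A toy model can make the trade-off between the four levels concrete. The sketch assumes one ordered write log and one lagging replica; the class ReplicatedRegister, the session token (a version number), and the max_lag parameter are hypothetical simplifications and not how the service is implemented.

```python
class ReplicatedRegister:
    """One value with an ordered write log and a replica that may lag behind."""

    def __init__(self):
        self.log = []             # committed values, in write order
        self.replica_applied = 0  # number of log entries the replica has applied

    def write(self, value):
        self.log.append(value)
        return len(self.log)      # version number, usable as a session token

    def replicate(self, count=1):
        self.replica_applied = min(len(self.log), self.replica_applied + count)

    def read(self, level, session_token=0, max_lag=1):
        latest = len(self.log)
        if level == "strong":
            version = latest                                      # always the newest write
        elif level == "bounded_staleness":
            version = max(self.replica_applied, latest - max_lag)  # at most max_lag behind
        elif level == "session":
            version = max(self.replica_applied, session_token)    # never older than own writes
        elif level == "eventual":
            version = self.replica_applied                        # whatever the replica has
        else:
            raise ValueError(f"unknown level: {level}")
        return self.log[version - 1] if version else None


register = ReplicatedRegister()
register.write("v1")
register.replicate()            # replica now holds v1
token = register.write("v2")    # this client's own write
register.write("v3")            # a later write by someone else

for level in ("strong", "bounded_staleness", "session", "eventual"):
    print(level, register.read(level, session_token=token, max_lag=1))
```

Weaker levels may return older values, which is the price paid for lower read latency.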

Security model

Azure DocumentDB is designed to be secure with:
• Master key
• Access control on resources
• User operations
• Permission operations
• Code execution
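Access control on resources can be pictured as users holding permissions on resource links. The sketch below is a hypothetical in-memory model: the User class, the link strings, and the rule that a permission covers everything beneath its link are assumptions for illustration. The mode names "Read" and "All" are as I recall them from the service's permission resource and should be verified.

```python
from dataclasses import dataclass, field

READ, ALL = "Read", "All"  # permission modes (assumed names, verify against the docs)


@dataclass
class User:
    user_id: str
    permissions: dict = field(default_factory=dict)  # resource link -> mode

    def grant(self, resource_link, mode):
        if mode not in (READ, ALL):
            raise ValueError(f"unknown permission mode: {mode}")
        self.permissions[resource_link] = mode

    def can(self, operation, resource_link):
        """True if a granted permission covers the link and allows the operation."""
        for granted_link, mode in self.permissions.items():
            covered = (resource_link == granted_link
                       or resource_link.startswith(granted_link + "/"))
            if covered and (mode == ALL or operation == "read"):
                return True
        return False


reader = User("report-app")
reader.grant("dbs/appdb/colls/orders", READ)

print(reader.can("read", "dbs/appdb/colls/orders/docs/1"))
print(reader.can("write", "dbs/appdb/colls/orders/docs/1"))
print(reader.can("read", "dbs/appdb/colls/customers"))
```

Unlike the master key, which grants full access to the account, a permission of this kind is scoped to one resource and one mode.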

Rapid development

Easy to start and fully managed

Enterprise-grade Azure platform

Build with familiar tools: REST, JSON, and JavaScript

Reduce development friction and complexity when building new business-class applications by using familiar tools and industry-standard platforms. Combine Azure DocumentDB with a portfolio of complementary cloud services on the Azure platform, such as the Azure HDInsight Connector and Azure Search Indexer.

Tools

Microsoft Cloud Explorer for Visual Studio: https://azure.microsoft.com/en-us/blog/exploring-azure-documentdb-in-visual-studio/

Azure DocumentDB data-migration tool: https://azure.microsoft.com/en-us/documentation/articles/documentdb-import-data/

Document Explorer in Azure portal: http://portal.azure.com

Azure DocumentDB service summary

Unique among NoSQL stores:
• Developed for the cloud and for delivery as a service
• Truly query-able JSON store
• Transactional processing through language-integrated JavaScript
• Predictable performance and tunable consistency

Development scenarios
Consider Azure DocumentDB when you need:
• To build new web and mobile cloud-based applications
• Rapid development and high-scalability requirements
• Query and processing of user- and device-generated data
• More query and processing support for your key-value stores
• To replace a document store that you currently run in virtual machines
• A managed service model

Build your first Azure DocumentDB app today

Get support
• Schedule a 1:1 chat directly with the Azure DocumentDB engineering team at askdocdb.com

Give feedback
• Ask questions through the forum at http://aka.ms/docdbforum
• Suggest an idea and vote to support other ideas for Azure DocumentDB at http://aka.ms/docdbideas
• On Twitter @documentdb

Get started
• Sign up for Azure DocumentDB at http://aka.ms/docdbstart
• Access and configure your account at http://portal.azure.com
• Download an SDK from http://aka.ms/docdbsdks, and then build a sample at http://aka.ms/docdbsample

Using DocumentDB Query Playground
• Go to http://www.documentdb.com/sql/demo
• Test out sample queries or write your own against the dataset


Learn more
• David Chappell NoSQL overview paper on Infopedia: http://www.davidchappell.com/writing/white_papers/Azure-NoSQL-Technologies-v2.0--Chappell.pdf
• Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement [book]: http://www.pdfiles.com/pdf/files/English/Databases/Seven_Databases_In_Seven_Weeks.pdf
• Replicated Data Consistency Explained Through Baseball [paper]: http://research.microsoft.com/apps/pubs/default.aspx?id=206913

Q & A
James Serra, Big Data Evangelist
• Email me at: [email protected]
• Follow me at: @JamesSerra
• Link to me at: www.linkedin.com/in/JamesSerra
• Visit my blog at: JamesSerra.com (where this slide deck will be posted)