james-serra
About Me
• Microsoft, Big Data Evangelist
• In IT for 30 years, worked on many BI and DW projects
• Worked as desktop/web/database developer, DBA, BI and DW architect and developer, MDM architect, PDW/APS developer
• Been perm employee, contractor, consultant, business owner
• Presenter at PASS Business Analytics Conference, PASS Summit, Enterprise Data World conference
• Certifications: MCSE: Data Platform, Business Intelligence; MS: Architecting Microsoft Azure Solutions, Design and Implement Big Data Analytics Solutions, Design and Implement Cloud Data Platform Solutions
• Blog at JamesSerra.com
• Former SQL Server MVP
• Author of book “Reporting with Microsoft SQL Server 2012”
Agenda
• NoSQL Overview
• DocumentDB Overview
• Today’s application environment
• Pricing
• DocumentDB basics
• Service summary
• Development scenarios
• Resources and tools
What is NoSQL?
A database solution designed to compensate for the technical limitations of SQL
Choose the store that best fits your needs: SQL or NoSQL
Traditional approach: relational stores
Data is stored in tables that comprise:
• Schemas
• Columns
• Rows
Chappell & Associates. “Understanding NoSQL on Microsoft Azure.” 2014. http://www.davidchappell.com/writing/white_papers/Azure-NoSQL-Technologies-v2.0--Chappell.pdf
Image based on: PricewaterhouseCoopers. “Data models in NoSQL and NewSQL databases.” 2015. http://www.pwc.com/us/en/technology-forecast/2015/remapping-database-landscape/features/assets/data-models-production.pdf
NoSQL approach: various types of stores
NoSQL databases fall into four categories of stores:
• Key value
• Wide column
• Graph
• Document
Azure DocumentDB:
• Uses all but the graph category
• Includes some key-value and columnstore capabilities
Key-value stores
Key-value stores offer high speed through the least-complicated data model—anything can be stored as a value, as long as each value is associated with a key or name.
Key Value
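As a minimal illustration of the key-value model, the sketch below uses a plain Python dictionary as a stand-in for a store. The key names and values are invented for illustration, and the `put`/`get` helpers are hypothetical rather than any product's API.

```python
# A toy key-value store: any value is accepted as long as it has a key.
store = {}

def put(key, value):
    """Associate a value of any type with a key."""
    store[key] = value

def get(key, default=None):
    """Look a value up by its key; the store never inspects the value."""
    return store.get(key, default)

put("user:42:name", "Ian")                                      # a string
put("user:42:cart", ["Forgotten Bridges", "Mythical Bridges"])  # a list
put("session:abc", {"expires": "2011-09-09"})                   # a nested object

print(get("user:42:cart"))
```

The store can only answer "what is the value for this key?", which is what keeps the model simple and fast.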
Wide-column stores
Wide-column stores are fast and can be almost as simple as key-value stores. They include a primary key, an optional secondary key, and anything stored as a value.
[Figure: wide-column layout with a primary key, a secondary key, and values; keys and values can be sparse or numerous]
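One way to picture the wide-column layout is a two-level map: a primary key selects a row, and a secondary key selects a value within that row. The sketch below assumes this simplified picture; the helper names and sample data are hypothetical.

```python
from collections import defaultdict

# Each primary (row) key maps to its own set of secondary (column) keys.
table = defaultdict(dict)

def put(primary_key, secondary_key, value):
    """Store a value under a primary key and a secondary key."""
    table[primary_key][secondary_key] = value

def get(primary_key, secondary_key=None):
    """Return one value, or the whole row when no secondary key is given."""
    row = table.get(primary_key, {})
    return dict(row) if secondary_key is None else row.get(secondary_key)

put("customer:1", "name", "Ian")
put("customer:1", "last_purchase", "2011-03-02")
put("customer:2", "name", "Alan")  # sparse: this row has no last_purchase

print(get("customer:1"))
print(get("customer:2", "last_purchase"))
```

Rows do not have to share the same secondary keys, which is the sparseness the figure refers to.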
Graph databases
[Figure: graph in which people (Name: Ian, Name: Alan) are linked to books (Title: Forgotten Bridges, Title: Mythical Bridges) by purchase edges (PurchasedDate: 03/02/2011, 09/09/2011, 05/07/2011)]
Document stores
Document stores contain data objects that are inherently hierarchical, tree-like structures (most notably JavaScript Object Notation [JSON] or Extensible Markup Language [XML]).
Note that these are not Microsoft Word documents!
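To make "hierarchical, tree-like" concrete, the sketch below parses a small JSON document and walks into its nested parts. The field values are borrowed from the sample document later in this deck; the rest is only an illustration using Python's standard `json` module.

```python
import json

# A document is a tree: objects contain arrays, which contain objects.
raw = """
{
  "name": "SmugMug",
  "category_code": "photo_video",
  "offices": [
    {"city": "Mountain View", "state_code": "CA", "country_code": "USA"}
  ]
}
"""

doc = json.loads(raw)

# Navigate from the root into the nested "offices" array.
cities = [office["city"] for office in doc["offices"]]
print(doc["name"], cities)
```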
NewSQL: another variation
Relational NewSQL stores are designed for web-scale applications, but they still require up-front schemas, joins, and table management that can be labor intensive.
Why NoSQL evolved
Drivers:
• Scalability
• Resiliency
• Operational efficiency
• Speed and variety of cloud-based data
SQL and NoSQL: each has its place
Fully featured RDBMS
Transactional processing
Rich query
Managed as a service
Elastic scale
Internet-accessible http/rest
Schema-free data model
Arbitrary data formats
Azure DocumentDB
Application SQL query
Perfect for cloud architects and developers who need an enterprise-ready NoSQL document database
JSON
{ "name": "John", "country": "Canada", "age": 43, "lastUse": "March 4, 2014"}
{ "name": "Eva", "country": "Germany", "age": 25}
{ "name": "Lou", "country": "Australia", "age": 51, "firstUse": "May 8, 2013"}
{ "docCount": 3, "last": "May 1, 2014"}
[Figure: the four JSON documents above, labeled Document 1 through Document 4]
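The four JSON documents above do not share a schema, yet they can still be filtered by property. The sketch below imitates that in plain Python, roughly the intent of a query such as `SELECT * FROM c WHERE c.age > 40`. The `select` helper is hypothetical and runs in memory; it is not the DocumentDB query engine.

```python
# The four documents from the slide; note that their properties differ.
docs = [
    {"name": "John", "country": "Canada", "age": 43, "lastUse": "March 4, 2014"},
    {"name": "Eva", "country": "Germany", "age": 25},
    {"name": "Lou", "country": "Australia", "age": 51, "firstUse": "May 8, 2013"},
    {"docCount": 3, "last": "May 1, 2014"},
]

def select(documents, field, predicate):
    """Return documents that have `field` and whose value satisfies `predicate`.

    Documents that lack the field are skipped instead of raising an error,
    which is what lets differently shaped documents share one collection.
    """
    return [d for d in documents if field in d and predicate(d[field])]

over_40 = select(docs, "age", lambda age: age > 40)
names = [d["name"] for d in over_40]
print(names)
```

The fourth document has no `age` property at all, so it simply does not match.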
A NoSQL document database-as-a-service, fully managed by Azure
Azure DocumentDB Document store
{
  "name": "SmugMug",
  "permalink": "smugmug",
  "homepage_url": "http://www.smugmug.com",
  "blog_url": "http://blogs.smugmug.com/",
  "category_code": "photo_video",
  "products": [
    { "name": "SmugMug", "permalink": "smugmug" }
  ],
  "offices": [
    {
      "description": "",
      "address1": "67 E. Evelyn Ave",
      "address2": "",
      "zip_code": "94041",
      "city": "Mountain View",
      "state_code": "CA",
      "country_code": "USA",
      "latitude": 37.390056,
      "longitude": -122.067692
    }
  ]
}
Perfect for: schema-agnostic JSON store for hierarchical and denormalized data at scale
What documents?
Not Word documents
Azure DocumentDB details
Native support for JavaScript, SQL query, and transactions over JSON documents
Ideal for apps designed for the cloud when the following are high priorities:
Reliable and predictable performance
• Tunable consistency
• Elastic scale
Rapid development
• Build with familiar tools: REST, JSON, JavaScript
Rich query and transactions over JSON data
• Query JSON data with no secondary indices
Top Features
Auto-scaling/sharding
• Improved scalability and reliability due to distribution of large data sets across multiple machines
Automatic indexing
• All document properties are available for queries
• Frees you from relying on schemas or secondary indexes
SQL query language
• Make use of SQL experience and .NET LINQ
Managed service
• Spin up on demand with no setup
• Availability guarantee of 99.95%
• Linear price curve without virtual-machine step functions
• Integration with Azure HDInsight and Azure Search
Top Features
Greater consistency control
• Four consistency levels provide more options for consistency, availability, and performance requirements
Atomicity, Consistency, Isolation, and Durability (ACID) transaction control
• Simpler programming model (compared to state variables)
• Use JavaScript for insert, update, and delete actions
Standards-based open API with RESTful HTTP
• Uses the JSON standard, so no mapping of Binary JSON (BSON) to JSON is needed
Granular access rights
• Allows access to all documents and attachments within collections
Monitor an account
• View performance metrics for a DocumentDB account
• Customize performance metric views for a DocumentDB account
• Create side-by-side performance metric charts
• View usage metrics for a DocumentDB account
• Set up performance metric alerts for a DocumentDB account
Today’s modern apps
• Produce and consume data at a staggering rate
• Require instantaneous response times to match user expectations
• Are developed iteratively with many versions supported concurrently
• Are developed with continuously evolving data models
• Are increasingly complex
• Experience unpredictable and explosive growth
Ideal Scenarios
Well-suited for web and mobile apps:
• Catalog data
• Preferences and state
• Event store
• User-generated content
• Data exchange
Azure DocumentDB at Microsoft: user data store
• More than 450 million unique users
• Store 20 TB of JSON document data
• Under 15 millisecond (ms) writes and single-digit ms reads
• Store for 40+ app/device combinations
• Available globally to serve all markets
Standard pricing tier with hourly billing
RU = Request Unit (represents the processing capacity required to read a single 1 KB JSON document consisting of 10 unique property values)
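Taking the RU definition at face value (one RU per read of a single 1 KB document), a back-of-the-envelope capacity estimate can be sketched as below. The provisioned throughput figures and the 5 RU cost of a heavier operation are hypothetical; the deck does not state what writes or queries cost, so that cost is left as a parameter.

```python
RU_PER_1KB_READ = 1  # from the definition above: one 1 KB read costs 1 RU

def operations_per_second(provisioned_ru_per_sec, ru_per_operation=RU_PER_1KB_READ):
    """How many operations of a given RU cost fit into the provisioned throughput."""
    return provisioned_ru_per_sec // ru_per_operation

def required_ru_per_second(operations_per_sec, ru_per_operation=RU_PER_1KB_READ):
    """Throughput needed to sustain a given operation rate."""
    return operations_per_sec * ru_per_operation

# Hypothetical collection provisioned at 2,500 RU per second.
print(operations_per_second(2500))       # 1 KB reads per second
print(operations_per_second(2500, 5))    # operations assumed to cost 5 RU each
print(required_ru_per_second(400))       # RU/s needed for 400 reads per second
```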
Azure DocumentDB basics
Resource model
• Entities addressable by logical Uniform Resource Identifier (URI)
• Partitioned for scale out
• Replicated for high availability
• Entities represented as JSON
• Accounts scale out by moving a slider
Interaction model
• RESTful interaction over HTTPS
• HTTPS and TCP connectivity
• Standard HTTP verbs and semantics
Development
• .NET, Node.js, Python, Java, and JavaScript clients
• SQL for query expression, .NET LINQ
• JavaScript for server-side app logic
[Figure: resource hierarchy. An Azure DocumentDB account contains databases; a database contains users (with permissions) and collections; a collection contains documents (with attachments) plus stored procedures, triggers, and user-defined functions written in JavaScript]
Azure DocumentDB collections
• Collections != tables
• Unit of partitioning
• Transaction boundary
• No enforced schema, flexible
• Documents that are queried or updated together stay in one collection
• Elasticity to 10 GB
• RUs evenly distributed across partitions
[Figure: a collection containing documents (with attachments), stored procedures, triggers, and user-defined functions]
Elastic collections
• Collection != single partition
• Partition count dynamic
• Each partition (key) is 10 GB
• Online splits and merges with full availability
• RUs evenly distributed across partitions
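The deck does not say how documents are assigned to partitions, so the sketch below shows generic hash partitioning as one plausible way a partition key could spread documents (and therefore load) evenly. The function names, the key format, and the partition count are all assumptions.

```python
import hashlib

def partition_for(partition_key, partition_count):
    """Map a partition key to a partition index using a stable hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % partition_count

def distribute(keys, partition_count):
    """Group keys by the partition each one hashes to."""
    buckets = {index: [] for index in range(partition_count)}
    for key in keys:
        buckets[partition_for(key, partition_count)].append(key)
    return buckets

# Hypothetical workload: 1,000 user keys spread over 4 partitions.
keys = [f"user-{n}" for n in range(1000)]
buckets = distribute(keys, 4)
print({index: len(members) for index, members in buckets.items()})
```

Because the hash is stable, the same key always lands in the same partition, so a lookup by key only has to visit one partition.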
Rich query over JSON data
Native JavaScript transactional processing
Familiar SQL-based query language
Query on JSON data without specifying secondary indices or constructing views
Build modern, scalable apps with robust transactional querying and data processing on JSON documents
JavaScript transactions
Transactionally process multiple documents with application-defined stored procedures and triggers
• JavaScript as the procedural language
• Language integrated
• Execution wrapped in an implicit transaction
• Preregistered and scoped to a collection
• Performed with ACID guarantees
• Triggers invoked as pre- or post-operations
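"Execution wrapped in an implicit transaction" means that everything a stored procedure changes is applied together or not at all. The sketch below models only that all-or-nothing behavior, in Python instead of JavaScript and with a hypothetical in-memory `Collection` class; it is not the DocumentDB server-side API.

```python
import copy

class Collection:
    """Toy collection that runs procedures with all-or-nothing semantics."""

    def __init__(self, documents=None):
        self.documents = documents or {}

    def execute(self, procedure, *args):
        """Run `procedure` on a working copy and commit only if it succeeds."""
        working = copy.deepcopy(self.documents)
        result = procedure(working, *args)  # an exception here discards `working`
        self.documents = working            # commit every change at once
        return result

def transfer(docs, source, target, amount):
    """Move `amount` between two documents; fails if the source would go negative."""
    docs[source]["balance"] -= amount
    if docs[source]["balance"] < 0:
        raise ValueError("insufficient funds")
    docs[target]["balance"] += amount

accounts = Collection({"a": {"balance": 50}, "b": {"balance": 0}})
accounts.execute(transfer, "a", "b", 30)       # succeeds and is committed

try:
    accounts.execute(transfer, "a", "b", 100)  # fails partway through
except ValueError:
    pass                                       # nothing from this call was kept

print(accounts.documents)
```

The failed call had already decremented the source balance on its working copy, but because that copy was never committed the stored documents are unchanged.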
Reliable and predictable performance
Tunable consistency
Tune and trade off consistency and performance through well-defined levels to suit application-scenario needs, from eventual to strong
Elastic scale
Enterprise-tested by a large, internal, consumer-facing service
Fast, predictable performance
Defined throughput levels that scale linearly with application needs
Azure DocumentDB was born in the cloud to achieve fast, predictable performance, with reserved resources to deliver on throughput needs. It delivers reliable, tunable consistency so that you can increase performance based on application needs.
Four consistency levels
• Strong
• Session
• Bounded Staleness
• Eventual
Lower consistency level on read operations:
Document myDoc = await client.ReadDocumentAsync(documentLink, new RequestOptions { ConsistencyLevel = ConsistencyLevel.Eventual });
Consistency levels enable guarantees
Choose your consistency level and make predictable trade-offs between consistency, availability, and performance
Choose your level
• Strong: Data consistency
• Session: Monotonic reads (on explicit read requests) and writes
• Bounded Staleness: Total order of propagation of writes
• Eventual: Lowest latency for reads and writes
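The trade-off between levels can be illustrated with a toy model of one primary copy and one lagging replica. The sketch below is an assumption-laden simplification (the class, its methods, and the session token are hypothetical, and Bounded Staleness is left out); it is not how the service is implemented.

```python
class ReplicatedStore:
    """Toy primary plus one lagging replica, to contrast read guarantees."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.log = []      # ordered list of writes
        self.applied = 0   # how many log entries the replica has applied

    def write(self, key, value):
        """Write to the primary and return a session token (log position)."""
        self.primary[key] = value
        self.log.append((key, value))
        return len(self.log)

    def replicate(self, steps=1):
        """Apply the next `steps` writes to the replica, in order."""
        for key, value in self.log[self.applied:self.applied + steps]:
            self.replica[key] = value
            self.applied += 1

    def read_strong(self, key):
        """Always reflects the latest write."""
        return self.primary.get(key)

    def read_session(self, key, token):
        """Never older than the caller's own writes (read-your-writes)."""
        if self.applied >= token:
            return self.replica.get(key)
        return self.primary.get(key)

    def read_eventual(self, key):
        """Cheapest read; may be stale until replication catches up."""
        return self.replica.get(key)

store = ReplicatedStore()
token = store.write("x", 1)
print(store.read_eventual("x"))        # None: the replica has not caught up
print(store.read_session("x", token))  # 1: the writer sees its own write
store.replicate()
print(store.read_eventual("x"))        # 1: the replica has converged
```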
Security model
Azure DocumentDB is designed to be secure with:
• Master key
• Access control on resources
• User operations
• Permission operations
• Code execution
Rapid development
Easy to start and fully managed
Enterprise-grade Azure platform
Build with familiar tools—REST, JSON, and JavaScript
Reduce development friction and complexity when building new business-class applications by using familiar tools and industry-standard platforms. Combine Azure DocumentDB with a portfolio of complementary cloud services on the Azure platform, such as the Azure HDInsight Connector and Azure Search Indexer.
Tools
• Microsoft Cloud Explorer for Visual Studio: https://azure.microsoft.com/en-us/blog/exploring-azure-documentdb-in-visual-studio/
• Azure DocumentDB data-migration tool: https://azure.microsoft.com/en-us/documentation/articles/documentdb-import-data/
• Document Explorer in Azure portal: http://portal.azure.com
Azure DocumentDB service summary
Unique among NoSQL stores:
• Developed for the cloud and for delivery as a service
• Truly queryable JSON store
• Transactional processing through language-integrated JavaScript
• Predictable performance and tunable consistency
Development scenarios
Consider Azure DocumentDB when you need:
• To build new web and mobile cloud-based applications
• Rapid development and high-scalability requirements
• Query and processing of user- and device-generated data
• More query and processing support for your key-value stores
• To run a document store in virtual machines
• A managed service model
Build your first Azure DocumentDB app today
Get support
• Schedule a 1:1 chat directly with the Azure DocumentDB engineering team at askdocdb.com
Give feedback
• Ask questions through the forum at http://aka.ms/docdbforum
• Suggest an idea and vote to support other ideas for Azure DocumentDB at http://aka.ms/docdbideas
• On Twitter @documentdb
Get started
• Sign up for Azure DocumentDB at http://aka.ms/docdbstart
• Access and configure your account at http://portal.azure.com
• Download an SDK from http://aka.ms/docdbsdks, and then build a sample at http://aka.ms/docdbsample
Using DocumentDB Query Playground
• Go to http://www.documentdb.com/sql/demo
• Test out sample queries or write your own against the dataset
Learn more
• David Chappell NoSQL overview paper on Infopedia: http://www.davidchappell.com/writing/white_papers/Azure-NoSQL-Technologies-v2.0--Chappell.pdf
• Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement [book]: http://www.pdfiles.com/pdf/files/English/Databases/Seven_Databases_In_Seven_Weeks.pdf
• Replicated Data Consistency Explained Through Baseball [paper]: http://research.microsoft.com/apps/pubs/default.aspx?id=206913
Q & A
James Serra, Big Data Evangelist
• Email me at: [email protected]
• Follow me at: @JamesSerra
• Link to me at: www.linkedin.com/in/JamesSerra
• Visit my blog at: JamesSerra.com (where this slide deck will be posted)