IoT day 2015
NoSQL in Azure for IoT
(and Business)
Marco Parenzan, Microsoft Azure MVP
@marco_parenzan
marco [dot] parenzan [at] 1nn0va [dot] it
Speaker info/Marco Parenzan
www.slideshare.net/marco.parenzan
www.github.com/marcoparenzan
marco [dot] parenzan [at] 1nn0va [dot] it
www.1nnova.it
@marco_parenzan
Training, outreach, and consulting with 1nn0va
Microsoft MVP 2014 for Microsoft Azure
Cloud Architect, .NET developer
Loves Functional Programming, HTML5 Game Programming and Internet of Things
Microservices Saturday 2015: a journey with NServiceBus
Azure Community Bootcamp 2015
IoT as a hobby (now…?)
Data Ecosystem
Where do I put data received in Event Hub?
From private to public Cloud
A Continuous offering
Microsoft Relational Storage Options
SQL Server database technology “as a Service”
Fully managed database-as-a-service built on SQL with near zero administration
Enterprise-ready with automatic support for HA, DR, Backups, replication and more
Highly available and elastically scalable for unpredictable SaaS workloads
Uptime SLA of 99.99%
Predictable performance & Pricing
Built-in regional database geo-replication for additional protection
All core search capabilities - faceting, suggestions, geospatial
Secure and compliant for your sensitive data
Fully compatible with SQL Server 2014 databases
SQL Azure features
[Diagram: The Microsoft data platform. Recoverable labels: data (relational, non-relational, NoSQL, streaming; internal & external), information management, orchestration, complex event processing, modeling, machine learning, applications, reports, dashboards, natural language query, mobile.]
The traditional world
Business, no longer data, is the foundation of software design
DDD != OOP
Don’t start from data
Data are not unique
No more ACID: ACID transactions are not useful in a distributed model spread across different stores
Paradigm Shift
How many queries can be identified at analysis time?
“A repository should offer an explicit and well-defined contract
and avoid arbitrary queries”
In business … don’t delete anything (a repository doesn’t
delete anything)
From theory to practice
[Diagram: Classic MVC. Labels: View, Controller, Business Logic, Contract BL/P.]
[Diagram: CQRS (Service Bus powered). Labels: UI, Command Handler, Event Handler, Event, Queue, Topics/Subscription.]
[Diagram: CQRS for IoT (Service Bus powered). Labels: Device, Event Hub, UI, Command Handler, Event Handler, Event, Queue, Topics/Subscription, Write Model, Read/Search Model.]
No longer build on data… but on “what happens”
No more one single data store
Data store types
Logs
Persistence
Saga (long transactions)
Search
Event-based systems
The Big Picture
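The idea of building on "what happens" and never deleting anything can be sketched as an append-only event log with a read model rebuilt from it. This is a minimal in-memory illustration, not the speaker's implementation; the event type `TemperatureMeasured`, the device ids, and the helper names are all hypothetical.

```javascript
// Append-only event log: events are recorded, never updated or deleted.
const eventLog = [];

function appendEvent(type, payload) {
  const event = Object.freeze({ sequence: eventLog.length + 1, type, payload });
  eventLog.push(event);
  return event;
}

// Read model: latest temperature per device, rebuilt by replaying the log.
function projectLatestReadings(events) {
  const latest = new Map();
  for (const e of events) {
    if (e.type === "TemperatureMeasured") {
      latest.set(e.payload.deviceId, e.payload.celsius);
    }
  }
  return latest;
}

appendEvent("TemperatureMeasured", { deviceId: "dev-1", celsius: 20.5 });
appendEvent("TemperatureMeasured", { deviceId: "dev-2", celsius: 18.0 });
appendEvent("TemperatureMeasured", { deviceId: "dev-1", celsius: 21.0 });

const readModel = projectLatestReadings(eventLog);
console.log(readModel.get("dev-1")); // 21
```

The write side only appends; the read side can be thrown away and rebuilt at any time, which is why the two can live in different stores.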
A modern view:
The traditional world in Azure
Why Use a NoSQL Technology on Azure?
Choosing a Data Technology
Db for what?
To store data?
To manipulate data?
Long-term theme
NoSQL Introduction
Key/Value
Table
Blob
Queue
Graph
Document
Not Only SQL Paradigms
What is a document database?
Definitely NOT this kind of document!
What is a document database?
Not ideal, but it can work -
{
"id": "13244_post",
"text": "Lorizzle ghetto dolor tellivizzle boofron, stuff pimpin' elizzle. Nullam sapizzle
velizzle, my shizz tellivizzle, suscipizzle funky fresh, shizzle my nizzle crocodizzle
vizzle, arcu. Pellentesque eget tortizzle. Sizzle erizzle. Mammasay mammasa mamma oo sa
break it down dolor own yo' things fo shizzle mah nizzle fo rizzle, mah home g-dizzle
sure. Maurizzle pellentesque dawg ghetto turpizzle. Shiz izzle my shizz. Pellentesque
eleifend rhoncizzle nisi. In its fo rizzle owned ma nizzle dictumst. Sizzle gangsta.
Curabitur tellizzle urna, pretizzle go to hizzle, mattizzle izzle, eleifend vitae,
tellivizzle. Dawg shizzlin dizzle. Integer semper velit sizzle stuff.
Boofron mofo auctizzle ma nizzle. Pot a elizzle ut nibh pretium tincidunt. Maecenizzle
things erat. Own yo' in lacizzle sed maurizzle elementizzle tristique. I'm in the
shizzle yippiyo sizzle daahng dawg eros ultricizzle . In velit tortor, ultricizzle
ghetto, hendrerizzle fo shizzle mah nizzle fo rizzle, mah home g-dizzle, adipiscing
crunk, boom shackalack. Etizzle velit doggy, hizzle consequizzle, pharetra get down
get down, dictizzle sed, shut the shizzle up. Fo shizzle neque. Fo lorizzle. Bling
bling vitae pizzle ut libero commodo gizzle. Fusce izzle augue eu yo mamma dang.
Phasellizzle break it down fo nizzle erat. Suspendisse shizzlin dizzle owned,
sollicitudin sizzle, mah nizzle izzle, commodo nec, justo. Donizzle fizzle
porttitizzle ligula. Nunc feugizzle, tellus tellivizzle ornare tempor, sapizzle break
it down tincidunt gangster, eget dapibus daahng dawg enizzle izzle that's the shizzle.
Stuff quizzle leo, imperdizzle izzle, fo shizzle my nizzle izzle, semper izzle,
sapien. Ut boofron magna vizzle ghetto. I'm in the shizzle ante bling bling,
suscipizzle vitae, yo mamma stuff, rutrizzle pizzle, velizzle.
Mauris da bomb go to zzle. Sizzle mammasay mammasa mamma oo sa magna own yo' amet risus
congue. Boofron mofo auctizzle ma nizzle. Pot a elizzle ut nibh pretium tincidunt.
things erat. Own yo' in lacizzle sed maurizzle elementizzle tristique. I'm in the
shizzle yippiyo sizzle daahng dawg eros ultricizzle . In velit tortor, ultricizzle
ghetto, hendrerizzle fo shizzle mah nizzle fo rizzle, mah home g-dizzle, adipiscing
crunk, boom shackalack. Etizzle velit doggy, hizzle consequizzle, pharetra get down
get down, dictizzle sed, shut the shizzle up. Fo shizzle neque. Fo lorizzle. Bling "
}
What is a document database?
Ideally suited to this
kind of document -
{
  "id": "13244_user",
  "firstName": "John",
  "lastName": "Smith",
  "age": 25,
  "employmentHistory": [
    {
      "company": "Contoso Inc",
      "start": {"date": "Thu, 02 Apr 2015 20:54:45 GMT", "epoch": 1428008086},
      "position": "CEO"
    },
    {
      "start": {"date": "Thu, 02 Apr 2012 20:54:45 GMT", "epoch": 1428008086},
      "end": {"date": "Thu, 01 Apr 2015 20:54:45 GMT", "epoch": 1428008086},
      "position": "GM"
    }
  ],
  "address": {
    "streetAddress": "21 2nd Str",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021"
  },
  "children": [
    {"name": "Megan", "age": 10},
    {"name": "Bruce", "age": 7},
    {"name": "Angus", "sports": ["football", "basketball", "hockey"]}
  ],
  "mobileNumber": "212 555-1234"
}
JSON can represent complex containment relationships that are
difficult to represent in an RDBMS
Schema-less – great for growing requirements during development, unlike
an RDBMS, where you must know the structure up front and it's
painful to modify it
Native notation for JavaScript
Why JSON?
When working with relational databases, we've been taught for years to normalize, normalize, normalize.
With a document database, try to treat your entities as self-contained documents represented in JSON.
There are "contains" relationships between entities.
There are one-to-few relationships between entities.
There is embedded data that changes infrequently.
There is embedded data that won't grow without bound.
There is embedded data that is integral to data in a document.
Embedding
better read performance
Representing one-to-many relationships.
Representing many-to-many relationships.
Related data changes frequently.
Referenced data could be unbounded
Provides more flexibility than embedding
More round trips to read data
Referencing
Normalizing typically provides better write performance
• No magic bullet
Think about how your data is going to be written and read, and model accordingly
Hybrid models ~ denormalize + reference + aggregate
{
  "id": "1",
  "firstName": "Thomas",
  "lastName": "Andersen",
  "countOfBooks": 3,
  "books": [1, 2, 3],
  "images": [
    {"thumbnail": "http://....png"},
    {"profile": "http://....png"}
  ]
}
{
  "id": 1,
  "name": "DocumentDB 101",
  "authors": [
    {"id": 1, "name": "Thomas Andersen", "thumbnail": "http://....png"},
    {"id": 2, "name": "William Wakefield", "thumbnail": "http://....png"}
  ]
}
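The trade-off between referencing and the hybrid model above can be made concrete by counting reads. This is a minimal in-memory sketch: the `Map` stands in for a collection, and `readAuthor`, `authorIds`, and the counter are hypothetical names used only for illustration.

```javascript
// In-memory stand-in for an authors collection, keyed by document id.
const authors = new Map([
  ["1", { id: "1", firstName: "Thomas", lastName: "Andersen" }],
  ["2", { id: "2", firstName: "William", lastName: "Wakefield" }],
]);

// Referenced model: the book stores only author ids.
const bookWithReferences = { id: "1", name: "DocumentDB 101", authorIds: ["1", "2"] };

// Hybrid model: the book embeds the few author fields its page needs.
const bookWithEmbeddedAuthors = {
  id: "1",
  name: "DocumentDB 101",
  authors: [
    { id: "1", name: "Thomas Andersen" },
    { id: "2", name: "William Wakefield" },
  ],
};

let reads = 0;
function readAuthor(id) {
  reads += 1; // each call models one extra round trip to the store
  return authors.get(id);
}

function authorNamesByReference(book) {
  return book.authorIds.map((id) => {
    const author = readAuthor(id);
    return `${author.firstName} ${author.lastName}`;
  });
}

function authorNamesEmbedded(book) {
  return book.authors.map((author) => author.name); // no extra reads
}

const viaReference = authorNamesByReference(bookWithReferences);
const readsForReference = reads;
const viaEmbedding = authorNamesEmbedded(bookWithEmbeddedAuthors);
const readsForEmbedding = reads - readsForReference;
```

Both calls return the same names, but the referenced version costs one read per author while the embedded version costs none. The price of embedding is that a renamed author must be updated in every book that copies the name.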
Promote code first development (mapping objects to json)
Resilient to iterative schema changes
Richer query and indexing (compared to KV stores)
Low impedance as object / JSON store; no ORM required
It just works
It’s fast
Developer Appeal
DocumentDB Introduction
Store schema-less JSON documents
Excels at search w/ SQL syntax
JavaScript for Stored Procs, Triggers and UDFs
Elastic capacity (though not, so far, in the Azure-specific sense of the term)
Multi-document transactions (batch)
Tweak everything (read/write performance vs. consistency, index performance, security)
Designed for massive scale
What is DocumentDB?
Applications that need managed elastic scale
Customer does not want to add additional IT resources for
support and maintenance
Avoiding CAPEX and OPEX
Built-for-the-cloud database technology
Access via RESTful HTTP API or client library
DocumentDB: DbaaS
Catalog data
Preferences and state
Event store
User generated content
Data exchange
Typical usage
Resource Model
Database Account
Database
Collections
* collection != table of homogenous entities
collection ~ a data partition
Documents
{
  "id": "123",
  "name": "joe",
  "age": 30,
  "address": {
    "street": "some st"
  }
}
Users, Server Scripts, Attachments
Collections
a container of JSON documents and the associated JavaScript
application logic
JSON docs inside of a collection can vary dramatically
A unit of scale for transaction and query throughput (capacity
units allocated uniformly across all collections)
A unit of scale for capacity
A unit of replication
What is a collection?
Collections in DocumentDB are not just logical containers, but also physical containers
They are the transaction boundary for stored procedures and triggers
They are the entry point for queries and CRUD operations
Each collection is assigned a reserved amount of throughput which is not shared with other collections in the same account
Collections do not enforce schema
Collections
Partitioning
Design: Partitioning
Why Partition?
• Data size: a single collection (currently*) holds 10 GB
• Throughput: 3 performance tiers with a max of 2,500 RU/sec
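The two limits above translate into simple arithmetic for a first sizing estimate. This is a rough sketch that assumes data and requests spread evenly across collections; `collectionsNeeded` is a hypothetical helper, and the constants are the 2015 figures quoted on the slide.

```javascript
// Limits quoted on the slide (2015 figures).
const MAX_GB_PER_COLLECTION = 10;
const MAX_RU_PER_COLLECTION = 2500;

// Hypothetical helper: smallest number of collections that satisfies both
// limits, assuming data and requests spread evenly across collections.
function collectionsNeeded(dataGb, requestUnitsPerSecond) {
  const forStorage = Math.ceil(dataGb / MAX_GB_PER_COLLECTION);
  const forThroughput = Math.ceil(requestUnitsPerSecond / MAX_RU_PER_COLLECTION);
  return Math.max(1, forStorage, forThroughput);
}

console.log(collectionsNeeded(35, 4000));  // 4 (storage-bound)
console.log(collectionsNeeded(5, 10000));  // 4 (throughput-bound)
```

Whichever limit is hit first decides the partition count, so a small but busy data set can need as many collections as a large quiet one.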
In hash partitioning, partitions are assigned based on the value
of a hash function, allowing you to evenly distribute requests
and data across a number of partitions. This is commonly used
to partition data produced or consumed from a large number of
distinct clients, and is useful for storing user profiles, catalog
items, and IoT ("Internet of Things") telemetry data.
Hash Partitioning
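Hash partitioning as described above can be sketched in a few lines. This is an illustration, not the SDK's partition resolver: it uses a 32-bit FNV-1a hash over the key's UTF-16 code units (my choice of hash), and the collection names are hypothetical.

```javascript
// 32-bit FNV-1a hash over the UTF-16 code units of the key.
function fnv1a(key) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value unsigned
  }
  return hash;
}

// Pick the collection for a partition key by hashing it modulo the
// number of collections.
function resolveHashPartition(partitionKey, collectionLinks) {
  const index = fnv1a(String(partitionKey)) % collectionLinks.length;
  return collectionLinks[index];
}

const telemetryCollections = ["telemetry-0", "telemetry-1", "telemetry-2"];
const target = resolveHashPartition("device-42", telemetryCollections);
console.log(target);
```

The same device id always maps to the same collection, and many device ids spread roughly evenly. Note that plain modulo hashing remaps most keys when the number of collections changes, so growth has to be planned for.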
In range partitioning, partitions are assigned based on whether
the partition key is within a certain range
This is commonly used for partitioning with timestamp properties
Keep current data hot, keep historical data warm, scale down older
data, purge or archive
Range partitioning
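Range partitioning on a timestamp can be sketched as a sorted list of half-open ranges. This is a minimal illustration with hypothetical collection names; each range covers one calendar year in epoch seconds.

```javascript
// Half-open ranges [from, to) in epoch seconds, one collection per year.
const ranges = [
  { from: Date.UTC(2014, 0, 1) / 1000, to: Date.UTC(2015, 0, 1) / 1000, collection: "telemetry-2014" },
  { from: Date.UTC(2015, 0, 1) / 1000, to: Date.UTC(2016, 0, 1) / 1000, collection: "telemetry-2015" },
];

// Where does a document with this timestamp go?
function resolveRangePartition(epochSeconds) {
  const match = ranges.find((r) => epochSeconds >= r.from && epochSeconds < r.to);
  if (!match) throw new RangeError(`No partition covers epoch ${epochSeconds}`);
  return match.collection;
}

// Which collections must a query over [fromEpoch, toEpoch] touch?
function collectionsForQuery(fromEpoch, toEpoch) {
  return ranges
    .filter((r) => r.from <= toEpoch && r.to > fromEpoch)
    .map((r) => r.collection);
}

console.log(resolveRangePartition(1401662986)); // telemetry-2014
console.log(collectionsForQuery(1401662986, 1428008086));
```

Because ranges follow time, old collections can be scaled down, archived, or dropped as a whole, and a time-bounded query only touches the collections whose range overlaps it.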
In lookup partitioning, partitions are assigned based on a lookup map that assigns discrete partition values to specific partitions a.k.a. a partition or shard map
This is commonly used for partitioning by region
Lookup partitioning
Tenant          Partition Id
Customer        1
Big Customer    2
Another         3
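The tenant table above is exactly a lookup (shard) map. A minimal sketch, with `resolveLookupPartition` as a hypothetical helper name:

```javascript
// Shard map taken from the slide: tenant -> partition id.
const partitionMap = new Map([
  ["Customer", 1],
  ["Big Customer", 2],
  ["Another", 3],
]);

function resolveLookupPartition(tenant) {
  if (!partitionMap.has(tenant)) {
    throw new Error(`No partition assigned to tenant "${tenant}"`);
  }
  return partitionMap.get(tenant);
}

console.log(resolveLookupPartition("Big Customer")); // 2
```

Unlike hashing, the assignment is explicit, so a large tenant can be given a partition of its own. The map itself becomes data that must be stored and kept available.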
{ "record": "1", "created": { "date": "6/1/2014", "epoch": 1401662986 } },
{ "record": "3", "created": { "date": "9/23/2014", "epoch": 1411512586 } },
{ "record": "123", "created": { "date": "8/17/2013", "epoch": 1376779786 } }
SELECT * FROM root r WHERE r.created.epoch BETWEEN 1376779786 AND 1401662986
{ "record": "1", "created": { "date": "6/1/2014", "epoch": 1401662986 } },
{ "record": "3", "created": { "date": "9/23/2014", "epoch": 1411512586 } },
{ "record": "43233", "created": { "epoch": 1411512586 } },
{ "record": "1123", "created": { "date": "8/17/2013", "epoch": 1376779786 } },
{ "record": "43234", "created": { "epoch": 1376779786 } }
Partitioning - Fan-out Queries
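A fan-out query sends the same filter to every partition and merges the partial results on the client. The sketch below uses in-memory arrays as stand-ins for two collections, populated with records from the slide; `queryPartition` and `fanOutQuery` are hypothetical names, and a real client would issue the per-partition queries as network calls, ideally in parallel.

```javascript
// Two partitions (collections), modeled as arrays of documents.
const partitions = [
  [
    { record: "1", created: { date: "6/1/2014", epoch: 1401662986 } },
    { record: "3", created: { date: "9/23/2014", epoch: 1411512586 } },
    { record: "123", created: { date: "8/17/2013", epoch: 1376779786 } },
  ],
  [
    { record: "43233", created: { epoch: 1411512586 } },
    { record: "1123", created: { date: "8/17/2013", epoch: 1376779786 } },
    { record: "43234", created: { epoch: 1376779786 } },
  ],
];

// Stand-in for running the BETWEEN query against one collection.
function queryPartition(docs, fromEpoch, toEpoch) {
  return docs.filter((d) => d.created.epoch >= fromEpoch && d.created.epoch <= toEpoch);
}

// Fan out to every partition, then merge and order the partial results.
function fanOutQuery(allPartitions, fromEpoch, toEpoch) {
  return allPartitions
    .flatMap((docs) => queryPartition(docs, fromEpoch, toEpoch))
    .sort((a, b) => a.created.epoch - b.created.epoch);
}

const hits = fanOutQuery(partitions, 1376779786, 1401662986);
console.log(hits.map((d) => d.record));
```

Ordering, paging, and aggregation across partitions become the client's job, which is why a partition key that lets most queries hit a single partition is worth the design effort.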
Consistency
Query / transaction throughput (and reliability – i.e., hardware failure) depend on
replication!
All writes to the primary are replicated across two secondary replicas
All reads are distributed across three copies
“Scalability of throughput” – allowing different clients to read from different replicas helps prevent
bottlenecks
BUT replication takes time!
Potential scenario: some clients are
reading while another is writing
Now, the data is out-of-date, inconsistent!
Why worry about consistency?
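The scenario above (one client writes while another reads from a replica that has not caught up) can be reproduced with a toy model. This is an illustration only and does not describe DocumentDB's actual replication protocol; the class and method names are hypothetical.

```javascript
// Toy model: one primary, two secondaries, replication as an explicit step.
class ReplicatedStore {
  constructor() {
    this.primary = new Map();
    this.secondaries = [new Map(), new Map()];
    this.pending = []; // writes not yet copied to the secondaries
  }

  write(key, value) {
    this.primary.set(key, value);
    this.pending.push([key, value]);
  }

  // Models replication catching up some time after the write.
  replicate() {
    for (const [key, value] of this.pending) {
      for (const replica of this.secondaries) replica.set(key, value);
    }
    this.pending = [];
  }

  readFromPrimary(key) {
    return this.primary.get(key);
  }

  readFromSecondary(key, index) {
    return this.secondaries[index].get(key);
  }
}

const store = new ReplicatedStore();
store.write("temp", 20);
store.replicate();
store.write("temp", 25); // not yet replicated

const fresh = store.readFromPrimary("temp");        // 25
const stale = store.readFromSecondary("temp", 0);   // 20, out of date
store.replicate();
const caughtUp = store.readFromSecondary("temp", 0); // 25
```

Reading from secondaries spreads the load, but until replication completes a reader can see the old value. The consistency levels that follow are different answers to how much of that staleness a client is willing to accept.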
Trade-off: speed (performance & availability) or consistency (data correctness)?
“Does every read need the MOST current data?”
“Or do I need every request to be handled and handled quickly?”
No “one size fits all” answer … so it’s up to you!
4 options … for the entire database …
… In a future release, we intend to support overriding the default consistency level on a per-collection basis.
Tweakable Consistency
Client always sees completely consistent data
Slowest reads / writes
Mission critical: e.g., stock market, banking, airline reservation
Strong
Default – even trade-off between performance & availability vs.
data correctness
client reads its own writes, but other clients reading this same
data might see older values
Session
Client might see old data, but it can specify a limit for how old
that data can be (e.g., 2 seconds)
Updates happen in order received
similar to Session consistency, but speeds up reads while still
preserving the order of updates
Bounded Staleness
client might see old data for as long as it takes a write to
propagate to all replicas
High performance & availability, but a client might sometimes
read out-of-date information or see updates out of order
Eventual
At the database level (see preview portal)
On a per-read or per-query basis (optional parameter on
CreateDocumentQuery method)
Setting Consistency
Use Weaker Consistency Levels for better Read latencies
• IoT
• Data Analysis
http://azure.microsoft.com/blog/2015/01/27/performance-tips-for-azure-documentdb-part-2/
Consistency Tips
Indexing
Efficient, rich hierarchical and relational queries without any schema or
index definitions.
Consistent query results while handling a sustained volume of writes. For
high write throughput workloads with consistent queries, the index is
updated incrementally, efficiently, and online while handling a sustained
volume of writes.
Storage efficiency. For cost effectiveness, the on-disk storage overhead of
the index is bounded and predictable.
Indexing
Indexing modes
Consistent: default mode; the index is updated synchronously on writes
Lazy: useful for bulk ingestion scenarios
Indexing policies
Automatic: default
Manual: can choose to index documents via RequestOptions; can read non-indexed documents via self-link
Indexing – Modes and policies
Set indexing mode
var collection = new DocumentCollection
{
    Id = "lazyCollection"
};
// Lazy mode updates the index asynchronously, which suits bulk ingestion.
collection.IndexingPolicy.IndexingMode = IndexingMode.Lazy;
await client.CreateDocumentCollectionAsync(databaseLink, collection);
Set indexing policy
var collection = new DocumentCollection
{
    Id = "manualCollection"
};
// Turn off automatic indexing; documents are then indexed only on request.
collection.IndexingPolicy.Automatic = false;
await client.CreateDocumentCollectionAsync(databaseLink, collection);
Index paths
Include and/or exclude paths
Index types
Hash: supported for strings and numbers; optimized for equality matches
Range: supported for numbers; optimized for comparison queries
Index precision
String precision: default is 3
Numeric precision: default is 3; increase for larger number fields
Indexing – Paths and types
Setting paths, types, and precision
var collection = new DocumentCollection
{
    Id = "Orders"
};
// Skip indexing for everything under "metaData".
collection.IndexingPolicy.ExcludedPaths.Add("/\"metaData\"/*");
// Hash index on the root path.
collection.IndexingPolicy.IncludedPaths.Add(new IndexingPath
{
    IndexType = IndexType.Hash,
    Path = "/"
});
// Range index on shippedTimestamp, with higher numeric precision.
collection.IndexingPolicy.IncludedPaths.Add(new IndexingPath
{
    IndexType = IndexType.Range,
    Path = @"/""shippedTimestamp""/?",
    NumericPrecision = 7
});
await client.CreateDocumentCollectionAsync(databaseLink, collection);
Use lazy indexing for faster peak time ingestion rates
Exclude unused paths from indexing for faster writes
Specify range index path type for all paths used in range queries
Vary index precision for write vs query performance vs storage
tradeoffs
http://azure.microsoft.com/blog/2015/01/27/performance-tips-for-azure-documentdb-part-2/
Indexing tips
Querying
Optimize for queries with small result sets for scalability
Limit use of scans (no range index, NOT, UDFs in WHERE)
Use page size (MaxItemCount) and continuation tokens
For large result sets, use a larger page size (1000)
Querying
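The page size and continuation token tips above follow a simple loop: ask for a page, keep the token, repeat until no token comes back. The sketch below models that loop against an in-memory array; `fetchPage` is a hypothetical stand-in for a query call, and the token format (a stringified offset) is an assumption made for the example, since real tokens are opaque.

```javascript
// Hypothetical stand-in for one query round trip. Returns at most
// maxItemCount items plus a continuation token, or null when done.
function fetchPage(docs, { maxItemCount, continuation }) {
  const start = continuation ? Number(continuation) : 0;
  const end = start + maxItemCount;
  return {
    items: docs.slice(start, end),
    continuation: end < docs.length ? String(end) : null,
  };
}

// Drain a result set page by page, counting round trips.
function readAll(docs, maxItemCount) {
  const results = [];
  let continuation = null;
  let pages = 0;
  do {
    const page = fetchPage(docs, { maxItemCount, continuation });
    results.push(...page.items);
    continuation = page.continuation;
    pages += 1;
  } while (continuation !== null);
  return { results, pages };
}

const sampleDocs = Array.from({ length: 2500 }, (_, i) => ({ id: String(i) }));
console.log(readAll(sampleDocs, 100).pages);  // 25 round trips
console.log(readAll(sampleDocs, 1000).pages); // 3 round trips
```

For a large result set, the larger page size cuts the number of round trips from 25 to 3, which is the point of the last tip.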
Query over heterogeneous documents without defining
schema or managing indexes
Query arbitrary paths, properties and values without
specifying secondary indexes or indexing hints
Execute queries with consistent results
Supported SQL features; predicates, iterations (arrays),
sub-queries, logical operators, UDFs, intra-document
JOINs, JSON transforms
In general, more predicates result in a larger request
charge.
Additional predicates can help if they result in narrowing
the overall result set.
LINQ Query
from book in client.CreateDocumentQuery<Book>(collectionSelfLink)
where book.Title == "War and Peace"
select book;

from book in client.CreateDocumentQuery<Book>(collectionSelfLink)
where book.Author.Name == "Leo Tolstoy"
select book.Author;
SQL Query Grammar
-- Nested lookup against index
SELECT B.Author
FROM Books B
WHERE B.Author.Name = "Leo Tolstoy"

-- Transformation, Filters, Array access
SELECT { Name: B.Title, Author: B.Author.Name }
FROM Books B
WHERE B.Price > 10 AND B.Language[0] = "English"

-- Joins, User Defined Functions (UDF)
SELECT udf.CalculateRegionalTax(B.Price, "USA", "WA")
FROM Books B
JOIN L IN B.Languages
WHERE L.Language = "Russian"
Query
Programmability
function region(doc)
{
    switch (doc.Location.Region)
    {
        case 0:
            return "North";
        case 1:
            return "Middle";
        case 2:
            return "South";
    }
}
The complexity of a query impacts the
request units consumed for an operation:
Use of user-defined functions (UDFs)
SELECT or WHERE clauses
To take advantage of indexing, try to have at least one filter against an indexed property when leveraging a UDF in the WHERE clause.
Query with user-defined function
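Why an indexed filter helps alongside a UDF can be seen by counting UDF invocations. This is an in-memory illustration, not a measurement of request units: the sample documents, the `Sensor.Class` filter, and the call counter are hypothetical, and a `default` branch has been added to the slide's `region` function for unknown values.

```javascript
// UDF from the slide, plus a default branch (an addition for this sketch).
function region(doc) {
  switch (doc.Location.Region) {
    case 0: return "North";
    case 1: return "Middle";
    case 2: return "South";
    default: return "Unknown";
  }
}

let udfCalls = 0;
function countedRegion(doc) {
  udfCalls += 1; // every call is work the index cannot do for us
  return region(doc);
}

const sensorDocs = [
  { id: "1", Sensor: { Class: 0 }, Location: { Region: 0 } },
  { id: "2", Sensor: { Class: 1 }, Location: { Region: 0 } },
  { id: "3", Sensor: { Class: 0 }, Location: { Region: 2 } },
  { id: "4", Sensor: { Class: 1 }, Location: { Region: 1 } },
];

// UDF as the only predicate: every document must be evaluated (a scan).
udfCalls = 0;
const scanResult = sensorDocs.filter((d) => countedRegion(d) === "North");
const scanCalls = udfCalls;

// Property filter first (the part an index could answer), UDF second.
udfCalls = 0;
const narrowedResult = sensorDocs
  .filter((d) => d.Sensor.Class === 0)
  .filter((d) => countedRegion(d) === "North");
const narrowedCalls = udfCalls;

console.log(scanCalls, narrowedCalls); // 4 2
```

With the extra predicate the UDF runs only on the documents that survive the first filter, here two instead of four.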
function count(filterQuery, continuationToken) {
    var collection = getContext().getCollection();

    // Maximum number of docs to process in one batch; when reached, return to
    // the client and request a continuation. Intentionally set low to
    // demonstrate the concept. This can be much higher. Try experimenting.
    // We've had it in the high thousands before seeing the stored procedure
    // time out.
    var maxResult = 25;

    // The number of documents counted.
    var result = 0;

    // The body of tryQuery is not shown on this slide.
    tryQuery(continuationToken);
}
Execute “explicit” JavaScript
code on a collection
Executing Stored Procedures
function normalize() {
    var collection = getContext().getCollection();
    var collectionLink = collection.getSelfLink();
    var doc = getContext().getRequest().getBody();

    var newDoc = {
        "Sensor": { "Id": doc.sensorId, "Class": 0 },
        "Degree": { "Value": doc.degreeValue, "Type": 0 },
        "Location": {
            "Name": doc.locationName,
            "Region": doc.locationRegion,
            "Longitude": doc.locationLong,
            "Latitude": doc.locationLat
        },
        "id": doc.id
    };

    // Update the request -- this is what is going to be inserted.
    getContext().getRequest().setBody(newDoc);
}
Execute “implicit” JavaScript
code on CRUD operations
(Insert, Update, Delete) on
collections
Triggers!
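To see what the normalize pre-trigger does to an incoming document, it can be run locally against a small mock of the server-side context. The mock below is an assumption made for illustration: it implements only the calls the trigger uses (`getCollection().getSelfLink()`, `getRequest().getBody()`, `getRequest().setBody()`), and the sample reading and self-link are hypothetical.

```javascript
// The flat document a device might send (hypothetical sample values).
let requestBody = {
  id: "42",
  sensorId: "s-7",
  degreeValue: 21.5,
  locationName: "Lab",
  locationRegion: 1,
  locationLong: 13.77,
  locationLat: 45.65,
};

// Minimal mock of the server-side context, covering only what the trigger calls.
function getContext() {
  return {
    getCollection: () => ({ getSelfLink: () => "dbs/demo/colls/telemetry" }),
    getRequest: () => ({
      getBody: () => requestBody,
      setBody: (body) => { requestBody = body; },
    }),
  };
}

// The trigger from the slide: reshapes the flat document before it is stored.
function normalize() {
  var doc = getContext().getRequest().getBody();
  var newDoc = {
    "Sensor": { "Id": doc.sensorId, "Class": 0 },
    "Degree": { "Value": doc.degreeValue, "Type": 0 },
    "Location": {
      "Name": doc.locationName,
      "Region": doc.locationRegion,
      "Longitude": doc.locationLong,
      "Latitude": doc.locationLat
    },
    "id": doc.id
  };
  // Update the request: this is what is going to be inserted.
  getContext().getRequest().setBody(newDoc);
}

normalize();
console.log(JSON.stringify(requestBody, null, 2));
```

Devices can keep sending a flat payload while the collection stores the nested shape that queries and UDFs such as `region` expect.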
Performance
Data is saved on SSD
All writes to the primary are replicated across two secondary replicas (replicas are spread on different hardware in the same region to protect against failures)
All reads are distributed across the three copies (when and how depends on the consistency level for the database account and the query)
DocumentDB Performance
Measure and Tune for lower request units/second usage
DocumentDB offers a rich set of database operations including relational and hierarchical queries with UDFs, stored procedures and triggers – all operating on the documents within a database collection. The cost associated with each of these operations will vary based on the CPU, IO and memory required to complete the operation. Instead of thinking about and managing hardware resources, you can think of a request unit (RU) as a single measure for the resources required to perform various database operations and service an application request.
Handle Server throttles/request rate too large
When a client attempts to exceed the reserved throughput for an account, there will be no performance degradation at the server and no use of throughput capacity beyond the reserved level. The server will preemptively end the request with RequestRateTooLarge (HTTP status code 429) and return the x-ms-retry-after-ms header indicating the amount of time, in milliseconds, that the user must wait before reattempting the request.
Delete empty collections to utilize all provisioned throughput
Every document collection created in a DocumentDB account is allocated reserved throughput capacity based on the number of Capacity Units (CUs) provisioned, and the number of collections created. A single CU makes available 2,000 request units (RUs) and supports up to 3 collections
Design for smaller documents for higher throughput
The Request Charge (i.e. request processing cost) of a given operation is directly correlated to the size of the document
http://azure.microsoft.com/blog/2015/01/27/performance-tips-for-azure-documentdb-part-2/
Performance Tips
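Handling throttling as described above comes down to a retry loop that honors the x-ms-retry-after-ms header. This is a minimal sketch, assuming a hypothetical `operation` function that resolves to an object with `status` and `headers`; the response shape, the attempt limit, and the injectable `sleep` are choices made for the example, not part of any SDK.

```javascript
// Retry an operation while the server answers 429 (RequestRateTooLarge),
// waiting for the interval the server asks for before each new attempt.
async function withThrottleRetry(
  operation,
  maxAttempts = 5,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  for (let attempt = 1; ; attempt++) {
    const response = await operation();
    if (response.status !== 429) return response;
    if (attempt >= maxAttempts) {
      throw new Error("Request rate too large: retries exhausted");
    }
    // Fall back to no wait if the header is missing or not a number.
    const waitMs = Number(response.headers["x-ms-retry-after-ms"]) || 0;
    await sleep(waitMs);
  }
}

// Usage with a fake operation that is throttled twice before succeeding.
let calls = 0;
const waits = [];
const fakeOperation = async () => {
  calls += 1;
  return calls < 3
    ? { status: 429, headers: { "x-ms-retry-after-ms": String(calls * 100) } }
    : { status: 200, headers: {}, body: "ok" };
};
const outcome = withThrottleRetry(fakeOperation, 5, async (ms) => { waits.push(ms); });
outcome.then((response) => console.log(response.status, waits));
```

Capping the number of attempts keeps a persistently overloaded client from retrying forever; at that point the better fix is more reserved throughput or fewer request units per operation.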
Considerations
User generated content
Lots of application-specific data (what would be varbinary(MAX) in SQL)
Catalog data
Log data
User preferences data
Device sensor data
IoT use cases commonly share some patterns in how they ingest, process and store data. First, these systems allow for data intake that can ingest bursts of data from device sensors of various locales. Next, these systems process and analyze streaming data to derive real time insights. And last but not least, most if not all data will eventually land in a data store for adhoc querying and offline analytics.
Usage: what is DocumentDB for?
Maturity: Balancing embedding (ok) and relating (limits)
Searching and Denormalizing
Opportunity
Storing transient Data
Better Opportunities
Storing Files
Append Only
(Table) Storage
Limits of DocumentDB
Logs
Attachments
Transient Data
Search
Alternatives for some scenarios
Targeted at streaming workloads (e.g., files read from beginning
to end, like media files)
Each blob consists of a sequence of blocks
Each block is identified by a Block ID
Each block can be a maximum of 4 MB in size (a blob of up to 64 MB can be uploaded in a single operation)
Size limit 200 GB per blob
Azure Storage Blob: Block Blob
Targeted at random read/write workloads (E.g. backing storage
for the VHDs used in Azure VMs)
Each blob consists of an array of pages
Each page is identified by its offset from the start of the blob
Size limit 1TB per blob
Azure Storage Blob: Page Blob
Not an RDBMS Table!
The mental picture is ‘Entities’
Entity can have up to 255 properties
Up to 1MB per entity
Partitioning
PartitionKey & RowKey are mandatory properties
Composite key which uniquely identifies an entity
They are the only indexed properties
Defines the sort order
Purpose of the PartitionKey:
Entity Locality
Entities in the same partition will be stored together
Efficient querying and cache locality
Entity Group Transactions
Target throughput – 500 tps/partition, several thousand tps/account
Microsoft Azure monitors the usage patterns of partitions
Automatically load balance partitions
Each partition can be served by a different storage node
Scale to meet the traffic needs of your table
Supports full manipulation (CRUD)
Table Scalability
Azure Table Storage Details
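Because PartitionKey and RowKey are the only indexed properties and define the sort order, key design carries most of the modeling effort. One common approach for device telemetry, sketched here as an assumption rather than something from the slides, is to partition by device and invert the timestamp in the RowKey so the newest reading sorts first. `telemetryKeys` and the 13-digit padding are hypothetical choices.

```javascript
// Largest 13-digit number; subtracting from it inverts the sort order.
const MAX_EPOCH_MS = 9999999999999;

// Build keys for one reading: all readings of a device share a partition,
// and within it newer readings get lexicographically smaller row keys.
function telemetryKeys(deviceId, epochMs) {
  const inverted = MAX_EPOCH_MS - epochMs;
  return {
    PartitionKey: deviceId,
    RowKey: String(inverted).padStart(13, "0"),
  };
}

const older = telemetryKeys("dev-1", 1428008086000);
const newer = telemetryKeys("dev-1", 1428008087000);
console.log(newer.RowKey < older.RowKey); // true: newest sorts first
```

Keeping a device's readings in one partition makes "latest N readings for this device" an efficient range read, at the cost of each device being bound by the per-partition throughput target listed above.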
Embed a sophisticated search experience into web and mobile
applications without having to worry about the complexities of
full-text search and without having to deploy, maintain or
manage any infrastructure.
Perfect for enterprise cloud developers, cloud software vendors,
cloud architects who need a fully-managed search solution.
Search is a natural backend for Cortana
Take a bunch of words, apply linguistics, return relevant results
Azure Search
“Search service”
Scope for capacity
Bound to a region
Has keys, indexes, indexers, data sources
Provisioning
Azure Portal
Azure resource management API
Elastic scale
Capacity can be changed dynamically
Replicas ~ more QPS, HA
Partitions ~ more documents, write throughput
Azure Search Service
Simple HTTP/JSON API for creating indexes, pushing documents, searching
Keyword search with user-friendly operators (+, -, *, “”, etc.)
Hit highlighting
Faceting (histograms over ranges, typically used in catalog browsing)
Based on ElasticSearch
Search Functionality
Linguistics are key in search
Support for 50 languages: word breaking, stop words, inflections
Lucene analyzers: well-known analyzer stack; stemming
Microsoft analyzers: same NLP stack used by parts of Office, Bing; lemmatization in many languages
Linguistics
Suggestions (auto-complete)
Rich structured queries (filter, select, sort) that combines with search
Scoring profiles to model search result relevance
Geo-spatial support integrated in filtering, sorting and ranking (such as finding all
restaurants within 5 KM of your current location)
Search Functionality
Redis is an open source, BSD licensed, networked, single-threaded, in-memory key-value cache and store.
Key-value cache and store (the value can be one of several data types)
In-memory (no persistence required, but you can enable it)
Single-threaded (atomic operations & transactions)
Networked (it’s a server and it does master/slave replication)
Some other stuff (scripting, pub/sub, Sentinel, snapshots)
Caching: Redis
Conclusions
Pro:
partitioning, replication and scaling at its core
self-contained documents
programmability in JavaScript
SQL-like “intra-document” queries
Cons:
No generic SQL queries
Can work alone in only a few scenarios
So DocumentDB…
Great storage opportunities in Azure
• Log
• Search
• Transient
• Files/Attachments
• SQL!
• And all new Data Analysis/Machine Learning opportunities
Other Not Only SQL alternatives
http://bit.do/documentdb-pricing
Capacity Units (CU)
Capacity
Throughput (in terms of rate of transactions / second)
• 1 CU = 2,000 Request Units (RU) per second
• The cost of a “request” depends on the size of the document, e.g., when uploading 1000 large JSON documents, each of them might count as more than one request
Pricing
Standard pricing tier with hourly billing
1 hr from just $0.034!
Performance levels can be adjusted
Each collection = 10GB of SSD
Collection* perf is set by S1, S2, S3
Limit of 100 collections (1 TB)
Soft limit, can be lifted as needed per account
What does DocumentDB cost?
* collection != table of homogenous entities
collection ~ a data partition