RESOURCES
CONNECT WITH THE WORLD’S DATA Windows Azure Marketplace
https://datamarket.azure.com
WINDOWS AZURE MARKETPLACE
Business model: base meter is $ per transaction (query or API call), charged per transaction or as a monthly subscription
1. Trial usage: try before you buy
2. Per transaction: pay-as-you-grow, transaction-based tiers
3. Monthly subscription: subscribe to the dataset for unrestricted access
Revenue sharing: Microsoft (20%) and content providers (80%)
Free offers available at no cost
WINDOWS AZURE MARKETPLACE MOMENTUM
Registered users: ~50,000 (double-digit growth month over month, across 38 countries)
Subscriptions: 70,000+ (double-digit growth month over month)
Providers: 500+
Datasets: 130+ across 15 categories; Apps: 600+
Notable offerings: Data Quality Services, Microsoft Translator, Bing Search API
HOW TO ACCESS THE DATA?
Access from any platform or any app via native OData-based APIs:
Query language over HTTP
Open standard protocol (www.odata.org)
Native service reference usage in Visual Studio
Downloadable proxy classes for fixed-function services
URL-based query support
Authentication and authorization:
Live ID for Marketplace access
HTTP Basic auth over SSL with an account key
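As a concrete illustration of URL-based querying with Basic auth, here is a minimal Python sketch that builds an OData query URL and the matching Authorization header. The dataset path and account key are hypothetical placeholders, and a real client should also URL-encode the query option values.

```python
import base64

def odata_query_url(base_url, dataset_path, filter_expr=None, top=None):
    """Build an OData query URL (query language over HTTP)."""
    params = []
    if filter_expr:
        params.append("$filter=" + filter_expr)
    if top is not None:
        params.append("$top=" + str(top))
    url = base_url.rstrip("/") + "/" + dataset_path
    return url + ("?" + "&".join(params) if params else "")

def basic_auth_header(account_key):
    """HTTP Basic auth over SSL: any user name, account key as the password."""
    token = base64.b64encode(("accountKey:" + account_key).encode()).decode()
    return {"Authorization": "Basic " + token}

# Hypothetical dataset path, for illustration only
url = odata_query_url("https://api.datamarket.azure.com",
                      "SomeProvider/SomeDataset/Items",
                      filter_expr="Country eq 'US'", top=10)
```

The same URL can be pasted into a browser or issued from generated proxy classes; the query string is the protocol, independent of platform.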
WHY IS DATA MOVEMENT SO CRITICAL IN AZURE?
It is a common requirement for applications running in the cloud, with an impact on:
Initial production ramp-up (on premises to Azure)
Ongoing operations (Azure to Azure)
Data protection and recovery (Azure to Azure, Azure to on premises)
Data lifecycle management, archiving, and pruning (Azure to Azure, Azure to on premises)
Moving data between on premises and the cloud, and between cloud services, is comparatively slow
BULK COPY PROGRAM (BCP.EXE)
Fast option to move data in and out of Azure SQL Database
Can run on premises, in a web/worker role, or in a virtual machine; if used from on premises, expect potential latency issues
Minimal deployment requirements (SQL Server Native Client, SNAC)
Can be easily automated, scripted, or run as part of a batch program
Can read/write data from/to a local disk or an attached persisted .vhd; consider IOPS performance requirements when choosing between the various options
Can run parallel import/export streams against the same set of objects to reduce total time
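Automating BCP.EXE usually means assembling its command line from a script. The sketch below builds the argument list for a native-format export; the server, table, and credentials are hypothetical placeholders. Running one such command per object (or per partition of a large table) in parallel is how the parallel import/export streams mentioned above are typically realized.

```python
def bcp_export_args(table, out_file, server, user, password, batch_size=10000):
    """Assemble a BCP.EXE command line that exports a table in native format."""
    return ["bcp", table, "out", out_file,
            "-S", server,               # target server
            "-U", user, "-P", password, # SQL authentication
            "-n",                       # native data format
            "-b", str(batch_size)]      # rows per committed batch

# Hypothetical names; pass the list to subprocess.run(...) to execute
args = bcp_export_args("MyDb.dbo.Orders", "orders.dat",
                       "myserver.database.windows.net", "appuser", "secret")
```

The `in` direction works symmetrically, which makes round-tripping through compressed files on Azure storage straightforward to script.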
SQLBULKCOPY CLASS IN ADO.NET
Provides a way to programmatically move data into SQL databases from .NET code
A wrapper around the bulk insert APIs
Can run on premises, in a web/worker role, or in a VM; as with the other options, proximity to the target service provides the best performance
Available in ADO.NET since .NET Framework 2.0
SQLBULKCOPY CLASS IN ADO.NET
Can be easily integrated with components like the Transient Fault Handling Application Block to provide robust connection management and handle throttling and other transient issues
Provides the other benefits of BCP.EXE, such as the ability to define the batch size and to run multiple insert streams in parallel
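The Transient Fault Handling block is a .NET component, but the pattern it implements is simple: retry transient failures with exponential backoff. A minimal Python sketch of that pattern, wrapping any bulk-insert callable, might look like this (the exception types treated as transient are an assumption; in real code they would be SQL error codes for throttling):

```python
import random
import time

def with_retries(operation, attempts=4, base_delay=0.5, transient=(TimeoutError,)):
    """Run `operation`, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except transient:
            if attempt == attempts - 1:
                raise  # exhausted all attempts: surface the error
            # back off 0.5s, 1s, 2s, ... with a little jitter
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random() * 0.1))

# Usage sketch: with_retries(lambda: bulk_copy_batch(rows))
```

Combined with per-batch commits, this keeps a long-running load resilient to throttling without restarting from the beginning.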
ADDITIONAL CONSIDERATIONS ON BATCHING DATA INSERTS
Executing multiple insert operations in a single round trip minimizes the impact of the increased network latency between Windows Azure compute nodes and SQL Azure databases
The bulk copy APIs provide the best performance when 100 or more rows are inserted per round trip, but introduce significant overhead (3.5x) when used for single-row insertion
Batching multiple INSERT statements into a single command text performs well between 1 and 10 records, but introduces substantial overhead (more than 3x) at 100 or 1,000 records per round trip
ADDITIONAL CONSIDERATIONS ON BATCHING DATA INSERTS
Using a table-valued parameter (TVP) approach in an "INSERT INTO ... SELECT FROM @TVP" query statement provides consistently good results across batches of 1, 10, 100, and 1,000 rows per round trip
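The "multiple INSERT statements in one command text" technique from the previous slide is easy to sketch: concatenate one statement per row and send the whole batch in a single round trip. The naive value quoting below is for illustration only; production code should use parameters or a TVP, both to avoid injection and to get the consistent performance noted above.

```python
def batched_insert_text(table, columns, rows):
    """Concatenate one INSERT per row into a single command text,
    so the whole batch travels in one round trip."""
    col_list = ", ".join(columns)
    stmts = []
    for row in rows:
        # Illustration-only quoting: strings get single quotes, numbers pass through
        vals = ", ".join(repr(v) if isinstance(v, str) else str(v) for v in row)
        stmts.append(f"INSERT INTO {table} ({col_list}) VALUES ({vals});")
    return "\n".join(stmts)

sql = batched_insert_text("Orders", ["Id", "Country"], [(1, "US"), (2, "IT")])
```

Per the measurements cited above, this shape pays off at roughly 1 to 10 rows per batch; beyond that, switch to the TVP or bulk copy approaches.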
SQL SERVER INTEGRATION SERVICES
Rich, mature data transformation framework and pipeline, well known to traditional DBAs and developers
Many useful standard tasks and capabilities included out of the box
Officially supported in Windows Azure VMs; it works on web/worker roles, but is not supported there
Automated package creation and deployment make SSIS an interesting component for implementing complex data movement solutions in hybrid scenarios
BACPAC FORMAT AND IMPORT/EXPORT SERVICE
The bacpac format provides a way to logically represent data models and database objects (T-SQL code, indexes, etc.) with version support and other benefits, together with a JSON-serialized instance of the database rows.
The Azure SQL Database Import/Export service uses the bacpac format to move data between a database and Azure Blob Storage, driven from the management portal or a scriptable service endpoint.
Bacpac files can then easily be downloaded on premises and imported into a SQL Server instance, or manipulated through a programmatic API.
BACPAC FORMAT AND IMPORT/EXPORT SERVICE
The Import/Export service does not provide a transactionally consistent copy of your SQL database, so you must either stop the user workload or run the operation on a copy of your database created with the CREATE DATABASE ... AS COPY OF option.
This option may seem the slowest compared to a raw export/import operation, but consider that it performs several important tasks that you would need to execute anyway, like rebuilding a complex indexing strategy, compressing the exported data, and copying it to Azure storage.
If your data movement needs involve some of these operations, the Import/Export service may be your best option.
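The copy-then-export workflow described above can be scripted. The sketch below emits the ordered steps as statements and notes; the database and file names are hypothetical, and the Import/Export submission itself happens through the portal or service endpoint rather than T-SQL, which is why it appears here as a comment.

```python
def consistent_export_statements(source_db, bacpac_name):
    """Ordered steps for a transactionally consistent export:
    snapshot via database copy, export the copy, then clean up."""
    copy_db = source_db + "_copy"
    return [
        # Consistent snapshot of the live database
        f"CREATE DATABASE [{copy_db}] AS COPY OF [{source_db}]",
        f"-- wait until the copy completes, then submit {copy_db} to the Import/Export service",
        f"-- export target: {bacpac_name} in Azure Blob Storage",
        # Drop the copy once the export has finished to stop billing for it
        f"DROP DATABASE [{copy_db}]",
    ]

steps = consistent_export_statements("Prod", "prod.bacpac")
```

Monitoring the copy's completion (via sys.dm_database_copies) is the step most often forgotten; the copy must be fully seeded before the export starts.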
GRADE OF THE STEEL: AZURE DATA MOVE
Move data fast: minimize the downtime window by finding the fastest and most reliable data movement mechanism available
Warning: based on the previous assumptions, this process is non-deterministic in nature
Move data reliably: the move may also fail between phases
Checking for consistency and implementing retry logic is necessary
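"Check for consistency" can be as simple as comparing per-table row counts (or checksums) between source and target after each phase, and re-running only the tables that disagree. A minimal sketch of that check, with hypothetical table names:

```python
def consistency_report(source_counts, target_counts):
    """Compare per-table row counts after a move.
    Returns {table: (source_count, target_count)} for every mismatch;
    an empty dict means the phase can be considered complete."""
    mismatches = {}
    for table, n in source_counts.items():
        if target_counts.get(table) != n:
            mismatches[table] = (n, target_counts.get(table))
    return mismatches

# Hypothetical counts gathered from source and target
report = consistency_report({"Orders": 1000, "Items": 5000},
                            {"Orders": 1000, "Items": 4990})
```

Feeding the mismatch list back into the loader gives the retry loop a precise, table-level scope instead of restarting the whole move.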
KEY DATA MOVEMENT SCENARIOS
Initial data loading and migration
Data synchronization
Backup/restore for data protection and disaster recovery
“Whoops” errors protection and recovery
Sharding / Re-Sharding
INITIAL DATA LOADING AND MIGRATION
Move existing on-premises relational data to Azure SQL Databases; it may reside in SQL Server or in a competing product
Many existing tools are available for this task: Import/Export Service, BCP.EXE, SSIS
The recommended approach is to export data out of the on-premises databases, compress it, copy it to shared storage on Azure, and run the import operations from Azure roles to limit the latency impact
Parallel loading streams can help performance
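The parallel-loading recommendation maps directly onto a worker pool: partition the exported data into chunks and run a fixed number of import streams concurrently. A sketch, where `load_chunk` stands in for whatever per-chunk loader (BCP invocation, bulk copy call) the migration uses:

```python
from concurrent.futures import ThreadPoolExecutor

def load_in_parallel(chunks, load_chunk, streams=4):
    """Run one import stream per chunk, at most `streams` at a time.
    Results come back in chunk order."""
    with ThreadPoolExecutor(max_workers=streams) as pool:
        return list(pool.map(load_chunk, chunks))

# Usage sketch: load_in_parallel(exported_files, run_bcp_import, streams=8)
```

The stream count is a tuning knob: too few wastes the target's throughput, too many triggers throttling, which is where the retry logic discussed earlier comes in.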
DATA SYNCHRONIZATION
After the initial data loading, continuous or scheduled synchronization between on-premises databases and the cloud may be required
Data Sync Service is a cloud-based solution, built on top of Sync Framework, that hides most of the complexity and provides a code-free solution aimed at DBAs and system engineers
Depending on the complexity of the sync flow, custom solutions leveraging change data capture on premises or other differencing approaches can be applied
This scenario is critical for hybrid cloud implementations
MOVING CLOUD GENERATED DATA TO ON PREMISES
Moving cloud-generated data from Azure SQL Databases to on-premises systems can be seen as a one-directional variant of data synchronization
Depending on the complexity of the customer's requirements, this can be implemented by taking a "snapshot" of the production database and moving it into Azure Storage, to be downloaded on premises and loaded into a traditional system (e.g., a data warehouse)
As customer requirements grow, a more invasive approach could be based on differential data movement, limited only to the data that has been modified over time
AZURE SQL DATABASE BACKUP / DR OPTIONS
Backup/restore for data protection and disaster recovery
“Whoops” errors protection and recovery
PITR (Point In Time Restore, private CTP): 14 days of rolling database backups kept automatically, in addition to the 3 or 4 local replicas per database
Currently: restores the entire physical DB plus all logical DBs, then extracts your logical DB
DBCopy: limited scaling options; must complete within 24 hours
DAC Import/Export Service: the service runs in worker roles; scale by adding more roles to support a large number of DB backups per hour (~100K DBs, averaging ~300 MB each, in 6 hours using about 200 small instances)
Backup files are geo-replicated (Windows Azure Blob Storage)
NOTE: it does not guarantee transactional consistency; exporting from a copy created with CREATE DATABASE ... AS COPY OF, as described earlier, is one way to solve this
SHARDING / RE-SHARDING
Take an existing monolithic database and move its data into a sharded/partitioned data tier; for the initial sharding operation, the source database can be on premises or already in Azure SQL Databases
Split an existing shard into one or more destination databases for scale-out or data management purposes; this is usually a cloud-to-cloud operation, and may become recurring depending on the application and workload requirements
Ideally this could be performed online, but if the application can tolerate some downtime it is much easier to implement
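A split operation starts with deciding the new key ranges. Assuming integer shard keys and half-open ranges (an assumption; real shard maps may use hashes or composite keys), a sketch of dividing one shard's range into near-equal destination ranges:

```python
def split_range(low, high, parts):
    """Split a contiguous shard key range [low, high) into `parts`
    near-equal half-open sub-ranges covering the original exactly."""
    size, rem = divmod(high - low, parts)
    ranges, start = [], low
    for i in range(parts):
        end = start + size + (1 if i < rem else 0)  # spread the remainder
        ranges.append((start, end))
        start = end
    return ranges

# e.g. split one shard covering keys [0, 1_000_000) into 4 destinations
new_shards = split_range(0, 1_000_000, 4)
```

Each resulting range then drives one data-movement stream into its destination database, after which the application's shard map is updated to point at the new tier.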
© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.