Come learn about architecting high-performance applications and production workloads using Amazon RDS for SQL Server. Understand how to migrate your data to an Amazon RDS instance, apply security best practices, and optimize your database instance and applications for high availability.
© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
DAT303 - A Closer Look at Amazon RDS for Microsoft SQL Server
Deep Dive into Performance, Security, and Data Migration Best Practices
Sergei Sokolenko - Sr Product Manager, AWS
Allan Parsons - VP Operations, Viddy
November 13, 2013
Next Hour …
• Best Practices
– Security
– Performance
– Data Migration
– Data Durability
• Viddy’s Case
Security Best Practices
Control Access
[Diagram: Internet, IAM, VPC]
Encrypt Your Data
• “In transit” with SSL
– Import public Amazon RDS certificate into Windows
https://rds.amazonaws.com/doc/rds-ssl-ca-cert.pem
– Add "encrypt=true" to your connection string
• “At rest” with Transparent Data Encryption
– Encrypts data before writing to storage
– Decrypts when reading
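The "in transit" step amounts to one extra setting on the client. A minimal sketch in Python that only assembles an ADO.NET-style connection string is shown here. The endpoint, database name, and credentials are placeholders, and `TrustServerCertificate=False` is an added assumption (it is not on the slide) so that the client actually validates the server certificate against the imported Amazon RDS certificate.

```python
def build_connection_string(server, database, user, password):
    """Assemble an ADO.NET-style SQL Server connection string with SSL enabled.

    All argument values are supplied by the caller; nothing here contacts a server.
    """
    parts = [
        ("Server", server),
        ("Database", database),
        ("User Id", user),
        ("Password", password),
        # The setting the slide asks for: encrypt traffic between client and instance.
        ("Encrypt", "True"),
        # Assumption, not from the slide: refuse certificates the client cannot verify.
        ("TrustServerCertificate", "False"),
    ]
    return ";".join(f"{key}={value}" for key, value in parts)


# Placeholder endpoint and credentials, for illustration only.
example = build_connection_string("myinstance.example.com,1433", "UserDB1", "usr", "pwd")
```

Other drivers spell the keyword differently, so the exact key names should be checked against the client library in use.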
Performance Best Practices
High Performance Relational Databases
Amazon RDS Configuration
Increase Throughput
Reduce Latency
Push-Button Scaling
DB Shards
Provisioned IOPS
Push-Button Scaling & Sharding
• Scale nodes vertically up or down
– M1.small (1 virtual core, 1.7GB)
– M2.4XLarge (8 virtual cores, 64GB)
• Scale out nodes horizontally
– Shard based on data or workload characteristics
Production = Provisioned IOPS
Consistently fast performance
• 1 TB max instance size
• 10,000 Provisioned IOPS
• I/O-Optimized instances
• Check I/O blockers
– Database contention
– Locking
Data Migration Best Practices
Migrating Data to Amazon RDS
• Replication + Switchover
– Linked Servers
– SSIS
• Bulk Migration
– Import/Export Wizard
– BCP Bulk Load
One-time Bulk Migration
[Diagram: On Premise → AWS]
Migration Code Snippets
-- Run SSMS's "Generate and Publish Scripts" Wizard
-- .BAT script for export BCP commands
SELECT 'bcp ' + db_name() + '..' + name + ' out "C:\Data\' + name + '.txt" -E -n -S localhost -U usr -P pwd' FROM sysobjects WHERE type = 'U'
bcp dbname..table out "C:\Data\table.txt" -E -n -S localhost -U usr -P pwd
-- .BAT script for import BCP commands
SELECT 'bcp ' + db_name() + '..' + name + ' in "C:\Data\' + name + '.txt" -E -n -S RDSEndpoint -U usr -P pwd' FROM sysobjects WHERE type = 'U'
bcp dbname..table in "C:\Data\table.txt" -E -n -S endpoint,port -U usr -P pwd
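The two SELECT statements generate one bcp command line per user table. As a rough illustration of the same idea outside SQL Server, the sketch below builds those command lines in Python. The function name and its defaults are hypothetical; the flags are the ones on the slide (`-E` keeps identity values, `-n` uses native format).

```python
def bcp_command(database, table, direction, server,
                user="usr", password="pwd", data_dir=r"C:\Data"):
    """Return one bcp command line in the shape used on the slide.

    direction is "out" (export from the source) or "in" (import into Amazon RDS).
    """
    if direction not in ("out", "in"):
        raise ValueError("direction must be 'out' or 'in'")
    return (
        f'bcp {database}..{table} {direction} "{data_dir}\\{table}.txt" '
        f"-E -n -S {server} -U {user} -P {password}"
    )


def bcp_script(database, tables, direction, server):
    """Return the lines of a .BAT file covering every table in the list."""
    return [bcp_command(database, t, direction, server) for t in tables]


# Export from the local source, then import through the RDS endpoint (placeholder).
export_lines = bcp_script("dbname", ["Customers", "Orders"], "out", "localhost")
import_lines = bcp_script("dbname", ["Customers", "Orders"], "in", "endpoint,port")
```

Writing each list to a .BAT file, one command per line, reproduces what the SELECT statements emit.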
More Info: Data Import Guide for SQL Server
Tables Only
Script USE DATABASE = False
Script Check Constraints = False
Script Foreign Keys = False
Script Primary Keys = False
Script Unique Keys = False
Ongoing Replication with Switchover
[Diagram: SourceINST (On Premise) → Linked Server → TargetINST (AWS)]
On Target Instance (Amazon RDS)
USE master;
CREATE LOGIN [repl_login] WITH PASSWORD=N'password01', DEFAULT_DATABASE=[master], DEFAULT_LANGUAGE=[us_english], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF;
USE UserDB1;
CREATE USER [repl_user] FOR LOGIN [repl_login];
EXEC sp_addrolemember 'db_datareader', 'repl_user';
EXEC sp_addrolemember 'db_datawriter', 'repl_user';
-- Assume Source DB has a table "Customers"
CREATE TABLE StageCustomers ( CustomerID int, UpdatedDate datetime );
On Source Instance (On-Premise)
USE master;
EXEC sp_addlinkedserver N'TargetINST.amazonaws.com,port', N'SQL Server';
CREATE LOGIN [repl_login] WITH PASSWORD=N'password02', DEFAULT_DATABASE=[master], DEFAULT_LANGUAGE=[us_english], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF;
EXEC sp_addlinkedsrvlogin
@rmtsrvname = N'TargetINST.amazonaws.com,port',
@useself = 'FALSE', @locallogin = N'repl_login',
@rmtuser = N'repl_login', @rmtpassword = N'password01';
USE UserDB1;
-- Push rows changed in the last two days into the staging table on the target
INSERT INTO [TargetINST.amazonaws.com,port].UserDB1.dbo.StageCustomers (CustomerID, UpdatedDate)
SELECT CustomerID, UpdatedDate FROM Customers WHERE UpdatedDate >= DATEADD(DD,-2,GETDATE());
Data Durability Best Practices
Backups and Disaster Recovery
• Automated Backups
– Nightly system snapshots + transaction backup
– Enables point-in-time restore to any point in retention period
– Max retention period = 35 days
• DB Snapshots
– User-driven snapshots of database
– Kept until explicitly deleted
Cross Region Snapshot Copy
[Diagram: Region 1 (AZ 1) → Region 2 (AZ 1)]
Viddy’s Case
Allan Parsons, Viddy
Scaling viddy.com on Amazon RDS for SQL Server
Vision
To entertain and connect
people around the world by
empowering mobile users to
easily capture, beautify and
share amazing videos to
those who matter most.
Viddy By The Numbers
• Reach :: 41+ Million Registered Users
• Connections :: 250+ Million User Connections
• Media :: 6.0+ Million Unique Videos
• CDN Assets (encoded videos + images)
– Videos :: 30+ Million Video Files
– Images :: 2+ Billion Image Files
• Human Power
– Executives & Support Staff :: 4
– Software Engineers :: 6
– DevOps Engineers :: 1
– Database Administrators :: 0
What Powers Viddy
• Web / Front-End :: Windows / IIS (C# / .NET / MVC)
• Cache :: Linux / memcached (via Couchbase)
• Persistent Cache :: Linux / Redis (2x Master-Slave Environments)
• Source Control :: Team Foundation Server
• Continuous Integration & Build Automation :: Jenkins, PowerShell, MSBuild
• AWS & EC2 Tools
– VPCs :: 1 VPC/Environment (Production, QA, Dev)
– RDS :: 11 SQL Server Instances Housing 144 Databases (Production)
– SNS / SQS :: Used for Eventual Consistency
– Route53 & ELBs :: DNS and Load Balancing
– CloudWatch :: Monitoring & Trending
– CloudSearch :: Media, Tag, and User Searching
– S3 & CloudFront :: Asset Storage and Delivery
We’re a Technology Agnostic Stack & Team
Early Technical Challenges
Wrong Cloud Ideology
• Inherited a PaaS Cloud Infrastructure
Difficulty in Caching Data
• Twitter-based Service Model
Underestimated Power of Facebook
• Open Graph drove 1MM+ User Registrations / 24H Period
Very Very Busy SQL Instance
• 1 Instance, 6 Databases
• Disabled Key Constraints to Improve Performance
• Too busy to get transactionally consistent backups
Inflexible Platform
• Adding machines would make inefficiencies worse
• On PaaS, more money != more scalability
Moving to AWS
VPC
• Guaranteed affinity between Web, Cache, SQL
• Low Latency
• Better security
SQL
• Tremendous cleanup effort
• Created 144 empty RDS database shells & filled them via ETL
• Engineered Eventual Consistency to Move Deltas
Build Automation
• Build Scripts dual-deployed to PaaS and IaaS
• Developers could build & test multiple times per hour on 2 providers
DNS
• Moved all zones to Route53 & Lowered TTLs
• Updated DNS entries Christmas Eve 2012 (low traffic)
Goal: PaaS to IaaS with Zero Downtime
RDS Eventual Consistency
[1] :: API Servers Push Messages to Amazon SNS Topic
[2] :: Amazon SNS Distributes Message to SQS Queue
[3] :: Windows Service Monitors Queues
[4] :: Windows Service Pushes Message to Shard
Advantages
:: Can lose Windows Service, keep messages
:: Can lose DB Shard, keep messages
:: Easy to Scale!
+ more queues
+ more messages
= More Windows Services / EC2 Machines
Shards Based On UserID (GUID)
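The deck does not show how a GUID is turned into a shard, so the following is only one plausible sketch: interpret the GUID as an integer and take it modulo the shard count. The count of 64 shards and the grouping of 8 shards per instance are taken from the figures quoted elsewhere in the deck (1/64 of users per database, 1/8 per instance); the function names and the contiguous shard-to-instance layout are assumptions.

```python
import uuid

SHARD_COUNT = 64          # shards per shard family, per the deck's 1/64 figure
SHARDS_PER_INSTANCE = 8   # assumption: 64 shards spread evenly over 8 instances


def shard_for_user(user_id: str) -> int:
    """Map a GUID string to a shard number in [0, SHARD_COUNT)."""
    # uuid.UUID parses the GUID; .int exposes it as a 128-bit integer.
    return uuid.UUID(user_id).int % SHARD_COUNT


def instance_for_shard(shard: int) -> int:
    """Map a shard number to the instance assumed to host it."""
    return shard // SHARDS_PER_INSTANCE


# Every message for the same user lands on the same shard.
shard = shard_for_user("00000000-0000-0000-0000-000000000041")
instance = instance_for_shard(shard)
```

A modulo scheme like this is simple but ties routing to the shard count, so changing the number of shards later would mean moving users between databases.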
Provisioning On RDS
SQL Edition
• SQL Server 2012 Standard (BizSpark)
Storage Allocation
• We took the max (1TB)
• Changing Storage = downtime
IOPS
• Busiest Instance (ViddyDB) has 7,000 provisioned IOPS
• Shards have no provisioned IOPS
• Occasional hotspots when celebrities post content
• Changing IOPS = downtime
Instance Size
• Busiest Instance (ViddyDB) has largest size (m2.4xlarge)
• Shards running (m2.2xlarge)
• Changing Instance Size = downtime
VPC Placement
• VPC guarantees node affinity (ours sit in private segment)
• Changing VPC Placement = downtime
Goal: As Hands Off As Possible (we don’t have a DBA)
Designing for High Availability
Amazon RDS In VPCs
• At the time we provisioned (Nov-2012), no data replication across AZs
• Single point of failure is Availability Zone
• Running our own replication meant no RDS (and need a DBA)
• RDS didn’t support SQL Server’s AlwaysOn Technology
Sharded Model
• User exists in 1/64 Consumer Shards & 1/64 Producer Shards
• Database goes down: 1/64 users affected (~1.6%)
• Instance goes down: 1/8 users affected (12.5%)
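The two percentages follow directly from the shard layout. A small worked calculation, assuming users are spread evenly across 64 shards hosted on 8 instances:

```python
from fractions import Fraction


def affected_share(failed_units: int, total_units: int) -> Fraction:
    """Fraction of users affected when some units fail, assuming an even spread."""
    return Fraction(failed_units, total_units)


# One database (shard) out of 64 goes down.
database_outage = affected_share(1, 64)
database_outage_pct = float(database_outage) * 100   # 1.5625

# One instance out of 8 goes down, taking its 8 shards with it.
instance_outage = affected_share(8, 64)
instance_outage_pct = float(instance_outage) * 100   # 12.5
```

Real traffic is rarely uniform (the deck mentions hotspots when celebrities post), so these figures describe the share of users, not necessarily the share of load.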
Eventual Consistency
• Amazon SNS/SQS Guarantees Eventual Consistency
• Visibility Timeout gives us time to get DB or Instance back online
• Sharded Amazon SQS = won’t affect other shards during downtime
Snapshots
• Set it and forget it
• Reliably works
• Allows us to regularly refresh non-prod DBs via scripts.
Goal: Easily & Quickly Recover from Outage
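The role of the visibility timeout in this design can be sketched with a toy in-memory queue. This is not the Amazon SQS API: the class, its method names, and the use of `ConnectionError` to signal an unavailable shard are all hypothetical stand-ins. The point it illustrates is that a received message is hidden rather than removed, and is deleted only after the write to the shard succeeds.

```python
class InMemoryQueue:
    """Toy stand-in for a queue with a visibility timeout (not the AWS API)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}   # message id -> [body, time at which it is visible again]
        self._next_id = 0

    def send(self, body):
        self._messages[self._next_id] = [body, 0.0]
        self._next_id += 1

    def receive(self, now):
        """Return (id, body) of one visible message and hide it, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        self._messages.pop(msg_id, None)

    def __len__(self):
        return len(self._messages)


def process_one(queue, write_to_shard, now):
    """Take one message and apply it to the shard; delete only on success."""
    received = queue.receive(now)
    if received is None:
        return False
    msg_id, body = received
    try:
        write_to_shard(body)
    except ConnectionError:
        # Shard is down: leave the message in place. It becomes visible
        # again once the timeout passes and will be retried then.
        return False
    queue.delete(msg_id)
    return True
```

With a 30-second timeout, a message that fails at time 0 stays hidden until time 30 and is retried afterwards, which is the window the slide describes for bringing a database or instance back online.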
Security Considerations
The Basics
• Application config files use separate restricted accounts (not SA)
• DBs sit in private VPC segment
• Port restrictions done at Security Group Level
• Viddy HQ is whitelisted
• Developers can connect remotely over OpenVPN
• Support staff gets read-only DB access if they know SQL
The Facebook Security Model
• Every developer has access to everything (we’re a team of 7)
• Less friction, empowers developers
• With great privilege comes great responsibility
Questions?
Try Amazon RDS for SQL Server!
• Start using Transparent Data Encryption (TDE)
– See Amazon RDS for SQL Server documentation
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/
• Try Cross Region Snapshot Copy