FamilySearch’s Family Tree Web Application
© 2018 Carnegie Mellon University
SATURN 2018
14th Annual SEI Architecture Technology User Network Conference
MAY 7–10, 2018 | PLANO, TEXAS
FamilySearch’s Family Tree Web Application
Replacing Relational Database Technology and Transitioning to Cloud-Hosted Computing
Randy A. Ynchausti
Software Architect
Email: [email protected]
Twitter: @RandyYnchausti
Outline
• FamilySearch Web Site
• Family Tree Web Application
• Replacing RDBMS and Cloud-Hosted Computing Project
  - Scope and Motivation
  - Three Architecture and Design Challenges
  - Results
• Summary
FamilySearch Web Site
Notable Statistics
Statistic Value (Billion)
Searchable Names in Historical Records 6.2
Historical Record Images Online 1.25
Published Indexed Records Per Year 0.271
Web Page Views Per Day 11.2
Family Tree Web Application
FamilySearch Web Site
Notable Statistics
Statistic Value (Billion)
Total Tree Persons 1.18
Tree Persons Added Per Month 0.38
Total Sources 0.915
Sources Added Per Month 0.005
Sources Attached to Tree Persons 1.28
Replacing Relational DBMS and Cloud-Hosted Computing Project
Given:
• Virtualized private (on premise) data center
• Tens of services and hundreds of nodes
• Hundreds of database server licenses
• Many development teams
• Continuous delivery
• Budget events
• …
Project Scope and Motivation
We want:
1. Security
2. Availability
3. Scalability
4. Performance
5. Affordability
Utility tree (root: Utility)
Quality attributes: Affordability, Performance, Availability, Security, Scalability
Attribute refinements: Data Confidentiality, Data Integrity, Consumption-Based, Capital Ownership Cost, Transaction Throughput, Data Latency, Hardware Failure, Service Failure, Horizontal Approach, Elastic
Scenarios:
(L, L) Monthly total cost is the operating cost of the resources used by the application
(M, L) The resources the system needs to operate efficiently can change daily based on load projections and other factors
(L, L) The operating budget is nine times or more higher than the capital budget of the system
(L, M) The Family Tree database does not constrain the total number of transactions the system can process per hour
(M, H) The data the system uses to draw a nine-generation pedigree is retrieved in one second or less
(H, L) Power outage in an availability zone causes the system to redirect users to the services running in another availability zone within one minute or less from the outage event
(H, M) Redundant services in an availability zone provide 99.99% service availability
(H, H) Access-controlled data is secure 99.9999% of the time
(M, L) Additional resources can be provisioned and deployed within 10 minutes or less
(L, H) The system can automatically expand and contract its resources to accommodate fluctuating workloads.
(M, M) Adding additional resources allows the system to service a corresponding percentage increase in users and workload
(H, H) 99.9999% of all data is accurate and consistent over its entire life cycle in the system
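To make the availability scenarios concrete, the sketch below converts an availability target into a yearly downtime budget and counts how many one-minute redirects would fit inside it. It assumes a 365-day year and that each outage consumes the full redirect window; the function names are hypothetical and are not part of the FamilySearch system.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored (assumption)


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes per year a service may be unavailable at the given target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def whole_failovers_within_budget(availability_percent: float,
                                  failover_minutes: float) -> int:
    """How many complete failovers of the given length fit in the yearly budget."""
    budget = downtime_minutes_per_year(availability_percent)
    return math.floor(budget / failover_minutes)


if __name__ == "__main__":
    # 99.99% service availability leaves a budget of about 52.56 minutes per year.
    print(f"{downtime_minutes_per_year(99.99):.2f} minutes/year")
    # With a one-minute redirect per outage, 52 such outages fit in that budget.
    print(whole_failovers_within_budget(99.99, 1.0), "one-minute failovers/year")
```

Read this way, the 99.99% scenario and the one-minute redirect scenario constrain each other: the faster the redirect, the more zone-level failures the yearly budget can absorb.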
[Chart: 1 hr peak versus Max Capacity, with a linear trend line for 1 hr peak]
Project Scope and Motivation (Revision 2)
[Architecture diagram (Diagram 1 of …)
Contexts: Amazon Cloud, Private Data Center
Elements: Web Client, Public API, Web Client Platform API, Throttling, Conclusion Tree, Tree Foundation, Tree Data, Change Message Queue, Publish-Subscribe, Message Queue, Amazon SQS, Admin Labels, Postgres RDS, DMC, Extract, Transform, Load
Key: Process, Communication / Data Flow, Service, Relational Database, Responsibility Division, N-Node NoSQL Database, Batch Data Transfer, User / Consumer Application, Context, Queue]
Three Architecture and Design Challenges
1) Performance differences between the existing and target technology
RDBMS
• Parallel query execution
• Parallel index build
• Replication aborting longer running queries
NoSQL
• Operation and maintenance
2) Weak transactional semantics in the target NoSQL technology
Patterns that helped us form a solution
• Event Sourcing
• Commutative Replicated Data Type
• Command Query Responsibility Segregation
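The slide only names these patterns, so the following is a minimal sketch of how they can fit together, not a description of FamilySearch's implementation. It assumes a toy domain of attaching and detaching sources on tree persons; `Event`, `EventLog`, `SourceView`, and `build_view` are hypothetical names. The write side appends immutable events, the read side folds them into a query view, and the view is a two-phase set so that events commute: applying them in any order gives the same result, which matters when the store offers weak transactional guarantees.

```python
from dataclasses import dataclass, field
from itertools import permutations


@dataclass(frozen=True)
class Event:
    """An immutable fact appended to the log (event sourcing)."""
    kind: str        # "attach" or "detach"
    person_id: str
    source_id: str


@dataclass
class EventLog:
    """Command side: validates a command and appends the resulting event."""
    events: list = field(default_factory=list)

    def handle(self, kind: str, person_id: str, source_id: str) -> Event:
        if kind not in ("attach", "detach"):
            raise ValueError(f"unknown command: {kind}")
        event = Event(kind, person_id, source_id)
        self.events.append(event)
        return event


@dataclass
class SourceView:
    """Query side: a read model folded from events.

    Modeled as a two-phase set: a detached pair stays detached, so the
    same events applied in any order produce the same view.
    """
    attached: set = field(default_factory=set)
    detached: set = field(default_factory=set)

    def apply(self, event: Event) -> None:
        pair = (event.person_id, event.source_id)
        if event.kind == "attach":
            self.attached.add(pair)
        else:
            self.detached.add(pair)

    def sources_for(self, person_id: str) -> list:
        live = self.attached - self.detached
        return sorted(s for p, s in live if p == person_id)


def build_view(events) -> SourceView:
    """Replay events (in whatever order they arrive) into a fresh view."""
    view = SourceView()
    for event in events:
        view.apply(event)
    return view


log = EventLog()
log.handle("attach", "P1", "S1")
log.handle("attach", "P1", "S2")
log.handle("detach", "P1", "S1")

# Every delivery order of the three events converges on the same answer.
results = [build_view(order).sources_for("P1") for order in permutations(log.events)]
assert all(r == ["S2"] for r in results)
```

The trade-off in this sketch is that a detached source cannot be re-attached under the same identifier; a real design would need a richer data type if that operation is required.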
2) Weak transactional semantics in the target NoSQL technology (continued)
3) Deploying and operating application services using a cloud service provider platform
[Diagram: deployment and operations tooling
Actors: Developers, Developers / Administrators, End Users
Labels: Blueprint, Java, JavaScript, GitHub, JIRA, Change Tracking, xMatters, Splunk, AppDynamics, Janitor, APICA, Domain Traffic Manager, Beanstalk, CloudFormation, Simple Systems Manager, Provisioning Service, System, Elastic Load Balancer, Service, AWS Console]
We achieved our schedule objective
• Production cutover was about 16 months after project launch
• Complete RDBMS transition took about six additional months
Results
We achieved our scalability objective for the Tree database
[Chart: Peak DB Transactions versus Capacity, with a linear trend line for Peak DB Transactions
Annotations: Convert from R3-2XL to R4-4XL Instance Type; 33 to 24 Nodes; DSE 4.8.7 to DSE 5.0.9; SSD to EBS Storage; 36 to 27 Nodes; 27 to 33 Nodes]
We achieved our performance objective for the Tree web client
[Chart: Web Client-Facing Service Latency
Y-axis: Latency (Milliseconds), 0 to 5000
X-axis: Sunday, every two weeks from 3/13/2016 to 10/23/2016
Series: 7:00:00 PM, 2:30:00 PM, 12:00:00 PM, 8:00:00 AM]
We achieved our affordability, scalability, and security objectives via our cloud platform approach
Statistic Average
Code Check-Ins / Builds Per Day 250
Deploys to Production Per Day 161
Deploy Time in Minutes 10
Auto-scale Events Per Day 1820
The miracle occurred and it’s working!
Summary