WEBINAR
Solving Enterprise Business Challenges Through Scale-Out Storage & Big Compute
Michael Basilyan, Product Manager, Google Cloud Platform
Scott Jeschonek, Director of Cloud Products, Avere Systems
Rob Futrick, CTO, Cycle Computing
Housekeeping
• Slides
• Questions
• Recording
• Attachments
Presenters
Michael Basilyan, Product Manager
Scott Jeschonek, Director of Cloud Products
Rob Futrick, CTO
Introduction to Google Cloud Platform
Focusing on Compute Engine & Storage
Michael Basilyan, Product Manager, GCE
Agenda
• Google Cloud Overview
• Compute Engine VMs:
• GCE VM Instances & Managed Infrastructure
• Storage:
• Block Storage
• Cloud Storage
What is Google Cloud Platform?
Google Cloud Platform Services
Service categories: Management, Compute, Storage, Networking, Data, Machine Learning
Networking: Virtual Network, Load Balancing, CDN, DNS, Interconnect
Management: Stackdriver, Identity and Access Management
Machine Learning: Cloud ML, Speech API, Vision API, Translate API, Natural Language API
GCE: Compute & VM Features
VM Live Migration = No Downtime
Custom Machine Types
Average Savings: 19%
Create VMs shaped for your workloads instead of shaping your workloads to fit pre-defined VMs.
Preemptible VMs
Ideal for batch, grid, and fault-tolerant workloads
• Save 80% off regular VM list prices: flat $0.01 per core hour
• Flat pricing with no complex bidding or competition
• Same performance (CPU, I/O, Net) as regular VMs
Example uses: Hadoop, Rendering/Transcoding, Genomics, Monte Carlo Simulations, etc.
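As a rough illustration of the pricing on this slide, the sketch below estimates the compute-only cost of a batch run. The two constants come from the slide; the function names are hypothetical, the list price is only inferred from the 80% figure rather than quoted, and work lost to preempted instances is ignored.

```python
# Figures quoted on the slide; everything else here is illustrative.
PREEMPTIBLE_PRICE_PER_CORE_HOUR = 0.01  # "flat $0.01 per core hour"
DISCOUNT_VS_LIST = 0.80                 # "Save 80% off regular VM list prices"


def implied_list_price_per_core_hour() -> float:
    """List price implied by the two slide figures (an inference, not a quoted price)."""
    return PREEMPTIBLE_PRICE_PER_CORE_HOUR / (1.0 - DISCOUNT_VS_LIST)


def batch_cost(cores: int, hours: float, preemptible: bool = True) -> float:
    """Compute-only cost of running `cores` cores for `hours` hours.

    Assumption: no work is lost or repeated when instances are preempted.
    """
    if preemptible:
        rate = PREEMPTIBLE_PRICE_PER_CORE_HOUR
    else:
        rate = implied_list_price_per_core_hour()
    return cores * hours * rate


if __name__ == "__main__":
    # Hypothetical job: 1,000 cores for 10 hours.
    print(f"preemptible:  ${batch_cost(1000, 10):,.2f}")
    print(f"implied list: ${batch_cost(1000, 10, preemptible=False):,.2f}")
```

Under these assumptions a 1,000-core, 10-hour job comes to about $100 on preemptible VMs versus about $500 at the implied list price.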
Managed Infrastructure - zero devops for IaaS
Create Groups of Instances
- Define Instance Template
- Deploy Docker containers or apps directly
- Automatically connect new instances to load balancer
Autoheal
- Use app level healthcheck to signal issue
- Get machine recreated or restarted
Autoscale
- Add/Remove instances automatically based on scaling policy (CPU utilization, LB load, Custom Metrics)
- Scale pool of workers with task queue
Update
- Deploy new version of your software with rolling update while serving traffic
- Do canary, % rollout, control pace, roll-back
- Recreate in place or surge instances
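The autoscaling bullet can be pictured with a simple target-tracking rule. This is a minimal sketch of the general idea, not the algorithm the GCE autoscaler actually uses; the function name, parameters, and default bounds are all hypothetical.

```python
import math


def desired_instances(current: int, observed_util: float, target_util: float,
                      min_size: int = 1, max_size: int = 100) -> int:
    """Size the group so that average utilization lands near the target.

    Illustrative rule only: scale the current size by observed/target
    utilization, round up, then clamp to the allowed group size.
    """
    if current < 1 or target_util <= 0:
        raise ValueError("current must be >= 1 and target_util must be > 0")
    wanted = math.ceil(current * observed_util / target_util)
    return max(min_size, min(max_size, wanted))


if __name__ == "__main__":
    # 10 instances at 75% CPU with a 50% target: grow the group.
    print(desired_instances(10, 0.75, 0.5))
    # 10 instances at 25% CPU with a 50% target: shrink the group.
    print(desired_instances(10, 0.25, 0.5))
```

The same shape works for the other signals the slide lists (load balancer load, custom metrics, or task-queue depth) by swapping what `observed_util` measures.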
Ways we save you money
● Preemptible VMs
● Custom Machine Types
● Per-minute billing
● Sustained Use Discount
○ The more you use, the bigger the discount. Automatically.
● Instance right-sizing
○ Instance recommendations displayed on VM Instances Page
○ Single Button Actuation
Block & Object Storage
Where do I store my data?
Object: Cloud Storage. Good for: binary or object data (BLOB). Such as: media, analytics, archive/backup.
NoSQL: Cloud Datastore. Good for: hierarchical, mobile, web. Such as: user profiles, game state.
NoSQL: Cloud Bigtable. Good for: heavy read + write, events. Such as: AdTech, financial, IoT.
Relational: Cloud SQL. Good for: web frameworks. Such as: CMS, eCommerce.
Warehouse: BigQuery. Good for: data warehouse. Such as: analytics, dashboards.
Block: Persistent Disk (GCE). Good for: local VM file storage. Such as: application data/binaries.
Block Storage
Reliable, high-performance block storage for virtual machine instances on GCE

Standard Persistent Disk
Target scenarios: Large data processing workloads and some enterprise applications; genomics processing, video transcoding in GCE
Features: Persistent storage; cost sensitive ($0.04 per GB)

SSD Persistent Disk
Target scenarios: High performance database and enterprise applications; MySQL, SQL Server, Oracle
Features: Persistent storage; performance sensitive ($0.17 per GB)

Standard and SSD Persistent Disk: Encryption, snapshots; 64 TB; disk size sets performance (attach larger VMs for max SSD performance)

Local SSD
Target scenarios: In-memory databases; high-performance scratch space
Features: Ephemeral storage; highest performance ($0.218 per GB); encryption; 3 TB
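To make the price gap concrete, the sketch below multiplies the per-GB prices from the slide by a volume size. The dictionary keys and function names are hypothetical labels, and the slide does not state the billing period, so the results are simply "cost per billing period".

```python
# Per-GB prices as listed on the slide. The billing period is not stated there.
PRICE_PER_GB = {
    "standard-pd": 0.04,   # Standard Persistent Disk
    "ssd-pd": 0.17,        # SSD Persistent Disk
    "local-ssd": 0.218,    # Local SSD (ephemeral)
}


def storage_cost(disk_type: str, size_gb: float) -> float:
    """Cost of `size_gb` of the given disk type for one billing period."""
    return PRICE_PER_GB[disk_type] * size_gb


def cost_table(size_gb: float) -> dict:
    """Cost of the same capacity on every disk type, rounded to cents."""
    return {name: round(storage_cost(name, size_gb), 2) for name in PRICE_PER_GB}


if __name__ == "__main__":
    for name, cost in cost_table(1000).items():
        print(f"{name:12s} ${cost:,.2f}")
```

For 1,000 GB this gives roughly $40, $170, and $218 respectively. Because disk size also sets performance on persistent disks, the cheapest capacity is not always the cheapest way to reach a given throughput.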
Cloud Storage: Object/Blob store
● Google Cloud Storage is a scalable object storage service suitable for all kinds of unstructured data.
● Cloud Storage vs. Persistent Disk:
○ Scales to exabytes.
○ Accessible from anywhere.
○ REST interface; higher latency than locally attached block storage (PD).
○ Write semantics include insert and overwrite file only.
○ Offers versioning.
○ Cheaper!
● Lots of guidelines on picking storage on our site.
Regions and Zones
Google Cloud Platform Infrastructure
Google Cloud Platform is built on a datacenter network infrastructure that supports Google scale, performance, and availability
[Map legend: current regions and number of zones; committed regions for 2017 and number of zones; edge points of presence; network]
Regions shown: Singapore, S Carolina, N Virginia, Belgium, London, Tokyo, Taiwan, Mumbai, Sydney, Oregon, Iowa, Frankfurt, São Paulo, Finland
https://peering.google.com
https://cloud.google.com/compute/docs/regions-zones/regions-zones
Cloud HPC: Data Access Challenges
Scott Jeschonek, Director of Cloud Products
HPC in the Cloud
• Bring 100s or 1000s of cores online, quickly and efficiently
• Networking within the Cloud Compute environment minimizes compute latency
• Creative use of preemptible / spot market VM instances allows large numbers of worker nodes at reasonable cost
“Pure” Cloud HPC
• Entire grid in Compute Cloud
• Data is located locally
• Cloud Storage options may be used
• 3rd party Data may be incorporated (from their cloud storage)
Hybrid HPC
Existing HPC clusters:
Capital investment:
- Possibly sunk cost already
Logical investment:
- Hardware tuned
- Storage optimized
- Network optimized
- Daily ops dependent on status quo
Cloud HPC clusters:
Transient investment:
- Can build on demand infrastructure
Expand on-prem:
- Use orchestration and grid management to extend jobs into cloud
- Schedule jobs based on performance / cost requirements
Hybrid HPC
Grids On-Demand
Latency “Kills”
• Access to Data is the main challenge for HPC
• Amplified in the cloud:
- Data has to be located on or near the worker nodes
- Data may be in your datacenter
- Copy it all to the cloud?
- Costs for workers grow if data has to be copied to local disks
- Pipelines may require multiple writes (of results)
- Writes to local storage increase consistency risks
- Writes back to on-prem storage introduce significant latency
Using a Data Access Layer
Advantages of Data Access Layer
Keep your data on prem! – Data in cloud is only there while the compute nodes work the jobs.
- Reduce the security objections, simplify the move to cloud
Increase cloud compute performance – using file system caching, most of the data will be in RAM, close to the nodes
- Avoids ingest latencies and slashes transit latency after first read
Scale out – using a solution that facilitates 10s of 1000s of core file system connections
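The caching advantage above can be pictured as a read-through cache sitting between the cloud workers and the on-prem store. The sketch below is a generic in-memory LRU cache, not Avere's implementation; the class name, the `fetch_from_origin` callback, and the capacity default are all hypothetical.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Tiny LRU read-through cache in front of a slow origin store."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes], capacity: int = 1024):
        self._fetch = fetch_from_origin      # e.g. a read over the WAN to on-prem storage
        self._capacity = capacity            # maximum number of cached entries
        self._blocks: "OrderedDict[str, bytes]" = OrderedDict()
        self.origin_reads = 0                # how many reads paid the full latency

    def read(self, path: str) -> bytes:
        if path in self._blocks:
            self._blocks.move_to_end(path)   # mark as most recently used
            return self._blocks[path]        # fast path: served from memory
        data = self._fetch(path)             # slow path: go back to the origin
        self.origin_reads += 1
        self._blocks[path] = data
        if len(self._blocks) > self._capacity:
            self._blocks.popitem(last=False)  # evict the least recently used entry
        return data


if __name__ == "__main__":
    on_prem = {"/data/genome.bin": b"ACGT"}   # stand-in for the on-prem filer
    cache = ReadThroughCache(on_prem.__getitem__)
    for _ in range(1000):                     # 1,000 worker reads of the same file
        cache.read("/data/genome.bin")
    print(cache.origin_reads)                 # only the first read reaches on-prem
```

Only the first read of each file pays the transit latency; repeated reads from many workers are served close to the compute nodes, and nothing persists in the cloud once the cache is discarded.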
Hybrid Cloud / Hybrid HPC Using Avere Technology
Customer Needs: Avere Delivers
Low-latency file access: Edge-Core Architecture
Scalable Performance and Availability: Scale-out Clustering
NFS & SMB interfaces: FlashCloud File System for Object
Single pool of storage: Global Namespace
High Security: AES-256 Encryption, KMIP
Flexibility: Physical and virtual products
Lessons Learned from 10 Years
How Cloud Changes Big Compute
Rob Futrick, CTO
The Broad Institute
Need: 270,000 hours of computing
Why: Machine learning to map relationships among cancer datasets
© Copyright Cycle Computing LLC | All Rights Reserved
Scaling Machine Learning @ The Broad Institute
• Internal cluster queue too long
• Up & running in 1 hour, scaled and completed project in 2 weeks
• 30 years of computing in 6 hours!
• 51,200 cores to run R ML framework
• Submit jobs, orchestrate ML application
• Encrypt, route data to Cloud, return results
• Secure Cluster
• Cell Line Data, RNA, DNA
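The headline numbers on these two slides can be sanity-checked with a few lines of arithmetic. The sketch assumes that "270,000 hours of computing" means core-hours and uses a 365-day year; both are interpretations, not statements from the slides.

```python
HOURS_PER_YEAR = 365 * 24   # 8,760; leap years ignored

cores = 51_200              # from the slide
wall_clock_hours = 6        # from the slide
needed_hours = 270_000      # from the slide; assumed to be core-hours

# Capacity of the cloud cluster over the 6-hour run.
core_hours_available = cores * wall_clock_hours

# The stated need expressed as years on a single core.
years_equivalent = needed_hours / HOURS_PER_YEAR

if __name__ == "__main__":
    print(f"core-hours available: {core_hours_available:,}")
    print(f"need in single-core years: {years_equivalent:.1f}")
    print(f"need fits in the run: {core_hours_available >= needed_hours}")
```

Under these assumptions the cluster supplies 307,200 core-hours in 6 hours, which covers the 270,000-hour requirement, and that requirement is a little under 31 single-core years, consistent with the "30 years in 6 hours" claim.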
Manufacturing & Electronics
Pharma & Biotech
Financial & Insurance
Media & Entertainment
Oil & Gas
65% of G2000 are limited by access to Big Compute
The Challenges: Cloud & HPC Big Compute
User Inputs
Existing Workflows
Data Dependencies
Instance types
Applications
Scalability
Budget Controls
Authorization
Security Stack
IT LOB Inputs
Job scripts & data
Cloud accounts
Storage / Data sources
OS variations
AD / LDAP Authorization
The Solution: CycleCloud for Cloud HPC & Big Compute
User Inputs
Existing Workflows
Data Dependencies
Instance types
Applications
Scalability
Budget Controls
Authorization
Security Stack
IT LOB Inputs
Job scripts & data
Cloud accounts
Storage / Data sources
OS variations
AD / LDAP
Audit/Compliance data
Usage data (User, Group, App)
Job run-time by instance data
AppServer platform
Internal
Who is Cycle Computing?
• Leader in Cloud Big Compute/HPC
- Pioneering Cloud Management Software for 10 years
- 370M compute-hours managed
- Compute hour growth: 7x every 2 years
• CycleCloud Value Proposition
- Simple Managed Access to Big Compute
- Accelerating Innovation for the Enterprise
=> Faster time to result, with cost control
• Our customers
- Fortune 500, startups, and public sector
- Life sciences & pharma, financial services, manufacturing, insurance, electronics
© 2016 Copyright | All rights reserved
7 Lessons Learned
#1 – Zero waiting in line for compute
#2 – Ask questions of any scale
Ask the right question, regardless of scale
Think about the problem first
Then the system
#3 – Users with unique requirements are OK
Trivial to support different use cases
Different GPU, RAM, SSD, OS needs can be created easily
Move workloads that don’t fit internally to Cloud
#4 – Cloud gets faster/cheaper over time
#5 – Time & cost are the sole metrics that matter
Everything you don’t think about!
#6 – Accelerating answers accelerates people
Computing 720, Analysis 720, Computing 720, Analysis 720 (hours)
2880 hours / 120 Days to Decision
SCALABLE COMPUTING, ANTICIPATED BENEFIT (in hours): Computing 8, Analysis 720, Computing 8, Analysis 720
1456 hours / 60.6 Days to Decision
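The two totals on this slide follow from simple arithmetic over compute-then-analysis cycles. The sketch below assumes two such cycles per decision, which is how the four bars on the chart appear to be arranged; the function names are hypothetical.

```python
def hours_to_decision(compute_hours: float, analysis_hours: float,
                      iterations: int = 2) -> float:
    """Total elapsed hours for `iterations` compute-then-analysis cycles."""
    return iterations * (compute_hours + analysis_hours)


def to_days(hours: float) -> float:
    """Convert elapsed hours to 24-hour days."""
    return hours / 24


if __name__ == "__main__":
    before = hours_to_decision(compute_hours=720, analysis_hours=720)
    after = hours_to_decision(compute_hours=8, analysis_hours=720)
    print(f"before: {before:.0f} h = {to_days(before):.1f} days")
    print(f"after:  {after:.0f} h = {to_days(after):.1f} days")
```

Cutting each compute step from 720 hours to 8 takes the total from 2,880 hours (120 days) to 1,456 hours (about 60.7 days, shown as 60.6 on the slide). Analysis time then dominates, which is the slide's point about people becoming the remaining bottleneck.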
POST ADOPTION: AGILE DESIGN PROCESS
Computing & Analysis: Higher Quality Output, Iterative Analysis, Less Context Switching
#7 – Every smart person gets their own workspace
Old: Shared internal cluster
• Competition for resources
• Waiting in line for compute
• Zero sum game between users
New: Cluster Per Researcher
• Remove bottlenecks
• Cost controls to manage $
• No waiting = 2x faster users
Lessons Learned Summary
1. Zero Queue Wait for computing
2. Any scale, Any time
3. Users with unique requirements are ok
4. Performance goes up over time, same cost
5. Time and Cost are the sole metrics
6. Faster iterations
7. Every researcher gets their own workspace
Questions & Answers
Contact Information
Michael Basilyan, Product Manager
Scott Jeschonek, Director of Cloud Products
AvereSystems.com
Rob Futrick, CTO