Developing the Stratoscale System at Scale Muli Ben-Yehuda Chief Scientist devopsdays TLV October 2015

Developing the Stratoscale System at Scale - Muli Ben-Yehuda, Stratoscale - DevOpsDays Tel Aviv 2015


Developing the Stratoscale System at Scale

Muli Ben-Yehuda, Chief Scientist

devopsdays TLV

October 2015


What is the Stratoscale System?

Run Virtual Machines & Containers

High Performance Storage


Intelligent Resource Management

The Stratoscale operating system turns any cluster of standard x86-servers into a single, intelligent, private cloud for running virtual machines and containers.

Stratoscale provides all the necessary software components - including software-defined storage and networking, compute (hypervisor), and management services - required for building and running your very own cloud infrastructure.


Who We Are

• Software focused on intelligent, scale-out hyper-converged infrastructure
• Targeting Enterprises, Service Providers and Web-Scale users
• Founded in 2013
• Backed by top VCs and strategic investors
  • $42M in two investment rounds
  • 12 patents submitted, 20+ drafts to be filed
• Experienced management team
  • Anobit, Waze, XIV, Mellanox, Primesense, Panaya, ...
• Team of 70+ leading experts
• Based in Herzliya, Israel and Boston, MA


What This Talk Is About: Scaling

● People

● Processes

● Systems

● Development


Context: The Technology Stack

Remote Memory
AC/PC Live Migration
SLA-Based Computing
Scale-Out Distributed Storage
Cloud Management Stack
HA Clustering
Analysis and Insight Generation
Memory Dedup & Compression
Storage Dedup & Compression
Single Pane Mgmt
Standard APIs


It's not your usual devops env

● Kernel and hypervisor
● Distributed Storage
● Networking
● Cloud management
● UI/UX


In the beginning

● 0-10 developers
● Single git repo for all company source code

● Atlassian Bamboo for CI/CD

● Softlayer bare-metal servers


CI not keeping up

● 10-20 developers
● Let's write our own CI system
● How hard can it be?


SO YOU BELIEVE WRITING YOUR OWN CI WAS A GOOD IDEA?

TELL ME MORE ABOUT GROWTH


Growing pains

● 20+ developers
● Build times are long and getting longer
● Multiple build types
● Rapid growth

● 1 vanilla, 3 vanillas, 5 vanillas, …
● Everyone is an owner → no one is an owner
● Cascading changes affect everyone immediately

● Build is broken more often than it is not


Scaling development (tools)

● Goal: 50+ developers
● First, we need some tools
● Osmosis is an rsync replacement with git tendencies
● Solvent is a build artifact repository
● The Inaugurator is a tiny Linux image that does self-provisioning for bare-metal servers
● Upseto is a repo/git-submodule replacement
● These tools and others are available at https://github.com/Stratoscale/
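One way to read "rsync replacement with git tendencies" is a content-addressed store: files are keyed by a hash of their content, so identical files are stored once and a tree is described by a small manifest. The sketch below illustrates that general idea only. It is not Osmosis's actual design or API; the names `checkin` and `checkout` and the flat on-disk layout are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def checkin(tree: Path, store: Path) -> dict[str, str]:
    """Copy each file under `tree` into `store`, keyed by its SHA-256 digest.

    Returns a manifest mapping relative path -> digest. Content that is
    already in the store is not written again.
    """
    store.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for path in sorted(p for p in tree.rglob("*") if p.is_file()):
        data = path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        blob = store / digest
        if not blob.exists():  # identical content is stored only once
            blob.write_bytes(data)
        manifest[path.relative_to(tree).as_posix()] = digest
    return manifest


def checkout(manifest: dict[str, str], store: Path, dest: Path) -> None:
    """Recreate the tree described by `manifest` under `dest`."""
    for rel, digest in manifest.items():
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes((store / digest).read_bytes())
```

Because the manifest is just paths and digests, two build outputs that share most files cost little extra storage, which is one reason this style of store suits build artifacts and machine images.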


Scaling development (tests)

● Unit tests as part of dev flow
● make → run unit tests
● Jenkins → continuous build
● Unit tests for function/class/multiple classes
● Whitebox tests for testing daemons at the API level
● Voodoo for mock objects (https://github.com/shlomimatichin/Voodoo-Mock)
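As a rough illustration of unit-testing one class against a mocked collaborator, here is a minimal sketch. It uses Python's standard `unittest.mock` as a stand-in, not Voodoo-Mock, and the `NodeMonitor` class and its `ping` transport are hypothetical names invented for the example.

```python
import unittest
from unittest import mock


class NodeMonitor:
    """Hypothetical daemon helper: reports whether a node answers a ping."""

    def __init__(self, transport):
        self._transport = transport

    def is_alive(self, node):
        try:
            return self._transport.ping(node) == "pong"
        except TimeoutError:
            return False


class NodeMonitorTest(unittest.TestCase):
    def test_alive_when_node_answers(self):
        transport = mock.Mock()
        transport.ping.return_value = "pong"
        self.assertTrue(NodeMonitor(transport).is_alive("node-1"))
        transport.ping.assert_called_once_with("node-1")

    def test_not_alive_on_timeout(self):
        transport = mock.Mock()
        transport.ping.side_effect = TimeoutError
        self.assertFalse(NodeMonitor(transport).is_alive("node-1"))
```

Saved as a module, the tests run with `python -m unittest`, which is the kind of command a `make` target could invoke on every build.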


The Rackattack

● Tests at scale require lots of iron to run on
● Some tests can run in VMs
● But there is no substitute for bare-metal servers
● Rackattack allocates, provisions, and reclaims bare-metal servers in seconds, using Osmosis, Solvent, and the Inaugurator
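The allocate, provision, reclaim cycle can be pictured as a resource pool whose nodes always return to the pool when a test finishes, even if it fails. The following is a toy in-memory sketch of that pattern, not Rackattack's real client API; `NodePool`, `allocate`, and the image label are hypothetical.

```python
from contextlib import contextmanager


class NodePool:
    """Hypothetical in-memory stand-in for a rack of bare-metal servers."""

    def __init__(self, hostnames):
        self._free = list(hostnames)
        self._busy = {}  # hostname -> label of the image it was provisioned with

    @contextmanager
    def allocate(self, count, image):
        if count > len(self._free):
            raise RuntimeError(f"requested {count} nodes, {len(self._free)} free")
        nodes = [self._free.pop() for _ in range(count)]
        for node in nodes:
            self._busy[node] = image  # "provision": record the installed image
        try:
            yield nodes
        finally:
            for node in nodes:  # "reclaim": nodes go back even if the test raised
                del self._busy[node]
                self._free.append(node)

    def free_count(self):
        return len(self._free)
```

A test would then wrap its body in `with pool.allocate(2, "some-image") as nodes:` so that a crashing test cannot leak machines.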


Subsystem tests

● Isolate subsystems
  ● Management
  ● Storage
  ● Networking
  ● Cluster
  ● Runtime
● Test features
● Integration with neighbours


System tests

● End-to-end testing
● Using API/CLI/GUI

● Allocate nodes → install system → run test scenarios

● Test user stories
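The allocate → install → run flow can be sketched as three small steps with scenarios expressed as plain functions. Everything here is a hypothetical stand-in (the `Cluster` type, the function names, the scenarios); a real harness would talk to actual machines and to the product's API/CLI/GUI.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Cluster:
    """Hypothetical handle to a set of allocated test nodes."""
    nodes: list[str]
    installed: bool = False


def allocate_nodes(count: int) -> Cluster:
    # Stand-in for asking the lab for `count` machines.
    return Cluster(nodes=[f"node-{i}" for i in range(count)])


def install_system(cluster: Cluster) -> None:
    # Stand-in for installing the system under test on every node.
    cluster.installed = True


def run_scenarios(
    cluster: Cluster,
    scenarios: dict[str, Callable[[Cluster], None]],
) -> dict[str, bool]:
    """Run each scenario; a scenario fails by raising AssertionError."""
    if not cluster.installed:
        raise RuntimeError("install the system before running scenarios")
    results = {}
    for name, scenario in scenarios.items():
        try:
            scenario(cluster)
            results[name] = True
        except AssertionError:
            results[name] = False
    return results


def scenario_has_three_nodes(cluster: Cluster) -> None:
    assert len(cluster.nodes) == 3


def scenario_has_five_nodes(cluster: Cluster) -> None:
    assert len(cluster.nodes) == 5
```

Keeping each user story as its own function means one failing scenario is reported by name without stopping the others.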


Problems solved

● 50+ developers
● Fast dev → test → deploy → run cycles
● Fast provisioning of bare-metal and virtual test envs
● Rapid test feedback
● Automated tools for dev/test/ops
● Eat our own dogfood


The next challenges

● 2x order of magnitude scaling
● 200+ developers x 1K nodes per developer
● We need better API definitions
● We need better testing coverage
● We need ingrained best practices
● (Even more) continuous integration & continuous delivery
● How do you do on-premise continuous delivery?
● Serviceability - call home, logs, analysis, ...


In conclusion

● Scaling up is hard to do
● Sense of accomplishment: guaranteed
● Different approaches for different growth stages
● Find the right mix of DIY and available solutions
● Testing is crucial
● Devops is not just for web apps


Thank you!

[email protected]