ryan-thomas
Mesos in development at Atlassian
Mesos @ Atlassian
MesosCon14 Ryan Thomas
Development Team Lead
Overview
• Why?
• Design Considerations
• Service Design
• Platform Design
• Current Status / Challenges
Why?
• Atlassian applications are traditionally single-tenanted monolithic Java applications
• Decompose shared services out of the applications
• Facilitate quick service development cycles
• Utilise current delivery pipeline
Design Considerations
• This must run in AWS
• Inside of a VPC (with no public routes)
• Single logical point of external ingress through Riverbed Stingrays
• Single logical point of external egress through squid proxies
• Accessible from our data centres over ADCs
• Service SOE must be decoupled from service itself
Service Design
• Container-based service SOE
• Services are packaged as “slugs”
• Services must be stateless
• No guarantee consecutive requests go to the same service node
• Services declaratively described in a Service Descriptor
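The slides do not show what a Service Descriptor contains. As a minimal sketch of the idea, assuming hypothetical field names (`slug_url`, `instances`, `health_check_path`, `env`) that are not taken from the talk, a declarative description of a stateless service could look like this:

```python
from dataclasses import dataclass, field


@dataclass
class ServiceDescriptor:
    """Hypothetical declarative description of a stateless service."""

    name: str
    slug_url: str                       # assumed: where the packaged "slug" lives
    instances: int = 2                  # stateless, so scaling is just a count
    health_check_path: str = "/healthcheck"
    env: dict = field(default_factory=dict)

    def validate(self):
        """Return a list of problems; an empty list means the descriptor is usable."""
        problems = []
        if not self.name:
            problems.append("name is required")
        if self.instances < 1:
            problems.append("instances must be >= 1")
        if not self.health_check_path.startswith("/"):
            problems.append("health_check_path must start with '/'")
        return problems
```

Because the descriptor is plain data, the platform (not the service) decides how to turn it into running containers, which keeps the SOE decoupled from the service.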
Platform Design
Platform Iteration 1
• Container Manager: Mesos + Marathon + mesos-docker-executor
• Service Locator: Netflix Eureka!
• Service Manager, Slug Builder, Load Balancer Manager: All In House
• All platform services in Docker containers
• Worked fine in development on a single machine
• Doesn’t work in reality due to MESOS-809
• Services require Eureka-client for registration / heartbeat
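The registration / heartbeat requirement can be illustrated with a small in-memory lease registry. This is a sketch of the general pattern only, not the Eureka client or its API; the class name, method names, and lease length are assumptions made for the example:

```python
class LeaseRegistry:
    """Toy service registry: instances stay listed only while they keep heartbeating."""

    def __init__(self, lease_seconds=30.0):
        self.lease_seconds = lease_seconds
        self._last_seen = {}  # (service, instance_id) -> time of last contact

    def register(self, service, instance_id, now):
        self._last_seen[(service, instance_id)] = now

    def heartbeat(self, service, instance_id, now):
        """Renew a lease; return False if the instance is unknown and must re-register."""
        key = (service, instance_id)
        if key not in self._last_seen:
            return False
        self._last_seen[key] = now
        return True

    def live_instances(self, service, now):
        """Instance ids of `service` whose lease has not yet expired."""
        return sorted(
            instance_id
            for (name, instance_id), seen in self._last_seen.items()
            if name == service and now - seen <= self.lease_seconds
        )
```

The cost noted on the slide follows from this shape: every service has to embed a client that calls `register` and `heartbeat`, so the locator leaks into the service itself.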
Platform Iteration 2
• Platform services provisioned on “bare metal”
• Service Locator: Marathon
• Generally worked well
• Edge would health check its pool members and keep those still starting up from serving traffic
• Service upgrades required a service interruption
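A minimal sketch of the health-check gating described above, assuming a hypothetical rule that a pool member must pass a fixed number of consecutive checks before it receives traffic (the talk does not state the actual rule the edge used):

```python
from dataclasses import dataclass


@dataclass
class PoolMember:
    address: str
    consecutive_passes: int = 0  # reset to 0 on any failed check


def record_check(member, passed):
    """Update a member's streak after one health check."""
    member.consecutive_passes = member.consecutive_passes + 1 if passed else 0


def routable(pool, required_passes=2):
    """Addresses allowed to serve traffic; members still starting up are excluded."""
    return [m.address for m in pool if m.consecutive_passes >= required_passes]
```

A freshly started member begins with a streak of zero, so it stays out of rotation until it has proven itself healthy.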
Platform Iteration 3
• Container Manager: Mesos + Aurora
• Service Locator: Aurora
• Aurora provides a lot of the functionality we were planning on building into our Service Manager
• Seamless, rolling upgrades
• Service health checks
• Notifications / service ownership / quotas
• Aurora setup & deployment was a little more involved than Marathon
• Load Balancer Manager & Service Manager need to talk Thrift to Aurora
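The rolling-upgrade idea can be sketched generically as below. This is not Aurora's implementation or its Thrift API; `upgrade` and `is_healthy` are hypothetical callbacks supplied by the caller, and `batch_size` is an assumed parameter name:

```python
def rolling_upgrade(instances, upgrade, is_healthy, batch_size=1):
    """Upgrade instances one batch at a time, stopping at the first unhealthy batch.

    Returns (done, failed): instances upgraded and verified, and the instances
    in the batch that failed its health check (empty if everything succeeded).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    done = []
    for start in range(0, len(instances), batch_size):
        batch = instances[start:start + batch_size]
        for instance in batch:
            upgrade(instance)
        failed = [instance for instance in batch if not is_healthy(instance)]
        if failed:
            # Later batches are left untouched so they keep serving traffic.
            return done, failed
        done.extend(batch)
    return done, []
```

Only one batch is ever out of service at a time, which is what lets an upgrade proceed without the interruption seen in Iteration 2.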
Current Status / Challenges
• Not serving production traffic yet
• How do we handle multiple regions with region-agnostic services & clients?
• Our applications have not historically been stateless; how can we run them on Mesos?
• Service testing - spin up the world vs. testing against APIs vs. just do it live
Mesos Hackathon!
Team forming / announcements @ 5:40pm (here)
Starts 9am in the Ontario Room tomorrow
HipChat Room bit.ly/1q36ScB