levon-avakyan
Content
What I will speak about
• Definitions: to be on the same page
• SRE vs DevOps: a little bit of philosophy
• Approach: how to do it well
• Cases: how we are doing it in Competitive Gaming
Definitions
To be on the same page
Reliability
A little bit of theory
Reliability is theoretically defined as the probability of success, as the frequency of failures, or, in terms of availability, as a probability derived from reliability, testability and maintainability. Reliability plays a key role in the cost-effectiveness of systems.
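As a minimal sketch of the definitions above, the two functions below compute steady-state availability from mean time between failures (MTBF) and mean time to repair (MTTR), and the probability of running for a given time without failure. The constant failure rate (exponential model) and the sample numbers are assumptions for illustration; they are not taken from the talk.

```python
import math


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def reliability(mission_hours: float, mtbf_hours: float) -> float:
    """Probability of running mission_hours without a failure,
    assuming a constant failure rate (exponential model)."""
    return math.exp(-mission_hours / mtbf_hours)


if __name__ == "__main__":
    # Hypothetical numbers: one failure every 500 h, 2 h to repair.
    print(f"availability:     {availability(500, 2):.4f}")
    print(f"24 h reliability: {reliability(24, 500):.4f}")
```

Availability answers "how often is the system up?", while reliability answers "how likely is it to survive this long without failing?"; a system can score well on one and poorly on the other.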
Reliability Engineering
A little bit of theory
• Reliability engineering is engineering that emphasizes dependability in the lifecycle management of a product.
• Reliability engineering deals with the estimation, prevention and management of high levels of "lifetime" engineering uncertainty and risks of failure.
Software Reliability
A little bit of theory
• Software Reliability (SR) depends on good requirements, design and implementation. Software reliability engineering relies heavily on a disciplined software engineering process to anticipate and design against unintended consequences.
Site Reliability Engineering
A little bit of theory
Site reliability engineering (SRE) is a discipline that incorporates aspects of software engineering and applies them to operations, with the goal of creating ultra-scalable and highly reliable software systems.
SRE might be considered a subset of DevOps that possesses additional skill sets.
Development Operations
A little bit of theory
DevOps is a term used to refer to a set of practices that emphasize the collaboration and communication of both software developers and information technology (IT) professionals while automating the process of software delivery and infrastructure changes. It aims at establishing a culture and environment where building, testing, and releasing software can happen rapidly, frequently, and more reliably.
SRE vs DevOps
A little bit of philosophy
SRE (SR) vs DevOps
Comparison
Site Reliability Engineering
• Main focus is on creating ultra-scalable and highly reliable software systems
• It is an engineering specialization
• Fully embedded in the product lifecycle
Development Operations
• Main focus is on automated deployment to production and staging environments
• It is a role
• Mostly works with environments
SRE (SR) vs DevOps
Conclusion
• SRE (SR) is a broader concept than DevOps
• We cannot set SRE (SR) against DevOps, because they pursue similar goals with different approaches
Approach
How to do it well
Product lifecycle
Pre-production
Main purpose:
• Create the specification for Development
• Clarify all details with the business
Main artefacts are the requirements and the high-level design (HLD) of the new feature/product
SRE Role:
• Review and clarify the HLD
• Add specific requirements to improve reliability and reduce the impact on players in case of failures
Development
Main purpose:
• Develop the application
• Test the application
Main artefacts are the release tag, SDD, test suites, and regulations/automation for the release
SRE Role:
• Review and clarify the SDD
• Monitoring design
• Load and performance tests (tooling, environments)
• Stress tests
• Release preparations (tooling, massive migrations, release time estimation)
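The load and performance testing item can be illustrated with a very small harness. This is only a sketch using the Python standard library: `run_load_test` and `percentile` are hypothetical helper names, and `fake_request` stands in for a real call to the system under test. Real load tests would normally use dedicated tooling and a production-like environment.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(target, requests: int, concurrency: int) -> list[float]:
    """Call target() `requests` times from `concurrency` threads and
    return the per-call latencies in seconds."""
    def timed_call(_):
        start = time.perf_counter()
        target()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed_call, range(requests)))


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile, with pct in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    def fake_request():
        # Hypothetical stand-in for a request to the system under test.
        time.sleep(0.01)

    latencies = run_load_test(fake_request, requests=200, concurrency=20)
    print(f"p50: {percentile(latencies, 50) * 1000:.1f} ms")
    print(f"p95: {percentile(latencies, 95) * 1000:.1f} ms")
```

Reporting percentiles rather than the mean keeps slow outliers visible, which is usually what matters for player-facing latency.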
Release
Main purpose:
• Check that the application is ready to go to production
• Deliver the application to the production environment
Main artefacts are the released application and the release postmortem
SRE Role:
• Review regulations
• Automate the process with standard tools
Post-Release
Main purpose:
• Monitoring
• Maintenance
• Mitigating risks and decreasing the impact on users in case of outages
Main artefacts are bugs and improvements for the dev team, and data for the product management team to analyze
SRE Role:
• L2+-L3 maintenance
• Data collection tools
Conclusion
• SRE is embedded in the whole product lifecycle
• The main aim of SRE is to increase reliability
• The scope of responsibilities varies widely and depends on the company's structure
Cases
How we are doing it in Competitive Gaming
Cases
• World of Tanks Football Tournament
• Campaigns on the WoT Global Map
World of Tanks Football Tournament
Features:
• Cross-project product
• Great importance for players and the company
• New battle type
Architecture
World of Tanks Football Tournament
Risks
World of Tanks Football Tournament
• High load
• A very long route for each battle, with many possible points of failure
• First heavy load on the Team Management System
• Many separate teams are working on the event
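The "long route" risk can be made concrete with a back-of-the-envelope calculation: when a battle has to pass through several services in sequence, the availability of the whole route is the product of the individual availabilities, assuming the services fail independently. The component names and figures below are hypothetical and only illustrate the effect.

```python
from math import prod


def route_availability(component_availabilities: list[float]) -> float:
    """Availability of a chain in which every component must work
    (assumes independent failures)."""
    return prod(component_availabilities)


if __name__ == "__main__":
    # Hypothetical route of five hops, each available 99.9% of the time.
    route = {
        "tournament_api": 0.999,
        "team_management": 0.999,
        "battle_scheduler": 0.999,
        "game_server": 0.999,
        "result_processing": 0.999,
    }
    total = route_availability(list(route.values()))
    print(f"end-to-end availability: {total:.4f}")
```

Every extra hop lowers the end-to-end figure, which is why a long route deserves explicit attention in the risk list.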
What we have done
World of Tanks Football Tournament
• Ran end-to-end load and performance tests of the system
• Got a player-count prediction from the publisher
• Based on those numbers, created a recommendation for the schedule
• Added a safety day to the schedule
• Created tooling to move tournament groups, steps and battles to another date
• Isolated battle processing and the API
• Created an autoscaling configuration for workers
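To illustrate what an autoscaling rule for workers might look like, here is a minimal sketch that derives a target worker count from the current backlog of battles. The function name, the per-worker capacity and the limits are all assumptions for the example; the talk does not describe the actual configuration.

```python
import math


def desired_workers(queued_battles: int, battles_per_worker: int,
                    min_workers: int = 2, max_workers: int = 50) -> int:
    """Target worker count for the current backlog, clamped to limits.

    All parameters are hypothetical tuning knobs for this sketch.
    """
    if battles_per_worker <= 0:
        raise ValueError("battles_per_worker must be positive")
    needed = math.ceil(queued_battles / battles_per_worker)
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    for backlog in (0, 95, 10_000):
        print(backlog, "queued ->", desired_workers(backlog, 10), "workers")
```

The lower bound keeps some capacity warm before the event starts, and the upper bound protects shared dependencies from an unbounded scale-out.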
Global Map
Features:
• Potentially growing number of battles to process
• No room for failure, because it would affect the results of a 3-week event
Architecture
Global Map
Risks
Global Map
• High load
• New gameplay features
• New vector tile engines
• No chance to move battles
What we have done
Global Map
• Massive load test of the new vector tile engine
• Additional monitoring based on game logic
• Added requirements so that most of the workers can be scaled
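Monitoring based on game logic means checking domain-level invariants rather than only CPU or error rates. As one possible sketch, the check below flags battles that should have finished by now but still have no recorded result. The `Battle` fields and the 30-minute limit are hypothetical; the real data model is not described in the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Battle:
    # Hypothetical fields for this sketch.
    battle_id: int
    scheduled_at: datetime
    finished_at: Optional[datetime] = None  # None until a result is recorded


def overdue_battles(battles, now, max_duration=timedelta(minutes=30)):
    """Return battles that should have finished by `now` but have no result."""
    return [
        b for b in battles
        if b.finished_at is None and now > b.scheduled_at + max_duration
    ]


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 20, 0)
    battles = [
        Battle(1, datetime(2024, 1, 1, 19, 0), datetime(2024, 1, 1, 19, 15)),
        Battle(2, datetime(2024, 1, 1, 19, 0)),   # no result after an hour
        Battle(3, datetime(2024, 1, 1, 19, 50)),  # still within the limit
    ]
    for b in overdue_battles(battles, now):
        print(f"ALERT: battle {b.battle_id} has no result")
```

A check like this can raise an alert while there is still time to react, which matters when battles cannot be moved to another date.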
Conclusion
• SRE (SR) is a broader concept than DevOps
• We cannot set SRE (SR) against DevOps, because they pursue similar goals with different approaches
• SRE is embedded in the whole product lifecycle
• The main aim of SRE is to increase reliability
• The scope of responsibilities varies widely and depends on the company's structure