Running applications in a production environment

Nikola Krgović

https://joind.in/talk/64924

How do most web applications start

• A CMS (WordPress, Drupal, etc.)
• A small custom-made site
• Using rapid development tools (frameworks and ORMs)
• Agile development: minimum viable product
• Deployed to a single server

Once the real traffic arrives

• Need for performance
• Price constraints force horizontal scalability
• High Availability becomes a necessity

Changes in methodology

• Agile is embraced
• 12-Factor App
• Continuous Delivery and testing
• Continuous Deployment
• DevOps

Continuous deployment

• Continuous delivery (CD or CDE) is a software engineering approach in which teams produce software in short cycles, ensuring that the software can be reliably released at any time and, when releasing the software, doing so manually.
• Continuous deployment (CD) is a software engineering approach in which software functionalities are delivered frequently through automated deployments.

Continuous deployment

• Creates a need for more complex tools
• Mandatory automated testing
• Both unit tests and integration tests are necessary

DevOps

• DevOps is a software development methodology that combines software development (Dev) with information technology operations (Ops). The goal of DevOps is to shorten the systems development life cycle while also delivering features, fixes, and updates frequently in close alignment with business objectives. The DevOps approach is to include automation and event monitoring at all steps of the software build.
• DevOps is a methodology - not a job title. :)

DevOps

DevOps practices change the life of a developer:

• Configuration management tools create the environment
• Deployments are automated: no manual “touch-ups” on the server
• No direct access to servers. Code is on shared storage, deployed through a “jumpbox”, or immutable inside a container
• Logs are centralised and available through a dedicated app, usually the ELK stack: you need to master regular expressions
• Application performance monitoring becomes a regular practice

DevOps : Configuration Management

• Configuration deployment tools like Ansible guarantee that all environments are set up the same way
• Configuration management tools like Puppet use agents, which add assurance that the environment will remain the same throughout use
• Very little effect on developers, other than the guarantee that the system will be deployed and maintained in a consistent manner

DevOps : Monitoring

DevOps : APM

DevOps : Kibana

DevOps : Logs

Development Environment

12-Factor App :

X. Dev/prod parity: Keep development, staging, and production as similar as possible

Development Environment
Typical:

• Developer’s machine (VirtualBox + Vagrant / Minikube / OKD*)
• Code / CI (GitLab with tests)
• Test systems (“Beta”)
• Staging system
• Production

*Kubernetes distribution previously known as OpenShift Origin

High Availability

• Highly available systems have no single point of failure
• Well-designed HA systems don’t keep idle redundant or “hot-standby” components: the design is “active-active”
• Well-designed apps can scale horizontally

High Availability
Typical Components

• Load Balancers
• Content Delivery Network and Object Storage
• Application Servers
• Relational Database Management System
• Key-Value storage
• Queue
• Document Storage / Object Storage / NoSQL
• Full-Text Search
• Shared Storage

High Availability System

Load Balancers
NGINX or HAProxy

• Distributes connections to application servers
• Checks application servers for health
• Terminates TLS connections
• Does cookie manipulation
• Redirects if needed
• Web Application Firewall

Load Balancers
NGINX or HAProxy

proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
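
On the application side, these forwarded headers are what let code behind the balancer see the real client and the original scheme. A minimal Python sketch, assuming a WSGI application that only the load balancer can reach (the function name `client_info` is hypothetical and not part of the talk):

```python
def client_info(environ):
    """Return (client_ip, scheme) for a request relayed by a trusted proxy.

    Assumption: only the load balancer can reach this application, so the
    forwarded headers can be trusted. Otherwise clients can spoof them.
    """
    # WSGI exposes request headers as HTTP_<NAME>, upper-cased, dashes as underscores.
    real_ip = environ.get("HTTP_X_REAL_IP", "").strip()
    forwarded_for = environ.get("HTTP_X_FORWARDED_FOR", "")

    if real_ip:
        client_ip = real_ip
    elif forwarded_for:
        # $proxy_add_x_forwarded_for appends the address the proxy saw, so the
        # right-most entry is the one written by our own load balancer.
        client_ip = forwarded_for.split(",")[-1].strip()
    else:
        # No proxy headers at all: fall back to the TCP peer address.
        client_ip = environ.get("REMOTE_ADDR", "")

    scheme = environ.get("HTTP_X_FORWARDED_PROTO") or environ.get("wsgi.url_scheme", "http")
    return client_ip, scheme
```

Without this step the application would log the load balancer's address for every visitor and could generate http:// links on a site that is served over TLS.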

CDN and Object Storage

• Object storage uses an API (usually S3) to store data
• Usually used as-a-service, but can be hosted on-prem
• Simple and easy to use from concurrent locations

• CDNs offer faster loading times for data
• Should be used for all static assets (images, CSS, JS)
• Served off a different, cookie-less domain
• Require versioning, due to long caching times
• When used as-a-service, offer a simple way to geo-distribute data and significantly speed up loading times
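
One common way to meet the versioning requirement is to put a content hash in the file name, so a changed asset gets a new URL and the long cache lifetime never serves a stale copy. A minimal Python sketch; the helper name `versioned_asset_url` and the domain `static.example.net` are assumptions for illustration:

```python
import hashlib
import posixpath


def versioned_asset_url(cdn_base, path, content):
    """Build a CDN URL whose file name changes whenever the content changes."""
    # A short content hash acts as the version number.
    digest = hashlib.sha256(content).hexdigest()[:12]
    stem, ext = posixpath.splitext(path)
    return f"{cdn_base.rstrip('/')}/{stem.lstrip('/')}.{digest}{ext}"


old = versioned_asset_url("https://static.example.net", "css/site.css", b"body{}")
new = versioned_asset_url("https://static.example.net", "css/site.css", b"body{color:red}")
print(old)
print(new)
```

The two printed URLs differ only in the hash segment, so each build can be cached indefinitely while a new deploy is picked up immediately.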

Application Servers

• Application servers must be stateless
• Applications can be stateful with shared session storage
• Deployment is done via automation
• Non-container deployments often use shared storage
• If using interpreted systems like PHP, you need to flush the cache:

opcache_reset()

Application Servers

• Unix privileges are not an enemy
• SELinux security contexts and Mandatory Access Control are your friends too

httpd_sys_content_t httpd_sys_rw_content_t

Application Servers

• A pool of servers is ~100X more powerful than your machine
• A pool of servers will have ~10,000X the visitors of your machine

• Memory is a very critical resource. Talk about it with Ops!

Key-Value storage

• Redis is the default choice; use memcached only if you must
• Redis does have high availability options
• Almost never persisted to disk. Disk is used for cache warmup
• Can be deployed shared or per-instance

• Shared Redis is needed if servers are stateful, for session storage
• Per-instance Redis is more performant, but complicates cache invalidation
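
Why per-instance caches complicate invalidation can be shown without Redis at all. In this rough Python sketch, plain dictionaries stand in for the database and for each server's local cache; the class `CachedReader` is hypothetical:

```python
class CachedReader:
    """One application server with a private cache in front of a shared store."""

    def __init__(self, store):
        self.store = store   # shared source of truth (stands in for the database)
        self.cache = {}      # per-instance cache (stands in for a local Redis)

    def get(self, key):
        # Cache-aside read: serve from the cache, otherwise load and remember.
        if key not in self.cache:
            self.cache[key] = self.store[key]
        return self.cache[key]

    def set(self, key, value):
        # The writer can only invalidate the cache it owns.
        self.store[key] = value
        self.cache.pop(key, None)


store = {"price": 10}
server_a = CachedReader(store)
server_b = CachedReader(store)

server_a.get("price")
server_b.get("price")
server_a.set("price", 12)

fresh = server_a.get("price")  # reloaded after the local invalidation
stale = server_b.get("price")  # server_b never heard about the write
print(fresh, stale)
```

With a shared cache there is a single copy to invalidate. With per-instance caches every write needs a broadcast, or short expiry times, to reach the other servers.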

Queue

ZeroMQ, RabbitMQ, AWS SNS

• Highly available by design
• Centralized and scalable
• Provide a simple method of asynchronously processing messages
• Provide a built-in mechanism for retrying
• Should be used instead of in-database queues
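
What a retry mechanism does for a consumer can be sketched in a few lines. This is only an illustration: Python's in-process `queue.Queue` stands in for a real broker, and the function `drain` and the `max_attempts` parameter are assumptions rather than any broker's API:

```python
import queue


def drain(jobs, handler, max_attempts=3):
    """Process every message; re-queue failures until max_attempts is reached."""
    done, dead = [], []
    while not jobs.empty():
        attempts, message = jobs.get()
        try:
            handler(message)
            done.append(message)
        except Exception:
            if attempts + 1 < max_attempts:
                # Put the message back so it is retried later.
                jobs.put((attempts + 1, message))
            else:
                # Give up; a real broker would move it to a dead-letter queue.
                dead.append(message)
    return done, dead


seen = set()


def flaky_handler(message):
    # Fails the first time it sees each message and succeeds on the retry.
    if message not in seen:
        seen.add(message)
        raise RuntimeError("temporary failure")


jobs = queue.Queue()
for name in ("resize-image", "send-email"):
    jobs.put((0, name))

done, dead = drain(jobs, flaky_handler)
print(done, dead)
```

With a real broker the attempt counting and re-delivery happen on the broker side, which is what makes it preferable to a hand-rolled queue table in the database.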

Full-Text Search

Elasticsearch or Sphinx

• Usually used in a read-only fashion
• Elasticsearch has high availability clustering
• Sphinx can be made HA with HAProxy
• Loading data into FTS needs a separate process

Document Storage
MongoDB

• Not a relational database
• Fully ACID compliant
• Great for storing objects, poor with relations and “join”-like queries
• Has built-in high availability using quorum; the only initial change is the connection string

mongodb://s1.example.net:27017,s2.example.net:27017,s3.example.net:27017/

MongoDB

• Not a relational database
• Poor performance with relations and “join”-like queries
• Queries that require object manipulation can be slow
• It is advisable to use readPref() and send slow queries to secondary instances
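
A read preference can also be set in the connection string itself, which suits a dedicated connection for slow or reporting queries. A small Python sketch that builds such a URI from the hosts used earlier; the helper `replica_set_uri` and the replica set name `rs0` are assumptions, while `replicaSet` and `readPreference` are standard MongoDB connection string options:

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="secondaryPreferred"):
    """Build a MongoDB URI that lists every member and prefers secondaries for reads."""
    options = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return f"mongodb://{','.join(hosts)}/?{options}"


hosts = ["s1.example.net:27017", "s2.example.net:27017", "s3.example.net:27017"]
reporting_uri = replica_set_uri(hosts, "rs0")
print(reporting_uri)
```

Reads sent to a secondary may lag behind the primary, so this is only appropriate for queries that can tolerate slightly old data.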

Relational Databases
MySQL or PostgreSQL

• Used to store relational data
• Always design using normal forms (1NF, 2NF, 3NF, BCNF)
• Usually has asynchronous replication
• In-app logic usually scales better than in-db, but…

Relational Databases
Indexing

• Primary keys are a must
• Covering index vs Row read
• An index too many
• Master vs Slave index

Relational Databases
Privilege Separation

• GRANT SELECT, INSERT, UPDATE, DELETE, CREATE TEMPORARY TABLES ON `schema`.* TO 'application_user'@'10.%';
• Forget about migrations from code (the application user has no schema-changing privileges)

Relational Databases
Asynchronous replication

• Assume the slave will always have ~30s of replication lag
• High availability can provide connectivity but can’t do a read-write split
• Automated solutions exist (ProxySQL, mysqlnd_ms) but still require hinting for some cases, like SELECT after INSERT
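
The SELECT-after-INSERT problem and the kind of hint it needs can be sketched as a small router that pins reads to the primary for a while after any write. This is a rough Python illustration, not how ProxySQL or mysqlnd_ms work internally; the class `ReplicaRouter` and the 30-second window (taken from the lag assumption above) are illustrative:

```python
import time


class ReplicaRouter:
    """Choose 'primary' or 'replica' per statement, pinning reads after a write."""

    def __init__(self, lag_window=30.0, clock=time.monotonic):
        self.lag_window = lag_window  # seconds to distrust the replica after a write
        self.clock = clock
        self.last_write = None

    def route(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        now = self.clock()
        if not is_read:
            self.last_write = now
            return "primary"
        if self.last_write is not None and now - self.last_write < self.lag_window:
            # The replica may not have replayed our own write yet.
            return "primary"
        return "replica"


fake_now = [0.0]  # controllable clock so the example runs instantly
router = ReplicaRouter(clock=lambda: fake_now[0])

first = router.route("SELECT * FROM orders")
write = router.route("INSERT INTO orders VALUES (1)")
fake_now[0] = 5.0
soon = router.route("SELECT * FROM orders")
fake_now[0] = 40.0
later = router.route("SELECT * FROM orders")
print(first, write, soon, later)
```

In practice one router instance would be kept per user session, so that one user's write does not push every other user's reads onto the primary.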

Relational Databases
ORMs as a disaster in production

• An ORM can be viewed as a rapid prototyping tool, but that’s it
• ORMs can slow down JOINs by orders of magnitude
• Even in very small cases with ~100,00 rows you get:

Bulk inserts were tested at 2x the speed of the ORM
Joins sometimes go over 10x faster than the ORM

Relational Databases
Optimizing in production

• Work with DBAs (or Ops) on indexing
• EXPLAIN is your friend
• Temporary tables can speed things up massively
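
How a query plan changes once a suitable index exists can be tried locally. The sketch below uses Python's built-in SQLite as a stand-in, since it needs no server; MySQL and PostgreSQL have their own EXPLAIN output formats, and the table and index names here are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

query = "SELECT total FROM orders WHERE customer_id = 42"


def plan(sql):
    # Each EXPLAIN QUERY PLAN row ends with a human-readable detail string.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


before = plan(query)
# The index contains both the filter column and the selected column,
# so the query can be answered from the index alone (a covering index).
conn.execute("CREATE INDEX idx_orders_customer_total ON orders (customer_id, total)")
after = plan(query)

print(before)
print(after)
```

The first plan should report a full table scan and the second a search that uses the new index. The exact wording depends on the SQLite version.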

Conclusions

• Keep the stack as small as possible
• Use the right tool for the right job
• Don’t use multiple tools for the same job
• Always consider that you’ll have millions of users
• When in doubt, scale horizontally

Questions…?
