ELK Stack Costs

logentries.com

Table of Contents

Introduction
What is the ELK Stack?
The ELKeBMWS Stack
  Beats
  Marvel
  Watcher
  Shield
  Cluster/Tribe Nodes
The High Cost Of Free Solutions
  Hardware Costs
  Scaling Costs
  Cloud Hosting Costs
  Data Storage Costs
  Documentation Costs
Conclusion


About the author, David Posin

David has been involved in the Information Technology Industry for 2 decades. Fifteen years of that time was spent consulting with many companies in a wide range of industries to build solid technology stacks and robust application architectures. David has watched the Cloud and the World Wide Web grow from their infancy, and now spends every day fully entrenched in those worlds. Currently, David builds high-performance web applications and offers professional technical writing services.

About Logentries

Logentries is a leading SaaS-based log management tool used for real-time log centralization, search and analysis. DevOps, Security & IT professionals use Logentries to manage both logs and unstructured machine data for immediate visibility into their IT environments. Logentries makes it easy to get insights from your log data without building, maintaining or supporting your own log management stack.


Introduction

The ELK Stack is the current preferred stack for do-it-yourself (DIY) logging. It is generally thought to be composed of three software packages: Elasticsearch, Logstash, and Kibana. The truth is that a successful ELK Stack implementation requires a great deal more than those three technologies. Even with the best of community support, DIY logging with the ELK Stack will have surprises and unexpected costs. This paper will point out some of the less well understood requirements of a robust DIY ELK Stack.

What is the ELK Stack?

The ELK Stack, also called the Elastic Stack, starts with a combination of three separate technologies configured to work together. Each piece in the ELK Stack handles one part of the general logging equation:

• Elasticsearch - Data storage and searching

• Logstash - Gathering and formatting

• Kibana - Reporting and analyzing

Adding Context

These three technologies are a good start but do not encompass the full services required for a robust production ready logging stack. There are additional technologies needed to maintain the health and security of the stack, as well as mechanisms to collect and disseminate information.

This white paper focuses on the production environment. It explores the costs and requirements for a reliable, robust, and scalable Stack. Outside of production, Elasticsearch, Kibana, and Logstash are capable of being run on the same machine. While that is true for a development environment, running a production grade stack on only one server is not advisable.

The ELKeBMWS Stack

Production environments need to be reliable and fault tolerant. Elasticsearch, Logstash, and Kibana will need to be supported by other packages. They will need monitoring and redundancy like all software packages used in production.

Beats

Having Logstash run decentralized with installations on separate machines may not be ideal. In an enterprise network, it might be preferable to have a central point to process and filter log data. It is possible to install Logstash on one server and have data shipped to it.

Centralizing log information in this way requires software called Beats on every machine being logged. Beats defines and controls the process of sending data from different log types to Logstash. Some example Beats are Packetbeat, Filebeat, and Winlogbeat. All of these are designed to ship the specific log type they are familiar with.

Marvel

Like all services in a production environment, the Elastic Stack services need to be monitored. This responsibility is accomplished with a software package called Marvel. Marvel is designed to monitor and report the health of all of your Elastic Stack components. The importance of Marvel only increases as the Stack grows. Clusters and Tribes (discussed below) can mean there are lots of independent components that all need to be monitored.

Watcher

One of the core responsibilities of any logging solution is to make people aware of critical events. The Elastic Stack has a tool called Watcher to provide this essential function. Watcher observes incoming log entries and sends notifications when certain events occur. Kibana can report on the event and Logstash can disperse it, but notifying your support staff immediately of problems before they grow requires Watcher. Notifications can be sent via email and through other services based on the configuration.

Shield

Security is always going to be a consideration when installing a service. Shield was created to meet this need for the Elastic Stack and to centralize security amongst the different Stack components. It is recommended to use this product over non-Elastic security methods. Nginx is sometimes suggested to help limit access but, as this updated blog post, “Restricting Users for Kibana with Filtered Aliases”, shows, using technology outside the Elastic stack can have unexpected consequences.

Another important security consideration is the use of HTTP communication by default. This should be changed when moving to production. Updating the stack for HTTPS can be done by using Shield. It will require some additional configuration and the appropriate certificates.

Cluster/Tribe Nodes

Finally, there are scaling issues to plan for. Elasticsearch is built using clusters to help handle and distribute Elasticsearch queries around the network. Clusters are comprised of master and data nodes, and potentially, clients. Clusters can fill up with data over time, and it will be necessary to scale. As the Elasticsearch documentation states on the “Scale is Not Infinite” page:

“Most scaling problems can be solved by adding more nodes [servers].”

It’s important to prepare for adding nodes (servers) to your network as the Elasticsearch index grows.

Eventually, even clusters won’t be sufficient to store all the data Elasticsearch encompasses.

Every Elasticsearch node and client (master nodes, data nodes, and clients) stores information about its Elasticsearch cluster, called the cluster state, which is needed for proper routing.

Eventually, the cluster state will be large enough to slow down performance. When that occurs, it will be time to introduce Tribe Nodes to the network. Tribe nodes allow searching across Elasticsearch clusters.

Installing a robust and scalable production-ready Elastic Stack takes more than Elasticsearch, Logstash, and Kibana. A full accounting of the services required is:

• Elasticsearch

• Logstash

• Kibana (with an Elasticsearch client)

• Beats (per server and data-type being logged)

• Marvel

• Watcher

• Shield

• Clusters

• Tribe nodes (not initially, but eventually over time)

A well put together Elastic Stack will require all of these pieces before it can come close to the full functionality provided by a SaaS Logging service (like Logentries).

The ELK Stack is really the ELKeBMWSC(Tn) Stack.

The High Cost Of Free Solutions

Logstash, Kibana, and Elasticsearch are free open source solutions. There is no cost to using the software in a self-hosted environment. Being open source is one of the biggest attractions of the Elastic Stack. Free is a compelling price point. Although the software is free, running it is not. There are several costs to be aware of.

Hardware Costs

The Elastic Stack is not an “install and go” solution. The number of servers required will depend on your needs. At a minimum, for any production environment, you will have to install software on three servers: Elasticsearch and Kibana will each have their own server, and Logstash will need at least one host server.

There are performance reasons to consider having users connect to Kibana on a machine separate from Elasticsearch. Elasticsearch can require a lot of CPU and memory depending on the operations being run. If Kibana is sharing those resources, the result is added latency and slow performance for the user. Running Kibana on its own server is recommended by the Elastic documentation. “While Kibana isn’t terribly resource intensive, we still recommend running Kibana separate from your Elasticsearch data or master nodes.” *

Maintaining Elasticsearch performance is a careful balance between number of servers and amount of data. To realize the full capabilities of Elasticsearch, it is necessary to distribute pieces of the searchable data amongst its servers. Expect to add servers over time to keep the search performant.

Like all services, Elasticsearch will fail on occasion, so planning for failover is important. The recommendation is to have a one-to-one ratio between each server and a replicated backup. Each primary server should keep a complete copy of its data on at least one replica server. In the event of a hardware failure, Elasticsearch will automatically switch to the replica. This is the recommendation of the Elasticsearch documentation as well: “It provides high availability in case a shard/node [server] fails. For this reason, it is important to note that a replica shard is never allocated on the same node [server] as the original/primary shard that it was copied from.”

*(https://www.elastic.co/guide/en/kibana/current/production.html).


This is especially important for not losing data in Kibana. If data is unreachable, Kibana can’t indicate its absence. Reports and graphs will simply be incorrect until the problem is realized and fixed.
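The replica rule quoted from the Elasticsearch documentation can be illustrated with a small sketch. This is not Elasticsearch's actual allocator; the round-robin placement, the function names, and the node names are all hypothetical and only show why keeping a replica off its primary's node lets every shard survive the loss of one server.

```python
from typing import Dict, List, Set, Tuple

Layout = Dict[str, List[Tuple[int, str]]]


def allocate_shards(num_shards: int, nodes: List[str]) -> Layout:
    """Place each primary and its single replica on different nodes (round-robin)."""
    if len(nodes) < 2:
        raise ValueError("need at least two nodes to keep a replica off its primary's node")
    layout: Layout = {node: [] for node in nodes}
    for shard in range(num_shards):
        primary_node = nodes[shard % len(nodes)]
        # The next node in the ring is always a different node when len(nodes) >= 2.
        replica_node = nodes[(shard + 1) % len(nodes)]
        layout[primary_node].append((shard, "primary"))
        layout[replica_node].append((shard, "replica"))
    return layout


def surviving_shards(layout: Layout, failed_node: str) -> Set[int]:
    """Return the shard ids that still have a copy after one node is lost."""
    return {
        shard
        for node, copies in layout.items()
        if node != failed_node
        for shard, _role in copies
    }


# Hypothetical two-server setup: one primary server and one replica server.
layout = allocate_shards(4, ["node-a", "node-b"])
print(surviving_shards(layout, "node-a"))
```

With two nodes, each node ends up holding one copy of every shard, so losing either node still leaves all four shards reachable.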

Servers may also be required to support the various tools mentioned above. Marvel and Watcher may require their own hardware for performance and logic reasons. Centralizing Logstash is also recommended so there will be a need for at least one Logstash server to receive log data and to send it to Elasticsearch. Therefore, the absolute minimum number of servers is 5:

• Elasticsearch primary server

• Elasticsearch replica server

• Kibana

• Logstash

• Marvel, Watcher, etc.
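The server count above is simple arithmetic, and a short sketch makes it easy to see how it grows. The function name and parameters are hypothetical, and the one-to-one replica ratio is the assumption taken from the failover recommendation earlier in this section.

```python
def minimum_servers(
    primary_data_servers: int = 1,
    replicas_per_primary: int = 1,
    kibana_servers: int = 1,
    logstash_servers: int = 1,
    monitoring_servers: int = 1,
) -> int:
    """Count servers for a stack where every primary has its own replica server(s)."""
    if primary_data_servers < 1:
        raise ValueError("at least one Elasticsearch primary server is required")
    elasticsearch_servers = primary_data_servers * (1 + replicas_per_primary)
    return elasticsearch_servers + kibana_servers + logstash_servers + monitoring_servers


print(minimum_servers())                        # the baseline described above
print(minimum_servers(primary_data_servers=3))  # after growing to three primaries
```

Because each new primary brings a replica with it under this assumption, every step in Elasticsearch capacity adds two servers, not one.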

Scaling Costs

Installing the Elastic Stack is only the beginning. It will grow over time and will need monitoring and scaling to keep it healthy. New indexes will require new servers. Growing logs will require more disk space. Changes in your data or logging structure will require reindexing your data.

Unfortunately, there will be problems that can’t be solved by throwing more disk space or servers at Elasticsearch. Scrunch.com’s blog post, “Lessons Learned From A Year Of Elasticsearch In Production”, mentions several potential performance-affecting issues to monitor. They suggest monitoring thread pools and heap memory, both of which can cause significant performance issues if their sizes are not monitored. Marvel can help with this, as well as a regular schedule of pruning and archiving.
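As a rough illustration of the kind of check described above, the sketch below scans a dictionary shaped like the response of the Elasticsearch nodes stats API and flags high heap usage and rejected thread pool tasks. The sample data, the function name, and the 75% threshold are assumptions made for this example; fetching the stats from a live cluster is deliberately left out.

```python
from typing import Any, Dict, List


def find_warnings(stats: Dict[str, Any], heap_limit_percent: int = 75) -> List[str]:
    """Flag nodes with high heap usage or rejected thread pool tasks."""
    warnings: List[str] = []
    for node_id, node in stats.get("nodes", {}).items():
        name = node.get("name", node_id)
        heap = node.get("jvm", {}).get("mem", {}).get("heap_used_percent")
        if heap is not None and heap >= heap_limit_percent:
            warnings.append(f"{name}: heap at {heap}%")
        for pool, pool_stats in node.get("thread_pool", {}).items():
            rejected = pool_stats.get("rejected", 0)
            if rejected > 0:
                warnings.append(f"{name}: {rejected} rejected tasks in '{pool}' thread pool")
    return warnings


# Hypothetical sample in the shape of a nodes stats response (values invented).
sample_stats = {
    "nodes": {
        "abc123": {
            "name": "es-data-1",
            "jvm": {"mem": {"heap_used_percent": 82}},
            "thread_pool": {"search": {"rejected": 3}, "bulk": {"rejected": 0}},
        }
    }
}

for warning in find_warnings(sample_stats):
    print(warning)
```

A check like this only reports symptoms; deciding whether to prune, archive, or add servers remains a manual judgment.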

Costs in Lost Opportunity

Setting up indexes is as much an art as it is a science. There will be a definite learning curve that will affect the quality of the data gathered. Indexes that are not configured correctly can mean important data is lost until the issue is rectified. It is important to be vigilant about what is being logged and to compare it to what should be logged.

Cloud Hosting Costs

Self-hosting will help limit cost but may not be practical or desirable. In that case, cloud hosting is the most likely option. Cloud computing will incur costs for:

• Hardware

• Data stored

• Data transferred between servers

Data transfer costs in particular can vary wildly. A major event or issue could cause a burst of activity that results in much higher than usual costs. Major bursts of activity could result in overage fees at best, and data loss at worst.
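A back-of-the-envelope calculation shows how a burst in transferred data moves the bill. Every rate and volume below is invented for illustration and does not reflect any provider's pricing; the function name is hypothetical.

```python
def monthly_cloud_cost(
    servers: int,
    server_rate: float,
    stored_gb: float,
    storage_rate_per_gb: float,
    transferred_gb: float,
    transfer_rate_per_gb: float,
) -> float:
    """Sum the three cost drivers listed above: hardware, storage, and transfer."""
    hardware = servers * server_rate
    storage = stored_gb * storage_rate_per_gb
    transfer = transferred_gb * transfer_rate_per_gb
    return hardware + storage + transfer


# Hypothetical rates: 5 servers at 100 each, storage at 0.10/GB, transfer at 0.05/GB.
normal_month = monthly_cloud_cost(5, 100.0, 500, 0.10, 200, 0.05)
# Same setup during a major incident that multiplies transferred data tenfold.
burst_month = monthly_cloud_cost(5, 100.0, 500, 0.10, 2000, 0.05)

print(f"normal month: {normal_month:.2f}")
print(f"burst month:  {burst_month:.2f}")
```

Only the transfer term changes between the two months, which is why it is the hardest of the three to budget for.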


Data Storage Costs

The amount of space needed for data storage requires careful consideration. Elasticsearch works by storing independent indexes of data. Data can be indexed more than once depending on how the indexes are configured. Additional fields added to documents for indexing purposes can also add to data size. Furthermore, storage needs will increase over time as both the data being indexed and the indexes themselves grow. It is best to prepare for storage requirements to increase.

“Data storage is probably the biggest cost you will experience over time.”
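To get a feel for how these factors compound, the sketch below estimates disk needs from a daily log volume. The indexing overhead multiplier, the growth rate, and the function name are hypothetical placeholders; real overhead depends entirely on how the indexes and fields are configured.

```python
def storage_needed_gb(
    daily_raw_gb: float,
    retention_days: int,
    replicas: int = 1,
    index_overhead: float = 1.0,
    daily_growth_rate: float = 0.0,
) -> float:
    """Estimate disk space for retained logs, including replicas and index overhead."""
    total_raw = 0.0
    daily = daily_raw_gb
    for _ in range(retention_days):
        total_raw += daily
        daily *= 1 + daily_growth_rate  # log volume grows a little each day
    # Every replica is a full extra copy of the indexed data.
    return total_raw * index_overhead * (1 + replicas)


# Hypothetical: 10 GB/day kept for 30 days, one replica, 20% indexing overhead.
flat = storage_needed_gb(10, 30, replicas=1, index_overhead=1.2)
# Same setup with log volume growing 1% per day.
growing = storage_needed_gb(10, 30, replicas=1, index_overhead=1.2, daily_growth_rate=0.01)

print(f"flat volume:    {flat:.0f} GB")
print(f"growing volume: {growing:.0f} GB")
```

Even in the flat case, 300 GB of raw logs becomes well over twice that on disk once the replica and the assumed overhead are counted.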

Documentation Costs

Meticulous documentation is essential for every Elastic Stack implementation. The institutional knowledge gained from building an Elastic Stack can’t be recreated.

This information will be extremely valuable for long-term maintenance and support. As your Stack matures and ages over time, it is important to keep the documentation current.

Conclusion

The ELK Stack is most useful when having full control over the environment is important and the needed resources are available. As illustrated here, the Elastic Stack is not a shortcut to avoid the costs of proper logging. The Elastic Stack may not have a monthly fee and may not have software licenses, but that cost is still there in the form of rigor, resources, and scaling. The decision of whether or not to use the Elastic Stack for DIY logging is not about how much it costs compared to managed services, but rather where you want to allocate your resources and funds.


Start your 30-Day Logentries Free Trial Today. Save yourself and your team from the headache of standing up and maintaining the ELKeBMWSC(Tn) stack. Logentries makes it easy to manage all of your machine data.

Get started for free at logentries.com

• Unlimited log centralization

• Secure data transmission

• Protection from log manipulation

• Easy search for known events & patterns

• Full RegEx Support

• Affordable plans

• Real-time Alerts

• Inactivity Alerts

• Anomaly Detection

• Data filtering, obfuscation

• Custom tagging of known events

• Custom retention policies

Figure: Customizable Dashboard view