Chronix as long term storage for Prometheus
Florian Lautenschlager, Moritz Kammerer
@flolaut, @phxql
[Figure: Prometheus monitoring several cloud native applications.]
Real-time monitoring and alerting for cloud native apps to detect anomalies close to their occurrence and to initiate measures.
[Figure: time axis from NOW back 14 days.]
Beyond real-time monitoring of cloud native apps?
Nothing more to do?
[Figure: Prometheus monitoring several cloud native applications; time axis from NOW back to THEN.]
Prometheus: Real-time monitoring and alerting for cloud native apps to detect anomalies close to their occurrence and to initiate measures.
Chronix: Lossless long-term storage to store data forever, allowing analyses beyond real-time monitoring!
Agenda
■ Some words about Chronix, its Architecture, its Features, and its Performance.
■ How we built the integration with Prometheus.
■ Showcase: Prometheus, Chronix Ingester, Chronix, and Grafana
Chronix is more than just a simple time series database. It’s a time series processing tool stack for all purposes.
Time Series Database: What’s that?
■ Definition 1: “A sample s is a tuple of {timestamp, value}, where the
value could be any kind of object.”
■ Definition 2: “A time series T is an arbitrary list of chronologically
ordered samples of one value type.”
■ Definition 3: “A chunk C is a chronologically ordered part of a time
series.”
■ Definition 4: “A time series database TSDB is a specialized database
for storing and retrieving time series in an efficient and optimized
way”.
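[Figure: a sample s = {t,v}; a time series T = {s1,s2,...}; a time series T1 split into chunks C1,1 and C1,2; a TSDB storing the chunks of several time series (e.g. C2,1 and C2,2).]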
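The definitions above can be sketched in a few lines of Java. This is only an illustration, not Chronix code: the class names `Sample` and `Chunking`, the fixed `chunkSize`, and the restriction to `double` values are assumptions made for the example (Definition 1 allows any kind of value object).

```java
import java.util.ArrayList;
import java.util.List;

/** A sample: a tuple of {timestamp, value} (Definition 1), here limited to double values. */
final class Sample {
    final long timestamp;
    final double value;

    Sample(long timestamp, double value) {
        this.timestamp = timestamp;
        this.value = value;
    }
}

/** Hypothetical helper that cuts a time series into chunks. */
final class Chunking {
    /**
     * Splits a chronologically ordered time series (Definition 2) into
     * chunks (Definition 3) of at most chunkSize samples each.
     */
    static List<List<Sample>> chunk(List<Sample> series, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<Sample>> chunks = new ArrayList<>();
        for (int i = 0; i < series.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, series.size());
            // Copy the sub list so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(series.subList(i, end)));
        }
        return chunks;
    }
}
```

Because the input is already ordered by time, every chunk is itself a chronologically ordered part of the series, which is all Definition 3 asks for.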
Chronix’ architecture enables both efficient storage of time series and millisecond range queries.
Pipeline: (1) Semantic Transformation, (2) Attributes and Chunks, (3) Basic Compression, (4) Multi-Dimensional Storage.
[Figure labels: Record {data:<chunk>, attributes}; Record {data:compressed <chunk>, attributes}; Record Storage; Optional.]
68 billion points = 1 million chunks * 68,000 points; ~96% compression.
The key data type of Chronix is called a record. It stores a compressed time series chunk and its attributes.
record{
  data:compressed{<chunk>}
  //technical fields
  id: 3dce1de0-...-93fb2e806d19
  version: 1501692859622883300
  start: 1427457011238
  end: 1427471159292
  //optional attributes
  host: prodI5
  process: scheduler
  group: jmx
  metric: heapMemory.Usage.Used
  max: 896.571
}
Data:compressed{<chunk of time series data>}
■ Time Series: timestamp, numeric value
■ Traces: calls, exceptions, …
■ Logs: access, method runtimes
■ Complex data: models, test coverage,
anything else…
Optional attributes
■ Arbitrary attributes for the time series
■ Attributes are indexed
■ Make the chunk searchable
■ Can contain pre-calculated values
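To illustrate why the attributes and the start/end fields make a chunk searchable, here is a minimal sketch that selects records by attribute and time range without decoding the chunk data. The classes `ChunkRecord` and `RecordSearch` are hypothetical stand-ins; in Chronix the lookup is done by the Solr/Lucene index, not by a linear scan like this one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a record: an opaque chunk plus searchable attributes. */
final class ChunkRecord {
    final byte[] data;                    // compressed chunk, never inspected during search
    final long start;                     // timestamp of the first sample in the chunk
    final long end;                       // timestamp of the last sample in the chunk
    final Map<String, String> attributes; // e.g. host, process, metric

    ChunkRecord(byte[] data, long start, long end, Map<String, String> attributes) {
        this.data = data;
        this.start = start;
        this.end = end;
        this.attributes = new HashMap<>(attributes);
    }

    /** True if the chunk's [start, end] interval intersects the query range. */
    boolean overlaps(long queryStart, long queryEnd) {
        return start <= queryEnd && end >= queryStart;
    }
}

final class RecordSearch {
    /** Selects records by attribute value and time range, using only the record's metadata. */
    static List<ChunkRecord> find(List<ChunkRecord> records, String key, String value,
                                  long queryStart, long queryEnd) {
        List<ChunkRecord> hits = new ArrayList<>();
        for (ChunkRecord r : records) {
            if (value.equals(r.attributes.get(key)) && r.overlaps(queryStart, queryEnd)) {
                hits.add(r);
            }
        }
        return hits;
    }
}
```

Only the chunks returned by such a search need to be decompressed, which is the point of keeping the attributes outside the compressed data.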
Chronix provides specialized aggregations, transformations, and analyses for time series that are commonly used.
Aggregations
■ Min / Max / Average / Sum / Count
■ Percentile
■ Standard Deviation
■ First / Last
■ Range
Analyses
■ Trend Analysis
Using a linear regression model
■ Outlier Analysis
Using the IQR
■ Frequency Analysis
Check occurrence within a time range
■ Fast Dynamic Time Warping
Time series similarity search
■ Symbolic Aggregate Approximation
Similarity and pattern search
Transformations
■ Bottom/Top n-values
■ Moving average
■ Divide / Scale
■ Downsampling
Many more
Many more
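As an illustration of the outlier analysis "using the IQR", the following is a generic sketch of the textbook interquartile-range rule. It is not the Chronix implementation: the class name `IqrOutliers`, the quartile interpolation method, and the 1.5 multiplier on both sides are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IqrOutliers {
    /** Percentile by linear interpolation between closest ranks; p is in [0, 1]. */
    static double percentile(double[] sorted, double p) {
        double rank = p * (sorted.length - 1);
        int lo = (int) Math.floor(rank);
        int hi = (int) Math.ceil(rank);
        return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
    }

    /** Returns the values that lie outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]. */
    static List<Double> outliers(double[] values) {
        List<Double> result = new ArrayList<>();
        if (values.length == 0) {
            return result;
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double q1 = percentile(sorted, 0.25);
        double q3 = percentile(sorted, 0.75);
        double iqr = q3 - q1;
        double lower = q1 - 1.5 * iqr;
        double upper = q3 + 1.5 * iqr;
        // Report outliers in their original (chronological) order.
        for (double v : values) {
            if (v < lower || v > upper) {
                result.add(v);
            }
        }
        return result;
    }
}
```

For the values 10, 12, 11, 13, 12, 11, 100 this gives Q1 = 11 and Q3 = 12.5, so the fences are 8.75 and 14.75 and only 100 is reported.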
Only scalar values? One size fits all? No! What about logs, traces, and others? No problem – Just do it yourself!
■ Chronix Time Series
  ■ Time series framework that is used by Chronix.
  ■ Time series types:
    ■ Numeric: doubles (the default time series type)
    ■ More to come.
public interface TimeSeriesConverter<T> {
    /**
     * Shall create an object of type T from the given binary time series.
     */
    T from(BinaryTimeSeries binaryTimeSeriesChunk, long queryStart, long queryEnd);

    /**
     * Shall do the conversion of the custom time series T into the binary time series that is stored.
     */
    BinaryTimeSeries to(T timeSeriesChunk);
}
That's the easiest way to play with Chronix. A single instance of Chronix on a single node.
[Figure: on your computer, Java 8 (JRE) runs Solr 6.2.1 (Lucene) on port 8983 with Chronix 0.4 as Solr plugins: Chronix-Query-Handler, Chronix-Ingestion-Handler, Chronix-Compaction-Handler, and Chronix-Retention. Also shown: HTTP; Chronix Client (Go, Java); OpenTSDB, Prometheus, KairosDB, InfluxDB, and Graphite.]
Code-Slide: How to set up Chronix, ask for time series data, and call some server-side aggregations in Java.
■ Create a connection to Solr and set up Chronix
■ Define a range query and stream its results
■ Call some aggregations
// Create a connection to Solr and set up Chronix.
// groupBy/reduce: group chunks on a combination of attributes and reduce them to a time series.
solr = new HttpSolrClient("http://localhost:8983/solr/chronix/")
chronix = new ChronixClient(new MetricTimeSeriesConverter<>(),
        new ChronixSolrStorage(200, groupBy, reduce))

// Get all time series whose metric contains Load.
query = new SolrQuery("metric:*Load*")
chronix.stream(solr, query)

// Call some server-side aggregations.
query.addFilterQuery("function=max,min,count,sdiff")
stream = chronix.stream(solr, query)
Signed Difference (sdiff): First=20, Last=-100 gives -80
Compared to other time series databases, Chronix' results for our use case are outstanding.
■ We have evaluated Chronix against:
  ■ InfluxDB, OpenTSDB, and KairosDB
  ■ All databases are configured as single nodes.
■ Storage demand for 108 GB of raw CSV time series data:
  ■ Chronix (8.7 GB) saves 20% – 84% of the space required by the other time series databases.
■ Query times on imported data:
  ■ 73% – 92% faster on data retrieval.
  ■ 80% – 97% faster on a mix of analyses.
■ Memory footprint (after start, max during import, max during query mix):
  ■ Chronix takes 1.6 times less memory than the best alternative.
The hard facts. For more details, I suggest you read our research paper about Chronix.
Florian Lautenschlager, Michael Philippsen, Andreas Kumlehn, Josef Adersberger
Chronix: Long Term Storage and Retrieval Technology for Anomaly Detection in
Operational Data
FAST 2017 (submitted)
Let's dig into the Chronix Ingester's internals.
Image Credit: http://www.taringa.net/posts/ciencia-educacion/12656540/La-Filosofia-del-Dr-House-2.html
Big Picture. It’s a simple and scalable architecture.
Prometheus (standard Prometheus installation)
• Collects metrics from various services.
• Writes them to its default storage.
• Writes them to the Chronix Ingester using the standard remote write interface.
Chronix Ingester
• Collects samples in batches and writes them later to Chronix with an ideal batch size.
• Writes checkpoints to disk to avoid loss of data.
• Scales easily.
Chronix Server
• Lossless long-term storage.
• Data distribution (Apache Solr).
• Rich set of analysis functions for data analytics beyond real-time monitoring.
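Everything runs on a single machine. Small. Simple. Beautiful.
[Figure: Prometheus, the Chronix Ingester (in-memory), and the Chronix Server on a single host. Prometheus sends samples (S) to the Ingester; the Ingester sends batches (B) to the Chronix Server.]
S Sample: {t,v}
B Batch: [{t,v},{t,v},{t,v}]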
Once per Prometheus on a single host.
[Figure: two Prometheus servers on a single host, each with its own in-memory Chronix Ingester, and one Chronix Server.]
Chronix Ingester Singleton ;-)
[Figure: two Prometheus servers on a single host share one in-memory Chronix Ingester, which sends batches (B) to the Chronix Server.]
Chronix Ingester Cloud behind a proxy to serve multiple Prometheus servers.
[Figure: several Prometheus servers on separate hosts write through an NGINX proxy to in-memory Chronix Ingesters, which write to the Chronix Server.]
Cloud Mode: Multiple Prometheus Servers, One Chronix Ingester per Host, A Chronix Server Cloud
[Figure: several Prometheus servers on separate hosts, in-memory Chronix Ingesters behind NGINX, and a Chronix Server Cloud with a master node.]
Architectural Key Factor: The Chronix Ingester
■ Small Go program
  ■ Binary size: 8.5 MB
  ■ Lines of code: ~720 LoC
  ■ Scales easily: copy, execute
■ Handles writes from Prometheus
  ■ Just a small configuration:
    remote_write:
      url: http://<host>:<port>/ingest
■ Batches samples in memory
  ■ Prometheus sends single samples.
  ■ Chronix needs large chunks (n single samples) to work well.
  ■ Max batch age: 5M, 12H, ..
■ Crash and restart resilience
  ■ In-memory is dangerous. The Ingester holds some amount of transient state.
  ■ Regularly writes checkpoints of the entire in-memory state to disk.
  ■ Latest checkpoint is loaded on restart.
Chronix loves Chunks. Hence the Ingester batches samples.
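The batching idea can be sketched as follows. The real Ingester is a Go program; this Java sketch only illustrates the "max batch age" rule, and the class `SampleBatcher`, its method names, and the explicit `nowMillis` parameter are assumptions made for the example. Checkpointing to disk is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory batcher: collects single samples and releases them as batches. */
final class SampleBatcher {
    /** One in-memory batch of {t, v} samples for a single series. */
    static final class Batch {
        final String series;
        final long createdAtMillis;
        final List<Long> timestamps = new ArrayList<>();
        final List<Double> values = new ArrayList<>();

        Batch(String series, long createdAtMillis) {
            this.series = series;
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final long maxBatchAgeMillis;
    private final Map<String, Batch> open = new HashMap<>();

    SampleBatcher(long maxBatchAgeMillis) {
        this.maxBatchAgeMillis = maxBatchAgeMillis;
    }

    /** Appends one sample to the open batch of its series, creating the batch if needed. */
    void add(String series, long timestamp, double value, long nowMillis) {
        Batch batch = open.get(series);
        if (batch == null) {
            batch = new Batch(series, nowMillis);
            open.put(series, batch);
        }
        batch.timestamps.add(timestamp);
        batch.values.add(value);
    }

    /** Removes and returns every batch that has reached the maximum batch age. */
    List<Batch> flushExpired(long nowMillis) {
        List<Batch> ready = new ArrayList<>();
        Iterator<Batch> it = open.values().iterator();
        while (it.hasNext()) {
            Batch batch = it.next();
            if (nowMillis - batch.createdAtMillis >= maxBatchAgeMillis) {
                ready.add(batch);
                it.remove();
            }
        }
        return ready;
    }
}
```

A caller would invoke `flushExpired` periodically and write each returned batch to the Chronix Server as one chunk. Passing the clock in as a parameter keeps the sketch deterministic and easy to test.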
The data models for Prometheus and Chronix are similar.
■ Prometheus
  ■ Uses so-called labels (key-value pairs) to store dimensional values.
  ■ Labels are added dynamically.
  ■ Stores samples (pairs of timestamp and scalar value).
■ Chronix
  ■ Uses attributes (key-value pairs) to store dimensional values.
  ■ Schema, schemaless, dynamic fields, etc.
  ■ Stores samples of a timestamp and any value type: scalar, trace, string, etc.
An example Chronix schema to define the available fields.
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="Chronix" version="1.5">
    <types>
        <fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
        <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
        <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/>
        <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
        <fieldType name="binary" class="solr.BinaryField"/>
    </types>
    <fields>
        <!-- The required fields -->
        <field name="id" type="string" indexed="true" stored="true" required="true"/>
        <field name="_version_" type="long" indexed="true" stored="true"/>
        <field name="start" type="long" indexed="true" stored="true" required="true"/>
        <field name="end" type="long" indexed="true" stored="true" required="true"/>
        <field name="data" type="binary" indexed="true" stored="true" required="false"/>
        <field name="metric" type="string" indexed="true" stored="true" required="true"/>
        <!-- Dynamic field for tags -->
        <dynamicField name="*_s" type="string" indexed="true" stored="true"/>
    </fields>
    <uniqueKey>id</uniqueKey>
    <solrQueryParser defaultOperator="OR"/>
</schema>
Callouts: the <types> block is the definition of types; the <fields> block lists the available fields.
Prometheus labels are strings. Chronix Ingester creates them in
Chronix Server dynamically using the dynamicField *_s.
Prometheus_Label -> Chronix_Label
host -> host_s
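The label mapping above amounts to appending the suffix `_s` to each label name so that it matches the dynamicField pattern `*_s`. A minimal sketch, assuming a hypothetical helper `LabelMapper` (the actual Ingester is written in Go, and how it treats the metric name is not covered here):

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LabelMapper {
    /** Appends "_s" to every label name so it matches the dynamicField pattern *_s. */
    static Map<String, String> toChronixAttributes(Map<String, String> prometheusLabels) {
        // LinkedHashMap keeps the iteration order of the incoming labels.
        Map<String, String> attributes = new LinkedHashMap<>();
        for (Map.Entry<String, String> label : prometheusLabels.entrySet()) {
            attributes.put(label.getKey() + "_s", label.getValue());
        }
        return attributes;
    }
}
```

With this mapping a Prometheus label `host="prodI5"` becomes the Chronix attribute `host_s` with the value `prodI5`; no schema change is needed for new labels.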
Showcase: Prometheus, Chronix Ingester, Chronix and Grafana
[Figure: Prometheus sends samples (S) to the in-memory Chronix Ingester, which sends batches (B) to the Chronix Server; Grafana is part of the setup.]
A few words about performance in our showcase.
Disk usage: 12 hours of data (3,610,844 samples)
Prometheus: ~ 26 MB
Chronix: ~ 5 MB
[Figure: CPU usage, 4 cores available (= 400 % max).]
[Figure: memory consumption (max. 8 G).]
Prometheus
Prometheus Configuration
Chronix Default Web-UI
Using the data source plugins for Chronix and Prometheus.
Ingester Health: Everything Green!
Short-term data in Prometheus. Long-term data in Chronix.
Everything is open source and free to everyone. The code is the truth.
Chronix Website: www.chronix.io
Chronix Github: https://github.com/ChronixDB
- Ingester: https://github.com/ChronixDB/chronix.ingester
Questions?
- Twitter: @ChronixDB, @flolaut, @phxql
- Slack: https://qaware.slack.com/messages/chronix/
Now it’s your turn.