volha-banadyseva
#BigDataBY
ABOUT
Simple developer
In DSP team
Real time auction engine for Ads
BUSINESS CASE
A lot of users
A lot of data
Real time decision making
Based on data we have
CONTENT
We have been living with Cassandra for more than 26,000 hours and counting
I picked up random moments from our interesting marriage
Sometimes sleepless nights
A lot of intense days
Some experience gained, hopefully
Come by and discuss things
I was born
Went to school
Finished university
Something special has happened
LET’S STORE COOKIE PROFILES FIRST
Search for fast, scalable storage
Capable of storing large amounts of data
And handling a write-intensive workload
Please, not SQL Server again?
OH, NO-SQL YOU SAY
Cassandra!
Let’s do some benchmarks and compare!… with what?
Ok… let’s bring it home
HAPPY MOMENT
Adform was a mainly Microsoft-based company
First open source project in the company
First “Linux first” piece of software
First COOL tech
Started to adopt other open source / non-Microsoft solutions afterwards
MICROSOFT BASED…
No Linux experience etc.
Let’s run on Windows, as it’s written in Java
All services were written in .NET
…and no libraries for .NET at all
Virtualization…
HAPPY MOMENT
It’s very fun to invent things
In house .NET client with failover, load balancing, dynamic node discovery
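A rough idea of what failover plus load balancing means on the client side, sketched in Python rather than .NET. This is not the in-house client: the class name, the `send` callable and the `add_node` hook standing in for node discovery are all hypothetical, and it assumes that a dead node shows up as a `ConnectionError`.

```python
class RoundRobinClient:
    """Hypothetical sketch: rotate requests over nodes, skip nodes that fail."""

    def __init__(self, nodes, send):
        self.nodes = list(nodes)
        # send(node, request) -> response; assumed to raise ConnectionError
        # when the node cannot be reached.
        self.send = send
        self._next = 0  # index of the node that gets the next request

    def add_node(self, node):
        """Register a newly discovered node (stand-in for dynamic discovery)."""
        if node not in self.nodes:
            self.nodes.append(node)

    def execute(self, request):
        """Try each node once, starting from the round-robin position."""
        count = len(self.nodes)
        last_error = None
        for attempt in range(count):
            index = (self._next + attempt) % count
            try:
                response = self.send(self.nodes[index], request)
            except ConnectionError as exc:
                last_error = exc  # failover: move on to the next node
                continue
            self._next = (index + 1) % count  # load balancing: rotate
            return response
        raise ConnectionError("all nodes failed") from last_error
```

With nodes `["a", "b", "c"]` and node `a` down, the first request lands on `b`, the second on `c`, and the third skips `a` and lands on `b` again.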
FINALLY UP AND RUNNING
Started in PROD about 3 years ago with version 0.7 beta 2
Everything ran smoothly with a few gigs of data
Hardware had a lot of spare capacity
We were proud NoSQL experts
DSP KICKS IN
Decision to start DSP project
Data in Cassandra for decision making
A lot of frontend servers
More reads and
… a 20 ms per request SLA
20 MS
20 MS – GAME CHANGER
OK when you can allow 100 MS for part of the requests
SAD MOMENT
Move from Windows to Linux
Move from virtual to bare metal
“Smart” in house Cassandra client
Data access tuning, e.g. dynamic snitch
Optimizing memory usage
Code base optimized on data access
Data in RAM
GC tuning
Compaction tuning
Consultancy
SAD MOMENT
Fat nodes
Stability
RESULTED IN
A lot of DEV time spent
On tracking changes in new Cassandra releases
Tweaking configuration
Monitoring compactions, read stages etc.
Thinking about further optimizations
…but…
Experience was extracted
AND ON TOP
Scale 4x!!!
Maybe it’s time to search for alternatives
Fast read access
CONSISTENT read access
Fat nodes
Little time spent on maintenance
.NET “compatible”
AEROSPIKE
Aerospike?
Not open source at that time
Quite a lot of money
Promises look too good
Everyone skeptical
But let’s give it a shot
SHOT
POC cluster
2 nodes, SSD based, consumer grade disks
Shadow workload
No tweaking
“FAT” nodes
SHOT
Forgot all skepticism in one week
POC became PROD
Disk load >90%
5 months, no problems
99% < 10ms consistent performance
Simpler, easier to maintain code base
Focus on business
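A claim like “99% < 10ms” can be checked against measured latencies with a percentile calculation. The sketch below uses the nearest-rank method, which is an assumption; the slides do not say how the figure was measured. The function names and the default thresholds are illustrative only.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("no samples")
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def meets_sla(latencies_ms, pct=99, limit_ms=10):
    """True when the pct-th percentile latency is strictly below limit_ms."""
    return percentile(latencies_ms, pct) < limit_ms
```

For example, 99 requests at 1 ms plus a single 50 ms outlier still pass a “99% < 10ms” check, while two such outliers in 100 requests do not.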
JUST IN CASE
So we are running on a proprietary piece of software
But… the data model and access patterns are compatible with most NoSQL stores
Just in case
It’s already open source
CASSANDRA
Cassandra still in toolbox
To maintain knowledge base
And it’s great
For less time-critical workloads
Write workloads
Other cases
THE END
When you feel that someone has become too big a part of your life
In an unpleasant way
Search for alternatives
Love ends in 3 years
Data model is important
Read documentation (limitations etc.)
I don’t know – maybe there is more
NUMBERS
2 storage servers
1 TB of data
120 000 reads
8 000 writes
42 client servers
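To put these figures in proportion, here is a bit of back-of-the-envelope arithmetic. The slide does not state the time unit for reads and writes, and the even spread across servers is an assumption made only for illustration.

```python
# Figures from the slide; the time unit of reads/writes is not stated there.
storage_servers = 2
client_servers = 42
reads = 120_000
writes = 8_000

# Assumption: load is spread evenly across servers.
reads_per_storage_server = reads / storage_servers
writes_per_storage_server = writes / storage_servers
reads_per_client_server = reads / client_servers
read_write_ratio = reads / writes

print(f"reads per storage server:  {reads_per_storage_server:.0f}")
print(f"writes per storage server: {writes_per_storage_server:.0f}")
print(f"reads per client server:   {reads_per_client_server:.0f}")
print(f"read/write ratio:          {read_write_ratio:.0f}:1")
```

Under that assumption each storage server handles 60,000 reads and 4,000 writes, each client server issues roughly 2,857 reads, and reads outnumber writes 15 to 1.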