HOW MANY NODES? PROPERLY SIZING YOUR COUCHBASE CLUSTER
Perry Krug, Sr. Solutions Architect
©2015 Couchbase Inc. 2
Read this Article:
http://blog.couchbase.com/how-many-nodes-part-1-introduction-sizing-couchbase-server-20-cluster
Size Couchbase Server
Sizing == performance:
• Serve reads out of RAM
• Enough IO for writes and disk operations
• Mitigate inevitable failures
[Diagram: Reading Data. Application Server: "Give me document A". Couchbase Server: "Here is document A".]
[Diagram: Writing Data. Application Server: "Please store document A". Couchbase Server: "OK, I stored document A".]
Scaling out permits matching of aggregate flow rates so queues do not grow
[Diagram: three Application Servers, each connected over the network to one of three Couchbase Server nodes.]
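A minimal sketch of the flow-rate idea above: if each node can sustain a certain operation rate, the cluster needs enough nodes that the aggregate rate covers the incoming rate, otherwise the queue keeps growing. The function names and the example rates are illustrative assumptions, not measured Couchbase figures.

```python
import math


def nodes_to_match_rate(incoming_ops_per_sec, per_node_ops_per_sec):
    """Smallest node count whose aggregate rate covers the incoming rate."""
    if per_node_ops_per_sec <= 0:
        raise ValueError("per-node rate must be positive")
    return max(1, math.ceil(incoming_ops_per_sec / per_node_ops_per_sec))


def queue_growth_per_sec(incoming_ops_per_sec, per_node_ops_per_sec, nodes):
    """Operations added to the queue each second (0 when the cluster keeps up)."""
    return max(0, incoming_ops_per_sec - per_node_ops_per_sec * nodes)


# Hypothetical rates: 50,000 ops/s arriving, 20,000 ops/s sustained per node.
print(nodes_to_match_rate(50_000, 20_000))      # 3
print(queue_growth_per_sec(50_000, 20_000, 2))  # 10000
print(queue_growth_per_sec(50_000, 20_000, 3))  # 0
```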
5 Factors of Sizing
How many nodes?
5 Key Factors determine number of nodes needed:
1) RAM
2) Disk
3) CPU
4) Network
5) Data Distribution/Safety
(per-bucket, multiple buckets aggregate)
[Diagram: application user, web application server, Couchbase Servers.]
RAM sizing
1) Total RAM
Managed document cache:
• Working set
• Metadata
• Active + Replicas
Index caching (I/O buffer)
Keep working set in RAM for best read performance
[Diagram: Reading Data. Application Server: "Give me document A". Server: "Here is document A".]
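A rough sketch of how the RAM components listed above (working set, metadata, active plus replica copies) might be combined into a node count. The per-item metadata size, average key size, and headroom below are placeholder assumptions, not values from this deck; take the real figures for your version from the Couchbase documentation. Applying the working-set ratio to replica copies as well is also an assumption of this sketch.

```python
import math


def ram_sizing(item_count, avg_value_bytes, working_set_ratio, replicas,
               per_node_quota_bytes,
               metadata_bytes_per_item=56,  # assumption: check docs for your version
               avg_key_bytes=40,            # assumption: depends on your keys
               headroom=0.25):              # assumption: spare capacity
    """Return (required_bytes, node_count) for the managed document cache."""
    copies = 1 + replicas
    metadata = item_count * (metadata_bytes_per_item + avg_key_bytes) * copies
    working_set = item_count * avg_value_bytes * working_set_ratio * copies
    required = (metadata + working_set) * (1 + headroom)
    nodes = max(1, math.ceil(required / per_node_quota_bytes))
    return required, nodes


# Hypothetical workload: 10M items of 1,000 bytes, 25% working set,
# one replica, 4 GB of bucket quota per node.
required, nodes = ram_sizing(10_000_000, 1_000, 0.25, 1, 4_000_000_000)
print(f"{required / 1e9:.2f} GB across {nodes} nodes")  # 8.65 GB across 3 nodes
```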
Working set depends on your application
• Late stage social game: many users no longer active; few logged in at any given time. working/total set = .01
• Ad network: any cookie can show up at any time. working/total set = 1
• Business application: users logged in during the day; day moves around the globe. working/total set = .33
RAM Sizing - View/Index cache (disk I/O)
File system cache availability for the index has a big impact on performance:
• Test runs based on 10 million items with a 16GB bucket quota and 4GB or 8GB of system RAM available for indexes
• Results show that doubling system cache availability halves query latency and increases throughput by 50%
Leave RAM free with quotas
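One way to read "leave RAM free with quotas": decide how much RAM to keep for the file system cache, then give the bucket quota what remains. A tiny sketch; the operating-system reserve is an assumed placeholder, and the 32GB node is a made-up example.

```python
def bucket_quota_gb(node_ram_gb, fs_cache_gb, os_reserve_gb=2.0):
    """RAM left for the bucket quota after reserving file system cache and OS."""
    quota = node_ram_gb - fs_cache_gb - os_reserve_gb
    if quota <= 0:
        raise ValueError("no RAM left for the bucket quota")
    return quota


# Hypothetical 32GB node that keeps 8GB free for index I/O caching.
print(bucket_quota_gb(32, 8))  # 22.0
```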
Disk Sizing: Space and I/O
2) Disk
• Sustained write rate
• Rebalance capacity
• Backups
• XDCR
• Views/Indexes
• Compaction
• Total dataset: (active + replicas + indexes)
• Append-only
[Diagram: Writing Data, annotated "I/O" and "Space". Application Server: "Please store document A". Server: "OK, I stored document A".]
Disk Sizing: Space and I/O
Disk writes are buffered:
• Bursts of data expand the disk write queue
• Sustained writes need corresponding throughput
Disk throughput is affected by disk speed:
• SSD > 10K RPM > EBS
• SSDs give a huge boost to write throughput and startup/warmup times
• RAID can provide redundancy and increase throughput
Throughput = read/write + compaction + indexing + XDCR
2.1 introduces multiple disk threads:
• Default is 3 (1 writer / 2 readers), max is 8 combined
Best to configure different paths for data and indexes
Plan on about 3x space (append-only, compaction, backups, etc.)
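The space rule of thumb can be sketched as a short calculation: total dataset is active data plus replica copies plus indexes, multiplied by the roughly 3x factor from the slide. The function name and the example sizes are illustrative assumptions.

```python
def disk_space_gb(active_gb, replicas, index_gb, overhead_factor=3.0):
    """Cluster-wide disk space: (active + replicas + indexes) x overhead."""
    total_dataset = active_gb * (1 + replicas) + index_gb
    return total_dataset * overhead_factor


# Hypothetical dataset: 100GB active, one replica, 20GB of indexes.
cluster_gb = disk_space_gb(100, 1, 20)
print(cluster_gb)      # 660.0
print(cluster_gb / 4)  # 165.0 per node on a 4-node cluster
```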
CPU sizing
3) CPU
• Disk writing
• Views/compaction/XDCR
• RAM r/w performance not impacted
Min. production requirement: 4 cores + 1 per bucket + 1 core per Design Doc + 1 core per XDCR stream
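The minimum-core rule is simple arithmetic. A sketch, with the base core count as a parameter because the Hardware Minimums slide later in this deck starts from 8 cores rather than 4; the bucket, design document, and stream counts are made-up examples.

```python
def min_cores(buckets, design_docs, xdcr_streams, base_cores=4):
    """Base cores + 1 per bucket + 1 per design doc + 1 per XDCR stream."""
    return base_cores + buckets + design_docs + xdcr_streams


# Hypothetical node: 2 buckets, 3 design documents, 1 XDCR stream.
print(min_cores(2, 3, 1))                # 10
print(min_cores(2, 3, 1, base_cores=8))  # 14
```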
Network sizing
4) Network
• Client traffic
• Replication (writes)
• Rebalancing
• XDCR
[Diagram: three Application Servers send reads and writes over the network to three Couchbase Server nodes; replication (multiply writes) and rebalancing traffic flows between the nodes.]
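A back-of-the-envelope sketch of steady-state bandwidth: client traffic is reads plus writes, and each write is sent again once per replica. It deliberately ignores protocol overhead, rebalancing, and XDCR, and all rates and sizes are assumed example values.

```python
def network_bytes_per_sec(reads_per_sec, writes_per_sec, avg_doc_bytes, replicas):
    """Client traffic plus intra-cluster replication traffic, in bytes/second."""
    client = (reads_per_sec + writes_per_sec) * avg_doc_bytes
    replication = writes_per_sec * replicas * avg_doc_bytes
    return client + replication


# Hypothetical load: 20,000 reads/s, 5,000 writes/s, 2KB documents, one replica.
total = network_bytes_per_sec(20_000, 5_000, 2_048, 1)
print(f"{total * 8 / 1e6:.2f} Mbit/s")  # 491.52 Mbit/s
```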
Network Considerations
• Low latency, high throughput (LAN) within cluster
• Eliminate router hops:
  • Within cluster nodes
  • Between clients and cluster
• Check who else is sharing the network
• Increase bandwidth by:
  • Adding more nodes (will scale linearly)
  • Upgrading routers/switches/NICs/etc.
Data Distribution
5) Data Distribution / Safety (assuming one replica):
• 1 node = Single point of failure
• 2 nodes = + Replication
• 3+ nodes = Best for production
  • Autofailover
  • Upgrade-ability
  • Further scale-ability
Note: Many applications will need more than 3 nodes
Servers fail, be prepared. The more nodes, the less impact a failure will have.
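To put numbers on "the more nodes, the less impact": assuming data and load are spread evenly (an assumption of this sketch), losing one of n nodes affects 1/n of the active data, and each surviving node picks up an extra 1/(n-1) of its previous load.

```python
import math


def failure_impact(nodes):
    """Return (share of active data on the failed node, extra load per survivor)."""
    if nodes < 1:
        raise ValueError("need at least one node")
    share_affected = 1 / nodes
    extra_load = 1 / (nodes - 1) if nodes > 1 else math.inf
    return share_affected, extra_load


for n in (2, 3, 5, 10):
    share, extra = failure_impact(n)
    print(f"{n} nodes: {share:.0%} of data affected, +{extra:.0%} load per survivor")
```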
How many nodes recap
5 Key Factors determine number of nodes needed:
1) RAM
2) Disk
3) CPU
4) Network
5) Data Distribution/Safety
(per-bucket, multiple buckets aggregate)
[Diagram: application user, web application server, Couchbase Servers.]
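One plausible way to combine the five factors, offered as an assumption rather than the deck's stated formula: compute a node count per factor and take the largest, with the data distribution/safety factor acting as a floor. The per-factor counts below are invented for illustration.

```python
def nodes_needed(nodes_by_factor, safety_minimum=3):
    """Return (node_count, limiting_factor) as the maximum over all factors."""
    candidates = dict(nodes_by_factor)
    candidates["Data Distribution/Safety"] = safety_minimum
    limiting = max(candidates, key=candidates.get)
    return candidates[limiting], limiting


# Hypothetical per-factor results from separate RAM, disk, CPU, and network sizing.
print(nodes_needed({"RAM": 5, "Disk": 3, "CPU": 2, "Network": 2}))  # (5, 'RAM')
```

Knowing which factor is the limiting one is often more useful than the count itself, since it tells you what to change (more RAM per node, faster disks) before adding nodes.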
Deployment Considerations
Hardware Minimums
• RAM: At least ~4GB (highly dependent on data set)
• Disk: Fastest “local” storage available
  • SSD is better
  • RAID 0 or 10, not 5
• CPU (minimums): 8 cores
  • + 1 per bucket
  • + 1 per design document
  • + 1 per XDCR stream
Hardware requirements/recommendations are the intersection of what’s needed versus what’s available.
Hardware Considerations
• Designed for commodity hardware
• Scale out, not up: more, smaller nodes are better than fewer, larger ones (can scale up later)
• Tested and deployed in EC2
• Physical hardware offers best performance and efficiency
• Certain considerations with using VMs:
  • RAM use inefficient / Disk IO usually not as fast
  • Local storage better than shared SAN
  • 1 Couchbase VM per physical host
  • You will generally need more nodes
  • Don’t overcommit
Couchbase in AWS
• R3 or C3 instances best value for performance
  • Higher RAM-to-CPU ratios
  • Come with SSDs
• Disk choice:
  • SSDs are best
  • Ephemeral is okay
  • Single EBS not great, use LVM/RAID
  • Views/indexes on ephemeral, main data on EBS, or both on SSD
• Backups:
  • Use cbbackup locally on each node and migrate to EBS/S3
  • Can use EBS snapshots
Couchbase in AWS
• Deploy across AZs with rack/zone awareness
• Use an EIP/public hostname instead of private IP:
  • Easier connectivity from outside AWS
  • Easier restoration/better availability
  • Couchbase XDCR across regions must use hostname
In AWS as with any cloud/virtual deployment, you will likely need more nodes than you would with a physical infrastructure
Effects of…
Views/Indexes
Effect on scale/sizing:
• Increase the CPU and disk IO requirements
• More complex views require more CPU
• More view output requires more disk IO
• More RAM should be left out of the quota for better IO caching
Indication:
• Indexes significantly behind data writes (or growing delays)
What to do:
• Make sure you follow best practices in view writing
• Add more nodes to distribute processing “work”
• Look into SSDs
XDCR
Effect on scale/sizing:
• XDCR is CPU intensive
• Disk IO will double
• Memory needs to be sized accordingly (bi-directional may mean more data)
Indication:
• A rising XDCR queue on source
What to do:
• More nodes on source and destination will drain queue faster (scales linearly)
• Tune replication streams according to CPU availability
As your workload grows…
Effects on scale/sizing:
• More reads:
  • Individual documents will not be impacted (static working set)
  • Views may require faster disks, more disk IO caching
• More writes will increase disk IO needs
Indications:
• Cache miss ratio rising
• Growing disk write queue / XDCR queue
• Compaction not keeping up
What to do:
• Revise sizing calculations and add more nodes if needed
Most applications don’t need to scale the number of nodes based upon normal workload variation.
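The indications above are all trends, so a monitoring check needs to compare recent samples with earlier ones rather than look at a single value. A minimal sketch: the metric names are hypothetical labels (not actual Couchbase stat names), the samples are invented, and the 10% threshold is an arbitrary assumption chosen to ignore normal variation.

```python
from statistics import mean


def is_rising(samples, threshold=0.10):
    """True if the later half of the samples averages > threshold above the earlier half."""
    if len(samples) < 4:
        return False
    half = len(samples) // 2
    early, late = mean(samples[:half]), mean(samples[half:])
    if early == 0:
        return late > 0
    return (late - early) / early > threshold


def growth_indications(metrics):
    """Names of the metrics whose samples are trending upward."""
    return [name for name, samples in metrics.items() if is_rising(samples)]


metrics = {
    "cache_miss_ratio": [0.01, 0.01, 0.02, 0.03, 0.05, 0.08],
    "disk_write_queue": [1000, 1200, 900, 1100, 1000, 1050],
    "xdcr_queue": [0, 0, 0, 0, 0, 0],
}
print(growth_indications(metrics))  # ['cache_miss_ratio']
```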
As your dataset grows…
Effects on scale/sizing:
• Your RAM needs will grow:
  • Metadata needs increase with item count
  • Is your working set increasing?
• Your disk space will likely grow (duh?)
Indications:
• Dropping resident ratio
• Rising ejections/cache miss ratio
What to do:
• Revise sizing calculations, add more nodes
• Remove un-needed data
This is the most common need for scaling and will most likely result in needing more nodes
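A sketch of why a growing item count lowers the resident ratio on a fixed cluster: metadata takes a larger share of the quota, and the remaining RAM covers a smaller fraction of the values. The 56 bytes of metadata per item is an assumed placeholder (check the documentation for your version), as are the quota and document size.

```python
def resident_ratio(cluster_quota_bytes, item_count, avg_value_bytes,
                   metadata_bytes_per_item=56,  # assumption: version dependent
                   copies=2):                   # active + one replica
    """Fraction of document values that fit in RAM once metadata is accounted for."""
    values = item_count * avg_value_bytes * copies
    if values == 0:
        return 1.0
    metadata = item_count * metadata_bytes_per_item * copies
    ram_for_values = max(0, cluster_quota_bytes - metadata)
    return min(1.0, ram_for_values / values)


# Hypothetical cluster: 30GB total quota, 1,000-byte values.
for items in (10_000_000, 20_000_000, 40_000_000):
    print(items, round(resident_ratio(30_000_000_000, items, 1_000), 3))
```

With these example numbers the ratio falls from fully resident at 10 million items to roughly a third at 40 million, which is the "dropping resident ratio" indication listed above.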
Rebalancing
Yes, there is resource utilization during a rebalance, but a “properly” sized cluster should not see any effect on performance during a rebalance:
• Distribution of data and work across all nodes
• Managed caching layer separates RAM-based performance from IO utilization
• Rebalance automatically manages working set in RAM
• Rebalance automatically throttles itself if needed
• Can be stopped midway without endangering data or progress
Proper sizing includes not maxing out all resources: leave some headroom in preparation
Couchbase 4.0
Sizing Couchbase Server 4.0
Multi-Dimensional Scalability (MDS) – Optionally scale each service independently:
• Data
• Index
• Query
5 factors still apply:
• RAM
• Disk
• CPU
• Network
• Data Safety/Distribution
Sizing Couchbase Server 4.0 - Data
Data Service in 4.0 same as previous Couchbase Server:
• Enough RAM to cache reads
• Enough Disk to eventually persist writes
• CPU primarily for Views and XDCR
• At least 3 nodes – Replication at the bucket level
Minimum requirements: 4GB RAM, 8 Cores CPU
Sizing Couchbase Server 4.0 - Index
Index service new to 4.0 (a.k.a. GSI or “Secondary Indexes”):
• Primarily RAM and Disk IO bound
• ForestDB persistence engine
• At least 2 nodes for HA, each index replicated individually
Minimum Requirements: 8GB RAM, 8 core CPU, “fast disk”
Note: 4.0 is still in beta, final sizing numbers are being formulated
Sizing Couchbase Server 4.0 - Query
Query Service new to 4.0 (a.k.a. N1QL):
• Primarily CPU bound
• Optimized for multi-core systems
• Very low RAM and disk requirements
• At least 2 nodes for HA – Queries automatically load balanced
Minimum Requirements: 4GB RAM, 16+ Core CPU
Note: 4.0 is still in beta, final sizing numbers are being formulated
Sizing Couchbase Server 4.0 - MDS
Multi-Dimensional Scalability (MDS):
• Option 1: All 3 services enabled on all nodes – Size for aggregate requirements (Data+Index+Query)
• Option 2: Separated services – Size nodes independently for different workloads, i.e.:
  • Data Service: More nodes with more RAM, less disk, less CPU
  • Index Service: Fewer nodes with more RAM, more disk, less CPU
  • Query Service: Fewer nodes with less RAM, less disk, more CPU
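To compare the two options, the per-service minimums quoted on the preceding slides (Data: 4GB RAM and 8 cores; Index: 8GB and 8 cores; Query: 4GB and 16+ cores) can be put in a small table. Treating "aggregate requirements" as a plain sum is an assumption of this sketch, and the deck itself notes that 4.0 sizing numbers were still being formulated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Minimum:
    ram_gb: int
    cores: int


# Minimums as quoted in this deck for the 4.0 beta.
MINIMUMS = {
    "data": Minimum(ram_gb=4, cores=8),
    "index": Minimum(ram_gb=8, cores=8),
    "query": Minimum(ram_gb=4, cores=16),
}


def colocated_minimum(services):
    """Option 1 sketch: sum the minimums of every service sharing a node."""
    chosen = [MINIMUMS[name] for name in services]
    return Minimum(sum(m.ram_gb for m in chosen), sum(m.cores for m in chosen))


print(colocated_minimum(["data", "index", "query"]))  # Minimum(ram_gb=16, cores=32)
print(MINIMUMS["query"])                              # Option 2: a query-only node
```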
Sizing Couchbase Server 4.0 - MDS
Independent Load Distribution
Modular Architecture to Construct the Database for Your Need:
• Pick HW Capacity – scale up and/or scale out
• Pick Services Layout – overlap and/or isolate services
• Pick Data/Index Partitioning
[Diagram: a Couchbase Cluster of nodes labeled node1 through node8, running the Index Service, Query Service, and Data Service.]
Sizing is tricky business…
Work with the Couchbase Team
Validate your “on-paper” numbers with testing
Constantly monitor production
Dive in…
Gather your workload and dataset requirements:
• Item counts and sizes, read/write/delete ratios
Review our documentation and formulas
Test, Deploy, Monitor…rinse and repeat
Want more?
Lots of details and best practices in our documentation:
http://www.couchbase.com/docs/
And my sizing blog: http://blog.couchbase.com/how-many-nodes-part-1-introduction-sizing-couchbase-server-20-cluster
Get Started with Couchbase Server 4.0: www.couchbase.com/beta
Get Trained on Couchbase: training.couchbase.com
Thank you perry@couchbase.com | @couchbase