SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance

DESCRIPTION

"Benchmarking Solr Performance" - Tim Potter, Lucidworks

Text of SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance

  • 1. Benchmarking Solr Performance
    Timothy Potter, June 18, 2014
    Search | Discover | Analyze
    Confidential and Proprietary, Copyright 2013

2. My SolrCloud Experience
  • At LucidWorks, mostly focused on hardening SolrCloud; Lucene/Solr committer
  • Operated a 36-node cluster in AWS for Dachis Group (1.5 years ago, 18 shards, ~900M docs)
  • Built a Fabric/boto framework for deploying and managing a cluster in EC2
  • Co-author of Solr In Action

3. Agenda
  • Indexing performance tests
  • Solr Scale Toolkit
  • Next steps

4. Cluster sizing
  • How many servers do I need to index X docs?
  • ... shards ... ? ... replicas ... ?
  • I need N queries per second over M docs, how many servers do I need?
  • It depends?!?

5. Methodology
  • Transparent, repeatable results
    • Ideally hoping for something owned by the community
  • Synthetic docs, ~1K each on disk, mix of field types
    • Data set created using code borrowed from PigMix
    • English text fields generated using a Zipfian distribution
  • Java 1.7u55, Amazon Linux, r3.2xlarge nodes
    • Enhanced networking enabled, placement group, same AZ
  • Stock Solr (cloud) 4.8.1
    • Using Shawn Heisey's GC tuning parameters
  • Use Elastic MapReduce to generate load
    • As many nodes as I need to drive Solr!

6. Indexing Results

  Cluster Size  # of Shards  # of Replicas  Reducers  Time (secs)  Docs / sec
  10            10           1              48        1762          73,780
  10            10           2              34        3727          34,881
  10            20           1              48        1282         101,404
  10            20           2              34        3207          40,536
  10            30           1              72        1070         121,495
  10            30           2              60        3159          41,152
  15            15           1              60        1106         117,541
  15            15           2              42        2465          52,738
  15            30           1              60         827         157,195
  15            30           2              42        2129          61,062

7. Direct Updates
  • Indexing client: CloudSolrServer (SolrJ)
  • The client watches /clusterstate.json in ZooKeeper
  • Shard assignment is computed on the client, and each batch goes directly to the shard leaders (Shard 1, Shard 2, Shard 3)

8.
Replication
  • CloudSolrServer (SolrJ) watches /clusterstate.json in ZooKeeper
  • Shards 1 to 3, each with a leader and a replica
  • The update blocks for a response from the replica(s)

9. Don't swamp your servers!

10. Lessons Learned
  • Know what throughput your client side is capable of generating
  • If in MapReduce, index from reducers with speculative execution disabled
  • Don't change Solr config without good reasons for doing so
  • Overshard (but not too much)
  • Near-linear scalability as I added nodes!

11. Query Performance Tests
  • All nodes in SolrCloud perform indexing and execute queries
  • Using the TermsComponent to build queries based on the terms in each field
  • Harder to accurately simulate user queries over synthetic data
    • Need a mix of faceting, paging, sorting, grouping, boolean clauses, range queries, boosting, filters (some cached, some not), etc.
    • Does the randomness in your test queries model (expected) user behavior?
  • Start with one server (1 shard) to determine baseline query performance
    • Look for inefficiencies in your schema and other config settings

12. Solr Scale Toolkit
  • Fabric / Python based toolset for deploying and managing SolrCloud clusters
  • SolrJ-based client application, useful for building tools that need access to cluster state information in ZooKeeper
  • Code to support benchmarks for Solr

13. Python-based Tools
  • boto: Python API for AWS (EC2, S3, etc.)
  • Fabric: Python-based tool for automating system admin tasks over SSH
  • pysolr: Python library for Solr (sending commits, queries, ...)
  • kazoo: Python client tools for ZooKeeper
  Supporting Cast:
  • JMeter: run tests, generate reports
  • collectd: system monitoring
  • Logstash4Solr: log aggregation
  • JConsole/VisualVM: monitor JVM during indexing / queries

14. Solr Scale Toolkit: Demo
  • Launch a meta node
    • Log aggregation / basic monitoring using SiLK
  • Launch ZooKeeper ensemble
    • 3 nodes to establish quorum
    • Set up cron job to clean up snapshots
  • Launch SolrCloud cluster
  • Create new collection and index some docs
  • Attach JConsole while indexing
  • Run a healthcheck on the collection
  • Check out the Banana dashboard
  • Backup / Restore
    • Requires patch for SOLR-5956
    • Use fab patch_jars to update jars and do a rolling restart

15. Provisioning machines
  fab new_ec2_instances:test1,n=3,instance_type=m3.xlarge
  • Custom built AMI?
  • Block device mapping: dedicated disk per Solr node
  • Launch and then poll status until they are live; verify SSH connectivity
  • Tag each instance with a cluster ID and username

16. ZooKeeper
  fab new_zk_ensemble:zk1,n=3
  • Two options:
    • provision 1 to N nodes when you launch the Solr cluster
    • use an existing named ensemble
  • The Fabric command simply creates the myid files and zoo.cfg file for the ensemble, and some cron scripts for managing snapshots
  • Basic health checking of ZooKeeper status: echo srvr | nc localhost 2181

17. SolrCloud
  fab new_solrcloud:test1,zk=zk1,nodesPerHost=2
  • Upload a BASH script that starts/stops Solr
  • Set system props: jetty.port, host, zkHost, JVM opts
  • One or more Solr nodes per machine
  • JVM mem opts dependent on instance type and # of Solr nodes per instance
  • Optionally configure log4j.properties to append messages to RabbitMQ for Logstash4Solr integration

18.
solr-ctl.sh
  • BASH script that implements:
    • start/stop Solr nodes on each EC2 instance
    • set JVM memory options and system properties (jetty.port), enable remote JMX, etc.
    • back up log files before restarting nodes
    • ensure the JVM is killed correctly before restarting
  • Environment variables in: solr-ctl-env.sh

19. Miscellaneous Utility Tasks
  • Deploy a configuration directory to ZooKeeper
  • Create a new collection
  • Attach a local JConsole/VisualVM to a remote JVM
  • Rolling restart (with Overseer awareness)
  • Build Solr locally and patch remote
    • Use a relay server to scp the JARs to the Amazon network once, and then scp them to other nodes from within the network
  • Put/get files
  • Grep over all log files (across the cluster)

20. Other useful stuff ...
  • fab mine: see clusters I'm running (or for other users too)
  • fab kill_mine: terminate all instances I'm running
    • Use termination protection in production
  • fab ssh_to: quick way to SSH to one of the nodes in a cluster
  • fab stop/recover/kill: basic commands for controlling specific Solr nodes in the cluster
  • fab jmeter: execute a JMeter test plan against your cluster
    • Example test plan and Java sampler are included with the source

21. SolrCloud Tools (SolrJ client app)
  ./tools.sh tool healthcheck
  • Java-based command-line application that uses SolrJ's CloudSolrServer to perform advanced cluster management operations:
    • healthcheck: collect metadata and health information from all replicas for a collection from ZooKeeper
    • backup: create a snapshot of each shard in a collection for backing up to remote storage (S3)
  • Framework for building complex tools that benefit from having access to cluster state information in ZooKeeper

22.
SiLK Integration
  • SiLK: Solr integrated with Logstash and Kibana
    • Index time-series data, such as log data (collectd, Solr logs, ...)
    • Build cool dashboards with Banana (fork of Kibana)
  • Easily aggregate all WARN and more severe log messages from all Solr servers into logstash4solr
  • Send collectd metrics to logstash4solr

23. SiLK Integration

24. What's Next?
  • Migrate to using Apache libcloud instead of using boto directly
  • Benchmark mixed workloads (queries and indexing)
  • SiLK is improving rapidly!
  • Chaos monkey tests; integrate jepsen?
  • Open source, so please kick the tires!

25. Wrap-up
  • Solr Scale Toolkit: https://github.com/LucidWorks/solr-scale-tk
  • LucidWorks: http://www.lucidworks.com
  • SiLK: http://www.lucidworks.com/lucidworks-silk/
  • Solr In Action: http://www.manning.com/grainger/
  • Connect: @thelabdude / tim.potter@lucidworks.com
  • Questions?
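
A worked check on the Indexing Results table (slide 6): the slides do not state the total document count, but multiplying time by docs/sec for each run suggests roughly 130M docs per test. The Python sketch below recomputes that implied total and compares 1-replica against 2-replica throughput for each cluster shape. The function names are hypothetical, and the reducer counts differ between the paired runs, so the ratio is only a rough indication of what the second replica costs rather than a clean measurement.

```python
# Rows transcribed from slide 6:
# (cluster size, shards, replicas, reducers, time in secs, docs/sec)
RESULTS = [
    (10, 10, 1, 48, 1762, 73_780),
    (10, 10, 2, 34, 3727, 34_881),
    (10, 20, 1, 48, 1282, 101_404),
    (10, 20, 2, 34, 3207, 40_536),
    (10, 30, 1, 72, 1070, 121_495),
    (10, 30, 2, 60, 3159, 41_152),
    (15, 15, 1, 60, 1106, 117_541),
    (15, 15, 2, 42, 2465, 52_738),
    (15, 30, 1, 60, 827, 157_195),
    (15, 30, 2, 42, 2129, 61_062),
]


def implied_docs(time_secs, docs_per_sec):
    """Total docs implied by a run (an inference, not a figure from the slides)."""
    return time_secs * docs_per_sec


def replica_throughput_ratio(results):
    """Map (cluster size, shards) to 1-replica docs/sec divided by 2-replica docs/sec."""
    by_shape = {}
    for nodes, shards, replicas, _reducers, _secs, rate in results:
        by_shape.setdefault((nodes, shards), {})[replicas] = rate
    return {
        shape: rates[1] / rates[2]
        for shape, rates in by_shape.items()
        if 1 in rates and 2 in rates
    }


if __name__ == "__main__":
    for nodes, shards, replicas, _reducers, secs, rate in RESULTS:
        total = implied_docs(secs, rate)
        print(f"{nodes} nodes, {shards} shards, {replicas} replica(s): ~{total:,} docs")
    for (nodes, shards), ratio in sorted(replica_throughput_ratio(RESULTS).items()):
        print(f"{nodes} nodes, {shards} shards: 1-replica rate is {ratio:.2f}x the 2-replica rate")
```

Running it shows every row landing within a few thousand docs of 130M, which is consistent with all ten runs indexing the same data set.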