SNYPR 6.2 CU4
Architecture Guide
Date Published: 11/13/2019
Securonix Proprietary Statement
This material constitutes proprietary and trade secret information of Securonix, and shall not be disclosed to any
third party, nor used by the recipient except under the terms and conditions prescribed by Securonix.
The trademarks, service marks, and logos of Securonix and others used herein are the property of Securonix or
their respective owners.
Securonix Copyright Statement
This material is also protected by Federal Copyright Law and is not to be copied or reproduced in any form, using
any medium, without the prior written authorization of Securonix.
However, Securonix allows the printing of the Adobe Acrobat PDF files for the purposes of client training and
reference.
Information in this document is subject to change without notice. The software described in this document is
furnished under a license agreement or nondisclosure agreement. The software may be used or copied only in
accordance with the terms of those agreements. Nothing herein should be construed as constituting an additional
warranty. Securonix shall not be liable for technical or editorial errors or omissions contained herein. No part of this
publication may be reproduced, stored in a retrieval system, or transmitted in any form or any means electronic or
mechanical, including photocopying and recording for any purpose other than the purchaser's internal use without
the written permission of Securonix.
Copyright © 2019 Securonix. All rights reserved.
Contact Information
Securonix
14665 Midway Rd. Ste. 100
Addison, TX 75001
(855) 732-6649
SNYPR Architecture Guide 2
Table of Contents
Introduction 4
Nodes that Integrate with Hadoop 6
Deployment Alternatives 9
    Dedicated SNYPR Deployment 9
    SNYPR Deployment with Existing Hadoop Infrastructure 11
Hadoop Components 13
Search Deployment Options 15
    Search Embedded 15
    Search Dedicated 16
    Search Index Storage Estimates 17
Data Ingestion 20
    Phase 1: Collect and Publish 20
    Phase 2: Enrichment 21
    Phase 3: Processing 22
    Indexing Incoming Events 23
    Deployment Assumptions 25
    High Availability 31
    Reference Server Specifications 37
SNYPR Cloud Deployment 54
    Considerations 55
        Amazon EC2 55
        Microsoft Azure 56
        Network 57
        Virtual Infrastructure 60
    Recommendations 62
        Hadoop Cluster Tuning Recommendations 62
        Network Tuning Recommendations 74
    Google Cloud 83
        Deployment Architecture 84
    Spark Jobs Configuration for Kerberized Kafka 85
    Disaster Recovery Alternatives 86
Introduction
SNYPR is a big data security analytics platform built on Hadoop that utilizes Securonix machine learning-based anomaly detection techniques and threat models to detect sophisticated cyber and insider attacks. SNYPR uses Hadoop both as its distributed security analytics engine and as its long-term data retention engine. Hadoop nodes can be added as needed, allowing the solution to scale horizontally to support hundreds of thousands of events per second (EPS).
Features

- Supports a rich variety of security data, including security event logs, user identity data, access privileges, threat intelligence, asset metadata, and netflow data.
- Normalizes, indexes, and correlates security event logs, network flows, and application transactions.
- Utilizes machine learning-based anomaly detection techniques, including behavior profiling, peer group analytics, pattern analysis, and event rarity, to detect advanced threats.
- Provides out-of-the-box threat and risk models for detection and prioritization of insider threat, cyber threat, and fraud.
- Risk-ranks entities involved in threats to enable an entity-centric (user or device) approach to mitigating threats.
- Provides Spotter, a search feature with normalized search syntax that enables investigators to investigate today's threats and track advanced persistent threats over long periods of time, with all data available at all times.
- Provides the Investigation Workbench to detect links across disparate data sets, enabling quick investigations and hunting for cyber threats.
Audience

This guide is intended for system administrators, system integrators, and deployment teams who need to determine the deployment options in a Hadoop cluster.
Additional Resources

If you require additional information, the following SNYPR documents are available:
- Installation Guide: For system administrators, system integrators, and deployment teams who need to install the application.
- Administration Guide: For system administrators who are responsible for ongoing operations and management, business managers, and other users in a supervisory role who need information about how to use SNYPR to grant employees and partners access to applications, check for policy violations, and manage cases.
- Data Integration Guide: For data integrators who need to import activity and enrichment datasources to support existing and custom use cases.
- Content Development Guide: For content developers who need to use existing content and develop custom use cases to detect the threats to your organization.
- User Guide: For information security professionals, security analysts who need to detect and manage threats, risk and compliance officers, and IT specialists who need to use the reporting capabilities of SNYPR to monitor and remediate compliance.
Nodes that Integrate with Hadoop
The SNYPR architecture includes the following nodes that integrate with the Hadoop services:
SNYPR Application Server (Console User Interface, configuration DB, Redis)

These are edge nodes in a Hadoop cluster that are used for the SNYPR user interface and the configuration repository for all components used by the solution. Each Console Node performs the following tasks:

- Provides visualizations for monitoring events, threat management dashboards, investigations, and incident response
- Builds custom dashboards with visualizations for viewing violation and event data
- Configures all ingestion jobs: user identities, access privileges, threat intelligence, security events, and others
- Provides the administration interface for application support personnel and administrators
- Configures all policies and analytics, including behavior-based anomaly detection, peer-based analytics, threat modeling, and risk analytics
SNYPR-EYE Server (SNYPR-EYE Interface, configuration DB)

The SNYPR-EYE Server is a SNYPR monitoring and alerting server that is used for the configuration and operational health monitoring of all SNYPR services, including all the servers in the Hadoop cluster, the processes on the SNYPR Console, the SNYPR Spark Streaming applications running in the YARN cluster (including the performance of data ingestion for all resources), and the performance and health of the SNYPR Search processes. The SNYPR-EYE solution installs and manages SNYPR-EYE agents on the servers in the environment for local monitoring.
SNYPR Remote Ingestion Node

These nodes are edge nodes in a Hadoop cluster that are used to ingest security event log data into the environment with the Securonix connectors. Each SNYPR Ingestion Node performs the following tasks:

- Imports events from log sources
- Publishes events to the Kafka message bus with batching, compression, and encryption
- Accepts incoming log files over syslog
- Caches in-transit messages
Hadoop Master (Hadoop cluster management services)

These are the master servers in the Hadoop cluster.
Hadoop Compute/Storage Node

These are the main nodes in a Hadoop cluster that are used to store compressed data and process all the jobs associated with SNYPR. Each SNYPR Compute/Storage Node performs the following tasks:

- Fetches data from the ingestion nodes
- Performs all the jobs associated with SNYPR based on the configuration stored in the Master node, including parsing, indexing, analytics, and storage
- Stores data with 90% compression in structured JSON format
- Passes processed data to the SNYPR Search indexes that are used by the SNYPR console for review by the end user
Hadoop Kafka Broker (Kafka Broker, dedicated ZooKeeper)

Kafka broker servers handle in-transit messages, with ZooKeeper servers dedicated to Kafka. These servers use local storage for in-transit messages.
Deployment Alternatives
The SNYPR solution utilizes services in a Hadoop cluster. SNYPR provides the following deployment options:
- SNYPR UEBA: The SNYPR User and Entity Behavior Analytics (UEBA) solution provides security analytics on security events. Events are stored only during the processing of the analytics.
- SNYPR Security Analytics Data Lake: This solution provides security analytics on security events. Events are stored for historical purposes, and a high-performance threat hunting solution is provided for searching and visualizing events.
Dedicated SNYPR Deployment

The diagram above illustrates the services that are used within SNYPR. In this deployment, SNYPR is deployed with a dedicated Security Analytics Data Lake. In this configuration, the Master nodes include the SNYPR Console and the Cloudera Manager service, as well as other services used by the Hadoop cluster, such as the HDFS NameNode, the YARN ResourceManager, and ZooKeeper.
Based on the size of the deployment (events per second (EPS), analytics processed, retention period) and the features being supported (UEBA, Security Analytics Platform, Data Lake), the SNYPR architecture will scale to meet the deployment requirements. For a small UEBA deployment, a limited number of servers are deployed and a dedicated SNYPR Search server is used for index storage. The deployment includes between 3 and 6 Hadoop servers along with a dedicated SNYPR Search server. The SNYPR application and the Redis service are collocated with the Hadoop master services.
For a medium UEBA deployment, full high availability of all services is configured for the servers that are deployed, and two dedicated SNYPR Search servers are used for index storage. The deployment includes between 6 and 10 Hadoop servers along with two dedicated SNYPR Search servers and two dedicated SNYPR Application servers.
SNYPR Deployment with Dedicated Security Analytics Data Lake – Medium – UEBA
For a large Security Analytics Data Lake deployment, full high availability of all services is configured for all servers that are deployed, and at least two dedicated SNYPR Search servers are used for index storage. The deployment includes between 6 and 10 Hadoop servers along with three dedicated Kafka brokers and two dedicated SNYPR Search servers.
SNYPR Deployment with Dedicated Security Analytics Data Lake – Large – Security Analytics Data Lake
SNYPR Deployment with Existing Hadoop Infrastructure

The SNYPR solution shown in the following diagram (Figure 5) illustrates the services for SNYPR that are added to an existing Hadoop cluster. The SNYPR Application, SNYPR Search, and SNYPR-EYE nodes are shown on the top, and the existing Hadoop cluster is shown in the box on the bottom. For the supported Hadoop distributions, see the SNYPR Installation Guide.
Logical SNYPR Architecture – Existing Hadoop Cluster
Hadoop Components
SNYPR uses a Hadoop cluster for processing all data. The core Hadoop components include the following services:
- HDFS (Hadoop Distributed File System): Used to store security events and violations. Data is stored in compressed parquet format.
- YARN (Yet Another Resource Negotiator): Provides resource management capabilities for jobs.
- Spark Streaming: Processing framework for live streaming data.
- HBase: Distributed NoSQL data store on HDFS used to store the results of the analytics.
- Kafka: Horizontally scalable message bus used to manage the delivery of incoming security events.
- Impala (CDH) or Hive (HDP): Provides a SQL interface to the data stored in HDFS.
- ZooKeeper: Cluster management software that maintains configurations and synchronization services across nodes within a cluster.
Logical SNYPR Architecture – Dedicated Security Analytics Data Lake
Search Deployment Options
SNYPR Search is a high-performance indexing and search solution that stores all activity events in the environment that are accessed by the user interface.
SNYPR Search is deployed on an edge node in the Hadoop cluster. It requires access to the SNYPR Console on the application server and the Kafka brokers. These servers perform event indexing as well as storage of all violation data and related information used by the SNYPR user interface.
Embedded
- Description: Limited search server for small UEBA deployments. Limited to one search cell.
- Indexing rate per search cell (multiple cells are configured for increased performance): 3k average EPS / 5k peak EPS
- Retention: 7 days

Dedicated
- Description: Dedicated search server for small UEBA or Security Analytics Data Lake deployments.
- Indexing rate per search cell: Multiple search cells are supported; each cell supports 10k average EPS / 15k peak EPS. Redundancy of search indexes with replication can be configured for high availability and faster search performance.
- Retention: 30 days or more
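The limits above can be turned into a simple capacity check. The sketch below is illustrative (the function and its structure are not SNYPR tooling); it only encodes the documented embedded limits:

```python
# Documented limits for an embedded SNYPR Search deployment
EMBEDDED_LIMITS = {"avg_eps": 3_000, "peak_eps": 5_000, "retention_days": 7}

def embedded_search_fits(avg_eps, peak_eps, retention_days):
    """Return True when the workload is within the embedded limits;
    otherwise a dedicated SNYPR Search deployment is required."""
    return (avg_eps <= EMBEDDED_LIMITS["avg_eps"]
            and peak_eps <= EMBEDDED_LIMITS["peak_eps"]
            and retention_days <= EMBEDDED_LIMITS["retention_days"])

print(embedded_search_fits(2_000, 4_000, 7))    # True: embedded is enough
print(embedded_search_fits(2_000, 4_000, 30))   # False: 30-day retention needs dedicated
```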
Search Embedded

An embedded deployment of SNYPR Search is collocated with the SNYPR Application and shares the resources on that server. The resources required for an embedded deployment of SNYPR Search are:
- 10 CPU
- 16 GB RAM
- 1 TB usable storage
An embedded SNYPR Search server is for small UEBA deployments and is limited to 3,000 average EPS, 5,000 peak EPS, and 7 days of retention. For deployment scenarios with greater requirements, SNYPR Search Dedicated servers should be used.
SNYPR Search Embedded Mode
Search Dedicated

The SNYPR Search Dedicated deployment options are shown in the diagram below. A SNYPR Search Standard deployment uses a single dedicated server for indexing and searching. A SNYPR Search High Performance Cell includes separate servers for indexing and searching. In a high-performance cell, the indexes are replicated across servers for redundancy and for isolating the indexing workload from the search workload.
SNYPR Search Dedicated
Search Index Storage Estimates

The following tables estimate SNYPR Search index storage (in GB) for an average message size of 600 bytes.

EPS | Avg Msg Size (bytes) | Events/Day | GB/Day | Embedded: 7 Days (1 replica) | Premium: 30 Days (1 replica) | Premium: 30 Days with Replica (2 replicas)
1,000 | 600 | 86,400,000 | 48 | 169 | 724 | 1,448
2,500 | 600 | 216,000,000 | 121 | 422 | 1,810 | 3,621
5,000 | 600 | 432,000,000 | 241 | 845 | 3,621 | 7,242
7,500 | 600 | 648,000,000 | 362 | N/A | 5,431 | 10,863
10,000 | 600 | 864,000,000 | 483 | N/A | 7,242 | 14,484
15,000 | 600 | 1,296,000,000 | 724 | N/A | 10,863 | 21,726
20,000 | 600 | 1,728,000,000 | 966 | N/A | 14,484 | 28,968

EPS | Avg Msg Size (bytes) | Events/Day | GB/Day | Premium: 60 Days (1 replica) | Premium: 60 Days with Replica (2 replicas) | Premium: 90 Days (1 replica) | Premium: 90 Days with Replica (2 replicas)
1,000 | 600 | 86,400,000 | 48 | 1,448 | 2,897 | 2,173 | 4,345
2,500 | 600 | 216,000,000 | 121 | 3,621 | 7,242 | 5,431 | 10,863
5,000 | 600 | 432,000,000 | 241 | 7,242 | 14,484 | 10,863 | 21,726
7,500 | 600 | 648,000,000 | 362 | 10,863 | 21,726 | 16,294 | 32,589
10,000 | 600 | 864,000,000 | 483 | 14,484 | 28,968 | 21,726 | 43,452
15,000 | 600 | 1,296,000,000 | 724 | 21,726 | 43,452 | 32,589 | 65,178
20,000 | 600 | 1,728,000,000 | 966 | 28,968 | 57,936 | 43,452 | 86,904
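The table values can be approximated with a simple formula. The sketch below reproduces them under one inferred assumption: the index occupies roughly half the raw event volume (a ratio back-calculated from the tables, not a documented SNYPR constant), with raw volume converted to GB using 1024^3 bytes:

```python
def index_storage_gb(eps, avg_msg_bytes, retention_days, replicas):
    """Approximate SNYPR Search index storage in GB.

    The 0.5 index-to-raw ratio is inferred from the tables above;
    verify it against your own data before sizing hardware.
    """
    events_per_day = eps * 86_400                       # seconds per day
    raw_gb_per_day = events_per_day * avg_msg_bytes / 1024**3
    index_ratio = 0.5                                   # assumed, see above
    return raw_gb_per_day * index_ratio * retention_days * replicas

# 1,000 EPS, 600-byte events, 30-day retention, single copy
print(round(index_storage_gb(1_000, 600, 30, 1)))       # 724, matching the table
```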
Data Ingestion
SNYPR provides a data ingestion pipeline that performs normalization, context enrichment, and correlation.
All event data in SNYPR is stored in a super-enriched format. The Open Event Format (OEF) is a self-describing format capable of supporting information from heterogeneous data sources, while also adding enrichment data sets such as user identity data, threat intelligence feeds, asset information, and others. This format enables events to be contextually enriched at ingestion time, which ensures that historical changes to the enrichment data are captured with the event at the time it occurred. The original source event is always maintained in the OEF event. (See https://openeventformat.org for details.) The three phases of the SNYPR event ingestion pipeline are described below.
Phase 1: Collect and Publish

In this phase, events are collected and a SNYPR publisher on the Remote Ingestion Node (RIN) forwards the messages to the Kafka raw topic. There are multiple types of SNYPR publishers, including the ingestion node that uses the SNYPR Connector Library (Figure 3) and the syslog publisher that forwards messages directly to the Kafka raw topic (Figure 5). The SNYPR publishers forward all events to the raw topic in the SNYPR transport format. This transport format adds metadata to the source events to describe the event source and tag the events for processing in the enrichment job. The SNYPR publishers also support batching, compression, and encryption of the events that are published, which minimizes the bandwidth required for transmission to the centralized Kafka brokers.
Single Pipeline
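The publish step described above can be sketched as follows. This is a minimal illustration, not the actual SNYPR wire format: the envelope field names are made up, and encryption is omitted; only the batching, transport metadata, and compression ideas are shown:

```python
import gzip
import json
import time

def to_transport_batch(events, resource_id, job_id):
    """Wrap raw events in an illustrative transport envelope and
    gzip-compress the batch before publishing to the raw topic."""
    envelope = {
        "jobId": job_id,              # tags events for the enrichment job
        "resourceId": resource_id,    # identifies the event source
        "collectedAt": int(time.time()),
        "events": events,             # original messages, untouched
    }
    return gzip.compress(json.dumps(envelope).encode("utf-8"))

batch = to_transport_batch(["<134>Oct 10 10:10:10 fw01 denied tcp"], "fw01-syslog", 42)
# A Kafka producer would then send `batch` to the tenant's raw topic.
```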
Phase 2: Enrichment

The SNYPR Enrichment Spark Streaming job is responsible for event filtering, normalization, and context enrichment of the raw logs. During context enrichment, context is added to the incoming log data, including enrichment from user HR sources, geolocation information, threat intelligence data, and other lookup data such as internal network maps and asset data. Additionally, the raw event log message is stored in its original format as one of the columns in the normalized schema.
Multiple Enrichment Pipelines
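The enrichment step can be sketched as a lookup-and-merge over a normalized event. The lookup tables, field names, and function below are illustrative only; in SNYPR the context comes from HR feeds, geolocation databases, and threat intelligence sources:

```python
# Illustrative context tables; values and keys are made up.
USER_CONTEXT = {"jdoe": {"department": "Finance", "title": "Analyst"}}
GEO_CONTEXT = {"203.0.113.7": {"country": "US", "city": "Dallas"}}

def enrich(raw_line, parsed):
    """Attach context to a normalized event, preserving the raw log
    message as one of the columns, as described above."""
    event = dict(parsed)
    event["rawevent"] = raw_line    # original message kept alongside context
    event.update(USER_CONTEXT.get(parsed.get("accountname"), {}))
    event.update(GEO_CONTEXT.get(parsed.get("sourceaddress"), {}))
    return event

e = enrich("Oct 10 jdoe login from 203.0.113.7",
           {"accountname": "jdoe", "sourceaddress": "203.0.113.7"})
```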
Phase 3: Processing

The third phase of the event ingestion pipeline is a parallel phase in which multiple Spark Streaming jobs subscribe to the enriched topic and perform indexing, store enriched events in HDFS, and analyze the events for threats.
The ingested data is stored for long-term retention in HDFS as parquet files and made accessible as Hive database tables that are partitioned by resource, year, and day.
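The resource/year/day partitioning can be pictured as a directory layout. The base path and partition column names below are assumptions for illustration; the actual SNYPR layout may differ:

```python
from datetime import date

def hdfs_partition_path(base, resource, d):
    """Illustrative Hive-style HDFS layout for parquet data
    partitioned by resource, year, and day of year."""
    return f"{base}/resource={resource}/year={d.year}/day={d.timetuple().tm_yday}"

path = hdfs_partition_path("/securonix/events", "fw01-syslog", date(2019, 11, 13))
# -> /securonix/events/resource=fw01-syslog/year=2019/day=317
```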
The solution also indexes the data and stores it in SNYPR Search Solr collections. The solution creates additional index collections as the data size passes a configurable threshold, and maintains a control index for execution of parallel queries across the entire set of collections. The index files are maintained on local storage on the dedicated SNYPR Search servers. This configuration provides parallel query execution across all the collections for deterministic response time for interactive use by the SNYPR user interface.
The log compliance data is stored in a read-only format that cannot be modified. SNYPR supports strong authentication, authorization, and encryption of the Hadoop infrastructure. SNYPR also provides application-layer encryption and masking that can be enabled selectively.
SNYPR uses edge nodes for the user interface and for the SNYPR Search nodes. All processing and long-term storage of data is done within the Hadoop cluster. SNYPR provides a feature called Spotter as an integral part of the solution. This feature provides online searching and visualization of event data for the configured index retention period.
The SNYPR Remote Ingestion Node includes the connectors that are used to ingest the log data. The connectors leverage the specific log source APIs or files to access the log data. The incoming log messages are associated with a Job ID and a Resource ID before they are submitted to Kafka so that they can be processed by the Spark Streaming enrichment job. The connectors also perform offset management of the log data source to ensure that all log messages are obtained and, in some cases, pre-processing of the source data. An example of pre-processing is the Ironport syslog connector, which converts multi-line messages into a single line for publishing to Kafka.
Multiple Independent Pipelines
Indexing Incoming Events

SNYPR includes dedicated SNYPR Search servers. These servers are edge nodes in the Hadoop cluster that consume the enriched messages from the Kafka topic and perform local indexing on the search servers. The search indexes are designed to optimize search performance by parallelizing searches across multiple sub-indexes, or SNYPR Search collections. Each collection is further distributed across a configured number of shards to ensure distribution of the workload. Each Solr server in the cluster is allocated CPU and memory to allow the SNYPR Search server to perform optimally.
The indexed events are ingested in real time by the solution. SNYPR includes two alternatives for indexing events:
- The SNYPR Local Event Indexer (LEI) is an indexing process that reads enriched data from the Kafka topics and indexes events to the SNYPR Search servers.
- The SNYPR indexing job is a distributed Spark Streaming job that runs within the Hadoop cluster. The compute and memory resources used for indexing are reserved capacity to ensure that events are ingested at the rate that they arrive at the solution. This allows the indexing of ingested events to be parallelized across the cluster to meet the deployment requirements of the solution.
An index control core collection is used to track the number of collections that the solution is hosting. The solution maintains a maximum-documents-per-collection threshold and dynamically creates additional collections as more events are imported into the environment. The solution also provides the ability to de-duplicate redundant event data from the indexes during ingestion.
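The collection rollover behavior described above can be sketched as follows. This is not SNYPR code; the collection naming scheme and function are illustrative only:

```python
def target_collection(collections, doc_counts, max_docs_per_collection):
    """Pick the collection to index into, rolling over to a new one
    once the active collection passes the document threshold."""
    active = collections[-1]
    if doc_counts.get(active, 0) >= max_docs_per_collection:
        new_name = f"activity_{len(collections) + 1}"
        collections.append(new_name)   # the control index would record this
        return new_name
    return active

cols = ["activity_1"]
counts = {"activity_1": 1_000_000}
# Threshold reached, so a second collection is created.
print(target_collection(cols, counts, 1_000_000))   # activity_2
```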
Searching

The Spotter search interface allows users to search across all events. Interactive and deterministic response time for searches is obtained by executing parallel searches across the collections. This approach ensures that the size of each index is optimized and that the infrastructure can grow to support larger indexes without impacting the user experience. The search results are incrementally returned to the user interface and displayed as they arrive to ensure the responsiveness of the Spotter interface.
Deployment Assumptions

Deploying a SNYPR environment requires many considerations for each of the components of the solution.
For a standard deployment architecture, the following is recommended:
- Fast network access for the Hadoop cluster and edge nodes: 10 gigabit Ethernet with jumbo frames configured on all switches and network interfaces (MTU=9000)
- All services running in a single data center
- A balanced SNYPR cluster with similar nodes (CPU, memory, storage, network)
- Securonix SNYPR using standard Securonix connectors for data ingestion; the exact sources of event data are deployment specific
- Log event data available to the SNYPR environment (ingestion nodes), or direct connector access to log sources, depending on the connector used
- Recommended storage bandwidth of 1,000 IOPS per Hadoop and SNYPR Search server
- Purging online event data after the retention period to minimize required storage, unless there is a business need for long-term historical searching; violation and behavior data is not purged
- Java 8 used by the cluster
For Hadoop tuning, see the Hadoop Cluster Tuning Recommendations section in this guide.
SNYPR Kafka Topic Partitioning Reference

Kafka Topic (10,000 - 20,000 EPS) | Partitions | Replication
tenantid-Raw | 75 | 2
tenantid-Enriched | 75 | 2
tenantid-Ops | 1 | 2
tenantid-Tiertwo | 75 | 2
tenantid-Control | 1 | 2
tenantid-IndexerCount | 1 | 2
tenantid-Violations | 75 | 2
tenantid-User | 1 | 2
tenantid-Count | 1 | 2
tenantid-Preview | 1 | 2
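As a sketch of how this layout might be provisioned, the snippet below generates one `kafka-topics.sh --create` invocation per topic in the table. The tenant prefix and broker address are placeholders, and actual topic creation in a SNYPR deployment may be handled by the installer rather than by hand:

```python
# Topic layout from the table above; partition counts per topic.
TOPICS = {
    "Raw": 75, "Enriched": 75, "Ops": 1, "Tiertwo": 75, "Control": 1,
    "IndexerCount": 1, "Violations": 75, "User": 1, "Count": 1, "Preview": 1,
}

def create_commands(tenant, bootstrap="broker1:9092", replication=2):
    """Emit one kafka-topics.sh invocation per SNYPR topic."""
    return [
        f"kafka-topics.sh --create --bootstrap-server {bootstrap} "
        f"--topic {tenant}-{name} --partitions {parts} "
        f"--replication-factor {replication}"
        for name, parts in TOPICS.items()
    ]

for cmd in create_commands("tenantid"):
    print(cmd)
```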
SNYPR Search Shard Allocation Reference

Solr Collection (10,000 - 20,000 EPS, 9 servers) | Shards | Replication
tenantid-activity | 12 | 2
tenantid-violation | 12 | 2
tenantid-whitelist | 1 | 2
tenantid-entitymetadata | 1 | 2
tenantid-tpi | 1 | 2
tenantid-eeocontrolcore | 1 | 2
tenantid-lookup | 1 | 2
tenantid-ipmapping | 1 | 2
tenantid-watchlist | 1 | 2
tenantid-dailyviolationsummary | 1 | 2
tenantid-users | 1 | 2
tenantid-riskscorecard | 1 | 2
tenantid-entityrelation | 1 | 2
tenantid-access | 1 | 2
SNYPR YARN Resource Allocation Reference

The SNYPR Spark applications are configured based on the ingestion rate that must be supported. The table below is an example of the resource allocation for a deployment that supports 20,000 events per second with a typical workload. There are many variables affecting a deployment and the specific sizing recommended; contact Securonix for specific information.
Spark Streaming YARN Resources (10,000 - 20,000 EPS)

Job | Driver vCPU | Driver Memory (GB) | Executors | Executor vCPU | Executor Memory (GB)
Event Enrichment | 6 | 2 | 80 | 1 | 3
Event Ingestion | 6 | 2 | 20 | 1 | 2
Behavior Analytics | 1 | 2 | 10 | 1 | 4
Policy Engine IEE | 1 | 2 | 40 | 1 | 2
Policy Engine AEE | 1 | 2 | 10 | 1 | 3
Risk Generation | 1 | 2 | 10 | 2 | 2
Traffic Analyzer | 1 | 2 | 10 | 1 | 4
Behavior Profile | 1 | 2 | 6 | 1 | 2
Robotic Behavior | 1 | 2 | 10 | 1 | 3
Event Archiver | 1 | 2 | 10 | 1 | 1
Phishing | 1 | 2 | 1 | 1 | 4

YARN resource totals: drivers 21 vCPU / 22 GB; executors 217 vCPU / 546 GB. Total YARN resources: 238 vCPU / 568 GB.
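The totals in the table can be cross-checked directly from the per-job rows (executor totals are executors multiplied by per-executor vCPU and memory):

```python
# (job, driver_vcpu, driver_mem_gb, executors, exec_vcpu, exec_mem_gb)
jobs = [
    ("Event Enrichment",   6, 2, 80, 1, 3),
    ("Event Ingestion",    6, 2, 20, 1, 2),
    ("Behavior Analytics", 1, 2, 10, 1, 4),
    ("Policy Engine IEE",  1, 2, 40, 1, 2),
    ("Policy Engine AEE",  1, 2, 10, 1, 3),
    ("Risk Generation",    1, 2, 10, 2, 2),
    ("Traffic Analyzer",   1, 2, 10, 1, 4),
    ("Behavior Profile",   1, 2,  6, 1, 2),
    ("Robotic Behavior",   1, 2, 10, 1, 3),
    ("Event Archiver",     1, 2, 10, 1, 1),
    ("Phishing",           1, 2,  1, 1, 4),
]

driver_vcpu = sum(j[1] for j in jobs)            # 21
driver_mem  = sum(j[2] for j in jobs)            # 22
exec_vcpu   = sum(j[3] * j[4] for j in jobs)     # 217
exec_mem    = sum(j[3] * j[5] for j in jobs)     # 546
total_vcpu, total_mem = driver_vcpu + exec_vcpu, driver_mem + exec_mem
print(total_vcpu, total_mem)                     # 238 568
```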
SNYPR Extra Large Deployments

The sizing guidelines in this document are references for deployment of SNYPR. The solution will support much larger deployments based on the customer requirements.
For large deployments, the search servers are dedicated servers rather than being collocated on the Compute/Storage nodes. This allows the search indexers to scale as needed without impacting other services. This includes Solr and a dedicated ZooKeeper configuration to avoid contention.
See Figure 10 for an example of the deployment with dedicated search servers.
There is no upper limit to the deployment size. The deployment architecture for extra-large deployments will be determined based on the specific deployment requirements. Contact Securonix for details.
The major variables that dictate the deployment recommendations include:
- Ingestion rate (events per second) of security event data
- Number of users interacting with the application interactively
- The data retention requirements for online data
- The data retention requirements for log data
- The disaster recovery strategy
High Availability

The SNYPR solution includes high availability of all the components of the infrastructure. The Hadoop cluster is configured for high availability based on best-practice deployment of Hadoop. At a minimum, this includes high availability of the HDFS NameNodes and YARN ResourceManagers, at least three ZooKeeper servers, and at least three Kafka brokers. High availability for the SNYPR servers that leverage the Hadoop cluster is described below.
SNYPR Application Server

High availability of the SNYPR Console is provided with an HA configuration of two nodes, with the user interface active on one of the two nodes during normal operation. MySQL replication and a Redis cluster are configured, as well as backup of the file system where the configuration data is stored (referred to as SECURONIX_HOME). A load balancer is configured for access to the user interface.
SNYPR-EYE Server

High availability of the SNYPR-EYE Server is provided with an HA configuration of two nodes, with the user interface active on one of the two nodes during normal operation. MySQL replication, as well as backup of the file system where the configuration data is stored (referred to as SNYPR-EYE_HOME), is configured on these servers for high availability, and a load balancer is configured for access to the user interface.
SNYPR Search Server

High availability of the SNYPR Search servers is configured for each SNYPR Search cell in the deployment. A SNYPR Search cell includes a Local Event Indexer (LEI) as well as multiple search instances. A search cell with high availability includes at least two SNYPR Search servers. The LEI process runs on the primary server, indexing the incoming event data from the Enriched topic on Kafka. A search server provides a replica of all indexed data on another server. During a failover, the LEI is started on the second search server to enable active indexing on that server.
SNYPR Remote Ingestion Nodes

At least two SNYPR Remote Ingestion Nodes (RINs) are recommended for high availability in each location where they are deployed. RINs are typically installed in each major data center in close proximity to the logs that are being collected. The data collected by the RINs and forwarded to the Kafka brokers is sent in compressed batches that reduce the network transfer by roughly 90%. The RINs also encrypt the payload and support SSL and mutual authentication as well as Kerberos authentication.
The RINs collect data through two different methods: the push method and the pull method. The push method uses the embedded syslog server to collect and forward data to the Kafka topics. The pull method uses the Securonix connectors installed on the RIN to connect to the APIs, gather the logs, and forward them to the Kafka topic. High availability is provided on the Kafka brokers by having three separate Kafka brokers and replication of the topics for availability.
A sticky load balancer is recommended for incoming syslog traffic to the Remote Ingestion Nodes.
SNYPR Remote Ingestion Nodes
Hadoop Cluster Guidance for High Availability

The Hadoop infrastructure services are configured for high availability. The recommended settings are as follows:
- At least three Kafka brokers with in-sync replicas (ISR=3)
- HDFS replication factor of 3
- Kafka message retention of 2 days
- HA NameNode
- HA ResourceManager
- At least three ZooKeeper servers
- If security is required:
  - Kerberos authentication of all services in the Hadoop cluster
  - Encryption of HDFS folders with HDFS encryption is also available for sensitive resource data
  - Authorization for protection of access to data in the Hadoop cluster is recommended with the native tools (Ranger for Hortonworks, Sentry for Cloudera)
  - The SNYPR edge nodes for ingestion and the Console user interface interact with the Hadoop services and support Kerberos
This is not a complete list; it is recommended that you follow Hadoop best practices for deployment.
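As an illustrative sketch, the Kafka side of this guidance maps to broker settings like the following. These are standard Kafka `server.properties` keys, but the mapping to SNYPR's installer defaults is our assumption; verify names and values against your Kafka version and the SNYPR Installation Guide:

```properties
# Kafka broker settings matching the HA guidance above (illustrative).
default.replication.factor=3
min.insync.replicas=3          # Kafka in-sync replicas (ISR=3)
log.retention.hours=48         # 2-day message retention
```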
In addition to the storage required for the data, the compute and memory required for running the SNYPR jobs must be available in the Hadoop cluster. The SNYPR solution includes several jobs that run in the cluster, and YARN is used to schedule the resources. The primary jobs that are part of SNYPR and their resource allocations are listed in the SNYPR YARN Resource Allocation Reference section.
The specific infrastructure required is based on the required peak ingestion rate.
Request specific deployment guidance from Securonix.
Reference Server Specifications

This section contains recommendations for the following topics:
- Hardware Specifications
- Server Mount Point
Hardware Specifications

The hardware specifications for the infrastructure are listed in the following tables:
| Configuration | SNYPR-M1: Hadoop Master | SNYPR-M2: Hadoop Master with SNYPR | SNYPR-M3: Hadoop Master with SNYPR and Kafka |
|---|---|---|---|
| Server Model | Dell R640 | Dell R640 | Dell R640 |
| CPU | 2 x Intel Xeon Gold 5120 2.2G, 14C/28T | 2 x Intel Xeon Gold 5120 2.2G, 14C/28T | 2 x Intel Xeon Gold 5120 2.2G, 14C/28T |
| Memory | 256GB RDIMM, 2666MT/s | 256GB RDIMM, 2666MT/s | 256GB RDIMM, 2666MT/s |
| Boot Storage | 2 x 1.6TB SSD SATA Mix Use 12Gbps 512e | 2 x 1.6TB SSD SATA Mix Use 12Gbps 512e | 2 x 1.6TB SSD SATA Mix Use 12Gbps 512e |
| Additional Storage | 4 x 2.4TB 10K RPM SAS 12Gbps 4Kn | 6 x 2.4TB 10K RPM SAS 12Gbps 4Kn | 8 x 2.4TB 10K RPM SAS 12Gbps 4Kn |
| Network | 10GE | 10GE | 10GE |
| Power | 2 x 1100W | 2 x 1100W | 2 x 1100W |
| Rack Units | 1RU | 1RU | 1RU |
| Configuration | SNYPR-C1: Standard Density Compute/Storage | SNYPR-C2: High Density Compute/Storage | SNYPR-C3: Maximum Density Compute/Storage |
|---|---|---|---|
| Server Model | Dell R640 | Dell R740xd | Dell R740xd |
| CPU | 2 x Intel Xeon Gold 5120 2.2G, 14C/28T | 2 x Intel Xeon Gold 5120 2.2G, 14C/28T | 2 x Intel Xeon Gold 5120 2.2G, 14C/28T |
| Memory | 256GB RDIMM, 2666MT/s | 256GB RDIMM, 2666MT/s | 256GB RDIMM, 2666MT/s |
| Boot Storage | 2 x 1.6TB SSD SATA Mix Use 12Gbps 512e | 2 x 1.6TB SSD SATA Mix Use 12Gbps 512e | 2 x 1.6TB SSD SATA Mix Use 12Gbps 512e |
| Additional Storage | 10 x 2.4TB 10K RPM SAS 12Gbps 4Kn | 24 x 2.4TB 10K RPM SAS 12Gbps 4Kn | 30 x 2.4TB 10K RPM SAS 12Gbps 4Kn |
| Network | 10GE | 10GE | 10GE |
| Power | 2 x 1100W | 2 x 1100W | 2 x 1100W |
| Rack Units | 1RU | 2RU | 2RU |
| Configuration | SNYPR-SEARCH1: Standard Density Compute/Storage | SNYPR-SEARCH3: Maximum Density Compute/Storage |
|---|---|---|
| Server Model | Dell R640 | Dell R740xd |
| CPU | 2 x Intel Xeon Gold 5120 2.2G, 14C/28T | 2 x Intel Xeon Gold 5120 2.2G, 14C/28T |
| Memory | 256GB RDIMM, 2666MT/s | 256GB RDIMM, 2666MT/s |
| Boot Storage | 2 x 1.6TB SSD SATA Mix Use 12Gbps 512e | 2 x 1.6TB SSD SATA Mix Use 12Gbps 512e |
| Additional Storage | 10 x 2.4TB 10K RPM SAS 12Gbps 4Kn | 30 x 2.4TB 10K RPM SAS 12Gbps 4Kn |
| Network | 10GE | 10GE |
| Power | 2 x 1100W | 2 x 1100W |
| Rack Units | 1RU | 2RU |
| Configuration | SNYPR-K3: Kafka Brokers | SNYPR-R1: Remote Ingestion Node | SNYPR-S3: SNYPR Console |
|---|---|---|---|
| Server Model | Dell R640 | Dell R640 | Dell R640 |
| CPU | 2 x Intel Xeon Gold 5120 2.2G, 14C/28T | 2 x Intel Xeon Gold 5120 2.2G, 14C/28T | 2 x Intel Xeon Gold 5120 2.2G, 14C/28T |
| Memory | 128GB RDIMM, 2666MT/s | 64GB RDIMM, 2666MT/s | 128GB RDIMM, 2666MT/s |
| Boot Storage | 2 x 1.6TB SSD SATA Mix Use 12Gbps 512e | 2 x 1.6TB SSD SATA Mix Use 12Gbps 512e | 2 x 1.6TB SSD SATA Mix Use 12Gbps 512e |
| Additional Storage | 10 x 2.4TB 10K RPM SAS 12Gbps 4Kn | 4 x 2.4TB 10K RPM SAS 12Gbps 4Kn | 4 x 2.4TB 10K RPM SAS 12Gbps 4Kn |
| Network | 10GE | 10GE | 10GE |
| Power | 2 x 1100W | 2 x 1100W | 2 x 1100W |
| Rack Units | 1RU | 1RU | 1RU |
Alternate hardware configurations can be used, but equivalent specifications are required for CPU, memory, network bandwidth, and storage capacity and bandwidth.
Server Mount Point

The storage mount point configuration for each of the servers is listed in the tables below:
| Mount Point | SNYPR-M1: Hadoop Master | SNYPR-M2: Hadoop Master with SNYPR | SNYPR-M3: Hadoop Master with SNYPR and Kafka | Comments |
|---|---|---|---|---|
| / | 100 GB | 100 GB | 100 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /boot | 2 GB | 2 GB | 2 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| swap | 10 GB | 10 GB | 10 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /zookeeper | 200 GB | 200 GB | 200 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /var | 800 GB | 800 GB | 800 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /dfs | 200 GB | 200 GB | 200 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /securonix | 4.2 TB | 6.3 TB | 8.4 TB | RAID 10, xfs; if syslog is used locally, use a higher storage amount |
| /snyprsearch | - | - | - | RAID 6 |
| /data1 | - | - | 2.1 TB | JBOD, xfs, noatime |
| /data2 | - | - | 2.1 TB | JBOD, xfs, noatime |
| /data3 | - | - | 2.1 TB | JBOD, xfs, noatime |
| /data4 | - | - | 2.1 TB | JBOD, xfs, noatime |
| Mount Point | SNYPR-C1: Standard Density Compute/Storage | SNYPR-C2: High Density Compute/Storage | SNYPR-C3: Maximum Density Compute/Storage | Comments |
|---|---|---|---|---|
| / | 100 GB | 100 GB | 100 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /boot | 2 GB | 2 GB | 2 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| swap | 10 GB | 10 GB | 10 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /zookeeper | 200 GB | 200 GB | 200 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /var | 800 GB | 800 GB | 800 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /dfs | 200 GB | 200 GB | 200 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /securonix | - | - | - | RAID 10, xfs; if syslog is used locally, use a higher storage amount |
| /snyprsearch | - | - | - | RAID 6 |
| /data1 | 2.1 TB | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data2 | 2.1 TB | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data3 | 2.1 TB | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data4 | 2.1 TB | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data5 | 2.1 TB | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data6 | 2.1 TB | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data7 | 2.1 TB | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data8 | 2.1 TB | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data9 | 2.1 TB | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data10 | 2.1 TB | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data11 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data12 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data13 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data14 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data15 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data16 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data17 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data18 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data19 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data20 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data21 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data22 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data23 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data24 | - | 2.1 TB | 2.1 TB | JBOD, xfs, noatime |
| /data25 | - | - | 2.1 TB | JBOD, xfs, noatime |
| /data26 | - | - | 2.1 TB | JBOD, xfs, noatime |
| /data27 | - | - | 2.1 TB | JBOD, xfs, noatime |
| /data28 | - | - | 2.1 TB | JBOD, xfs, noatime |
| /data29 | - | - | 2.1 TB | JBOD, xfs, noatime |
| /data30 | - | - | 2.1 TB | JBOD, xfs, noatime |
| Mount Point | SNYPR-SEARCH1: Standard Density Compute/Storage | SNYPR-SEARCH3: Maximum Density Compute/Storage | Comments |
|---|---|---|---|
| / | 100 GB | 100 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /boot | 2 GB | 2 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| swap | 10 GB | 10 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /zookeeper | 200 GB | 200 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /var | 800 GB | 800 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /dfs | 200 GB | 200 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /securonix | - | - | RAID 10, xfs; if syslog is used locally, use a higher storage amount |
| /snyprsearch | 17 TB | 60 TB | RAID 6 |
| Mount Point | SNYPR-K3: Kafka Brokers | SNYPR-R1: Remote Ingestion Node | SNYPR-S3: SNYPR Console | Comments |
|---|---|---|---|---|
| / | 100 GB | 100 GB | 100 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /boot | 2 GB | 2 GB | 2 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| swap | 10 GB | 10 GB | 10 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /zookeeper | 200 GB | - | - | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /var | 800 GB | 1000 GB | 1000 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /dfs | 200 GB | 200 GB | 200 GB | RAID 1 (1.6 TB mixed use SSD drives), xfs |
| /securonix | - | 4.2 TB | 4.2 TB | RAID 10, xfs; if syslog is used locally, use a higher storage amount |
| /snyprsearch | - | - | - | RAID 6 |
| /data1 | 2.1 TB | - | - | JBOD, xfs, noatime |
| /data2 | 2.1 TB | - | - | JBOD, xfs, noatime |
| /data3 | 2.1 TB | - | - | JBOD, xfs, noatime |
| /data4 | 2.1 TB | - | - | JBOD, xfs, noatime |
| /data5 | 2.1 TB | - | - | JBOD, xfs, noatime |
| /data6 | 2.1 TB | - | - | JBOD, xfs, noatime |
| /data7 | 2.1 TB | - | - | JBOD, xfs, noatime |
| /data8 | 2.1 TB | - | - | JBOD, xfs, noatime |
| /data9 | 2.1 TB | - | - | JBOD, xfs, noatime |
| /data10 | 2.1 TB | - | - | JBOD, xfs, noatime |
Alternatives for Limiting the Size of the Infrastructure

The recommended architecture assumes full functionality and full access to indexed data and source data for the duration of the retention period.

Other factors may reduce the size of the recommended infrastructure, such as a reduction in the volume of log data or filtering of some log data to avoid storing unneeded events.

You can configure the Hadoop compute and storage nodes to use very dense storage per node. The following table shows an example configuration that includes dense storage.
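When comparing denser node configurations, it helps to see how replication and operating headroom shrink raw disk into usable HDFS capacity. The sketch below uses the replication factor of 3 recommended earlier in this guide; the 25% reserve for temporary and job data, and the ten-node example, are illustrative assumptions rather than Securonix guidance:

```python
# Sketch: usable HDFS capacity from raw disk. Assumes the HDFS
# replication factor of 3 recommended earlier; the 25% headroom
# reserve and the node counts below are illustrative assumptions.
def usable_hdfs_tb(nodes, disks_per_node, disk_tb, replication=3, reserve=0.25):
    raw = nodes * disks_per_node * disk_tb
    return raw * (1 - reserve) / replication

# Example: ten SNYPR-C2 nodes (24 x 2.4 TB drives each)
print(f"{usable_hdfs_tb(10, 24, 2.4):.0f} TB usable")  # 144 TB usable
```

The same formula explains why dense-storage nodes reduce rack count but not the replication overhead: tripling disk per node still yields only one third of raw capacity as usable data.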
SNYPR Cloud Deployment

The SNYPR solution can be deployed in a cloud environment. Several considerations must be addressed when deploying SNYPR in a cloud, including the following:

- Infrastructure selection: The infrastructure used should provide equivalent resources (CPU, memory, and storage capacity and bandwidth) to the physical server recommendations listed in this document.
- Deployment architecture: SNYPR can be deployed exclusively in the cloud or as a hybrid cloud / on-site topology.
- Network access: The infrastructure must have access to the data (user, access, event log, TPI, etc.) that will be used. A Virtual Private Cloud may be required for transmission of sensitive data.

Infrastructure Selection

SNYPR can be deployed in public or private cloud environments. Based on the deployment requirements of the solution, the specific infrastructure used for each cloud environment should be selected to ensure that the appropriate resources are available. This includes selecting the appropriate virtual instance types to support the CPU, memory, storage, and network bandwidth requirements of the solution.
Considerations

This section contains considerations for the following topics:

- Amazon EC2
- Microsoft Azure
- Network
- Virtual Infrastructure
Amazon EC2

There are several Amazon EC2 instance types that are a good fit for deploying Securonix. The M4 general purpose instances are recommended. These are defined by Amazon as:

"M4 instances are the latest generation of General-Purpose Instances. This family provides a balance of compute, memory, and network resources, and it is a good choice for many applications."

Features

- 2.4 GHz Intel Xeon® E5-2676 v3 (Haswell) processors
- EBS-optimized by default at no additional cost
- Support for Enhanced Networking
- Balance of compute, memory, and network resources
| | Hadoop Master | Compute / Storage | Kafka | SNYPR Search | SNYPR Console |
|---|---|---|---|---|---|
| Amazon EC2 Instance Type | R5.4xlarge | m4.16xlarge | M5.2xlarge | M5.4xlarge | m4.16xlarge |
| RAM (GB) | 128 | 256 | 32 | 64 | 256 |
| vCPU | 16 | 64 | 8 | 16 | 64 |
| Storage (GB, split into multiple EBS volumes) | 10,000 | 10,000 | 3,000 | 3,000 | 10,000 |
Amazon provides several alternatives for the instance types used, such as the R3.8XL and the D2.8XL, which are also good options. The storage chosen should provide adequate bandwidth to the volumes used; this is the equivalent of 1000 IOPS per instance to the selected storage type.

In addition to standard Amazon AWS EC2 instances, the guidance for deploying Cloudera in Amazon Web Services is recommended. See the following link: https://www.cloudera.com/partners/solutions/amazon-web-services.html.
Microsoft Azure

Several Azure Virtual Machine instance types (https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/) are a good fit for deploying Securonix. A G4 (East US2), D15 v2 (East US2), or H16m (South Central US) instance type is recommended.

The Dsv2 instances are recommended. These are defined by Microsoft as:

"D11-15 v2 instances are based on the 2.4 GHz Intel Xeon® E5-2673 v3 (Haswell) processor, and can achieve 3.1 GHz with Intel Turbo Boost Technology 2.0. D11-15 v2 are ideal for memory-intensive enterprise applications. D15 v2 instance is isolated to hardware dedicated to a single customer. For persistent storage, use the variant "Dsv2" VMs and purchase Premium Storage separately."
| | Hadoop Admin | Compute / Storage | Kafka Broker | SNYPR Search | SNYPR Console |
|---|---|---|---|---|---|
| Microsoft Azure Instance Type | E16 v3 | D64 v3 | E16 v3 | D64 v3 | E16 v3 |
| RAM (GB) | 128 | 256 | 128 | 256 | 128 |
| vCPU | 16 | 64 | 16 | 64 | 16 |
| Storage (GB, split into multiple volumes) | 3,000 | 10,000 | 5,000 | 10,000 | 5,000 |
Microsoft provides several alternatives for the storage for the instances used. The storage chosen should provide adequate bandwidth to the volumes used; this is the equivalent of 1000 IOPS per instance to the selected storage type.

In addition to standard Azure instances, the following guidance for deploying Cloudera in Microsoft Azure is recommended. See the link: https://www.cloudera.com/more/news-and-blogs/press-releases/2015-09-24-cloudera-enterprise-data-hub-edition-provides-enterprise-ready-hadoop-for-microsoft-azure.html.

Network

A SNYPR deployment includes network transfer of several types of data into the solution, including user, access, TPI, event log, network map, and other types of data for a typical deployment. Due to the potential sensitivity of some of this data, a virtual private cloud may be required for each deployment.

In addition to the security considerations, the infrastructure will require sufficient network bandwidth. The types of network traffic used by the solution are:
- End user access to the Securonix user interface
- Import of user, access, and TPI data into the master nodes
- Cluster communication and synchronization between the cluster nodes
- Import of event log data into the child nodes

The largest network traffic requirement is the transfer of event log data from the sources to the child nodes for import through the solution's connectors. The network traffic rate from the event log sources to the child nodes can be calculated by multiplying the events per second by the average message size.

For example, 5000 events per second (EPS) to two ingestion nodes in the deployment, with an average message size of 500 bytes, requires 2.5 MB per second, or roughly 25 Mb per second of bandwidth.
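The worked example above can be sketched as a small calculation. Note that 2.5 MB/s is 20 Mbit/s at exactly 8 bits per byte; the guide's "roughly 25 Mb/s" figure is consistent with the common rule of thumb of ~10 bits per byte on the wire (8 data bits plus framing and protocol overhead), which is the assumption made below:

```python
# Event-log ingest bandwidth from the worked example above.
# The factor of 10 bits per byte is a rule-of-thumb allowance for
# framing/protocol overhead (an assumption, not stated in the guide),
# which is why 2.5 MB/s comes out as "roughly 25 Mb/s" rather than 20.
def ingest_bandwidth(eps, avg_msg_bytes):
    mb_per_s = eps * avg_msg_bytes / 1e6   # megabytes per second
    mbit_per_s = mb_per_s * 10             # ~10 bits per byte on the wire
    return mb_per_s, mbit_per_s

mb, mbit = ingest_bandwidth(5000, 500)
print(f"{mb:.1f} MB/s, roughly {mbit:.0f} Mb/s")  # 2.5 MB/s, roughly 25 Mb/s
```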
Network Bandwidth Characteristics by Tier

| Tier | Description | Network Requirements |
|---|---|---|
| Admin | This tier is where end users log in to the user interface (traffic on port 443). It also includes all management services for the cluster and connects to the Compute / Storage / Search tier and the Messaging tier for various services. Incoming connections include web services on port 443, MySQL configuration for Spark jobs, Redis, ZooKeeper, and other Hadoop cluster services. This tier hosts the management services that the agents on the Admin, Compute / Storage, and Messaging tiers communicate with. | 10 Gb Ethernet, MTU = 9000, centralized data center for the Admin, Compute / Storage / Search, and Messaging tiers |
| Compute / Storage / Search | Network traffic to these servers includes Spark, Impala, HDFS, and HBase services. Outbound traffic to services in the Admin tier and the Messaging tier is also required. | 10 Gb Ethernet, MTU = 9000, centralized data center for the Admin, Compute / Storage / Search, and Messaging tiers |
| SNYPR Search | Network traffic to these servers includes SNYPR Search (Solr). Outbound traffic to services in the Kafka Messaging tier is required. | 10 Gb Ethernet, MTU = 9000, centralized data center for the Admin, Compute / Storage / Search, and Messaging tiers |
| Messaging | This tier includes incoming traffic to the Kafka brokers (SSL traffic to port 9093, and ZooKeeper traffic on port 2181). | 10 Gb Ethernet, MTU = 9000, centralized data center for the Admin, Compute / Storage / Search, and Messaging tiers |
| Collection | This server collects logs and provides a syslog server on port 514. The connectors on the server also collect logs with native protocols. The primary network traffic from this tier is to the Admin tier on port 443 for web services and to the Kafka brokers in the Messaging tier on port 9093 (SSL). | Remote data center with outbound network access to the centralized data center |
If 10 gigabit Ethernet is not available and gigabit Ethernet is used in the deployment, the performance of the deployment will be limited by the network.
Network Bandwidth Requirements from RIN Collection Tier to Messaging Tier

The table below displays the network bandwidth requirements from the Remote Ingestion Node (RIN) collection tier to the messaging tier (Kafka brokers).

| Average EPS | 20,000 EPS |
|---|---|
| Number of RINs | 1 RIN |
| Average message size | 600 bytes |
| Transferred to Kafka after compression (%) | 30% |
| Total traffic to Kafka | 36 Mbit/s |
| Traffic per RIN to Kafka (assuming equal distribution) | 36 Mbit/s |
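The table's figures can be reproduced with the sketch below. As in the earlier bandwidth example, a ~10 bits-per-byte wire factor (8 data bits plus overhead, an assumption inferred from the guide's arithmetic) is needed to arrive at 36 Mbit/s:

```python
# Reproduces the RIN-to-Kafka table: 20,000 EPS at 600 bytes/event,
# with only 30% of the raw volume reaching Kafka after compression.
# The ~10 bits-per-byte factor is an assumed overhead allowance.
def kafka_traffic_mbit(eps, avg_msg_bytes, compressed_fraction, rins):
    bytes_per_s = eps * avg_msg_bytes * compressed_fraction
    total_mbit = bytes_per_s * 10 / 1e6
    return total_mbit, total_mbit / rins  # (total, per RIN)

total, per_rin = kafka_traffic_mbit(20_000, 600, 0.30, rins=1)
print(f"total: {total:.0f} Mbit/s, per RIN: {per_rin:.0f} Mbit/s")
# total: 36 Mbit/s, per RIN: 36 Mbit/s
```

With more RINs and the same aggregate EPS, the per-RIN figure scales down while the total traffic to the Kafka brokers stays the same.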
Virtual Infrastructure

Due to the high performance requirements of the solution, physical servers or dedicated cloud instances are recommended. A virtual infrastructure can be considered for small deployments or non-production environments.

Considerations for virtual deployments:

- These VMs can be deployed as needed on the vSphere cluster, without over-subscription of either CPU or memory resources. Configure CPUs along physical socket boundaries. According to VMware, one VM per NUMA node is advisable.
- These nodes house the Cloudera Master services and serve as the gateway/edge device that connects the rest of the customer's network to the Cloudera cluster.
- Care should also be taken to ensure automated movement of VMs is disabled. There should be no DRS or vMotion of VMs allowed in this deployment model. This is critical because VMs are tied to physical disks, and movement of VMs within the cluster will result in data loss.
- Configure Distributed Resource Scheduler (DRS) rules so that there is strong negative affinity between the master node VMs. This ensures that no two master nodes are provisioned or migrated to the same physical vSphere host.
- A key configuration parameter to consider is the MTU size: ensure that the same MTU size is set at the physical switches, guest OS, ESXi VMNIC, and vSwitch layers. This is relevant when enabling jumbo frames (9000 MTU), which is recommended for Hadoop environments.
- Set up virtual disks in "independent persistent" mode for optimal performance. Eager Zeroed Thick virtual disks provide the best performance.
- Each provisioned disk is mapped to one vSphere datastore (which in turn contains one VMDK or virtual disk).
- VMXNET3 NICs should be configured.
- Disable or minimize anonymous paging by setting vm.swappiness=0 or 1.
- VMs on the same physical host are affected by the same hardware failure. To match the reliability of a physical deployment, replication of data across two virtual machines on the same host should be avoided.
Recommendations

This section contains recommendations for the following:

- Hadoop Cluster Tuning
- Network Tuning

Hadoop Cluster Tuning Recommendations

The table below describes the Hadoop tuning parameters for each of the services in the Hadoop cluster that optimize cluster performance for the SNYPR workloads.
Hadoop Cluster Performance

| Service | Scope | Parameter | Recommended Values |
|---|---|---|---|
| YARN | All | YARN container memory | 60 GB / 60 GB / 70 GB |
| YARN | All | Java Heap Size of NodeManager | 850 MB / 1 GB / 850 MB |
| YARN | All | ZooKeeper Client Timeout (zkClientTimeout) | 1 min / 1 min / 1 min |
| HBase | All | Java Heap Size of Thrift Server in Bytes | 1 GB / 1 GB / 1 GB |
| HBase | Cloudera | hbase.rpc.timeout | 15 min / 10 min / 15 min |
| HBase | Cloudera | RegionServer Lease Period | 15 min / 10 min / 15 min |
| HBase | Cloudera | HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml | name: hbase.ipc.warn.response.time, value: 500 |
| HDFS | All | Java Heap Size of DataNode in Bytes | 8 GB / 8 GB / 8 GB |
| HDFS | All | DataNode Balancing Bandwidth | 1 GB optional (10 MB default) |
| HDFS | All | Maximum Number of Transfer Threads | 16000 / 16000 / 16000 |
| Impala | Cloudera | Impala Daemon Memory Limit | 12 GB / 20 GB / 12 GB |
| Spark 2 | All | Java Heap Size of History Server in Bytes | 512 MB / 512 MB / 512 MB |
| Hive | All | Spark Driver Maximum Java Heap Size | 256 MB / 256 MB / 256 MB |
| Hive | All | Spark Executor Memory Overhead | 26 MB / 256 MB / 26 MB |
| Kafka | All | Maximum Message Size (message.max.bytes) | 10 MiB / 10 MB / 10 MiB |
| Kafka | All | Kafka Broker logging level | ERROR / ERROR |
| Kafka | All | ZooKeeper Session Timeout (zookeeper.session.timeout.ms) | 6 s / 6 s / 6 s |
| Kafka | All | Open file limit (maximum file descriptors) | 100000 / 100000 / 100000 |
| Kafka | All | Data Retention Hours (log.retention.hours) | 7 days / 7 days / 7 days |
| HDFS | All | Blocks With Corrupt Replicas Monitoring Thresholds | warning: 0.5, critical: 1 |
| HDFS | All | Replication Factor | 2 / 2 / 2 |
| HDFS | All | Maximal Block Replication | 512 / 512 / 512 |
| Impala | Cloudera | Dump heap when out of memory | disabled / disabled / disabled |
| Spark | Cloudera | Dump heap when out of memory | disabled / disabled / disabled |
| YARN | Cloudera | Dump heap when out of memory | disabled / disabled / disabled |
| ZooKeeper | Cloudera | Dump heap when out of memory | disabled / disabled / disabled |
| ZooKeeper-Kafka | Cloudera | Dump heap when out of memory | disabled / disabled / disabled |
| HBase | Cloudera | Dump heap when out of memory | disabled / disabled / disabled |
| Kafka | All | Minimum Number of Replicas in ISR (min.insync.replicas) | 1 / 1 |
| HBase | All | Java Configuration Options for HBase RegionServer | -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:ParallelGCThreads=20 -XX:ConcGCThreads=15 -XX:+UnlockExperimentalVMOptions -XX:G1MixedGCLiveThresholdPercent=85 -XX:G1HeapWastePercent=2 -XX:InitiatingHeapOccupancyPercent=35 -XX:+PrintReferenceGC -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=20M -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/var/log/hbase/gc.log |
| ZooKeeper | Cloudera | Jute Max Buffer | 90 MB / 90 MB / 90 MB |
| ZooKeeper | All | Java Heap Size of ZooKeeper Server in Bytes | 6 GB / 8 GB / 6 GB |
| ZooKeeper | All | Minimum Session Timeout | 8000 / 8000 / 8000 |
| ZooKeeper | All | Maximum Session Timeout | 90000 / 90000 / 90000 |
| ZooKeeper | All | Canary Connection Timeout | 20 seconds / 20 seconds / 20 seconds |
| ZooKeeper | All | Tick Time (tickTime) | 4000 / 4000 / 4000 |
| ZooKeeper | All | Maximum Client Connections (maxClientCnxns) | 8000 / 8000 / 8000 |
| ZooKeeper-Kafka | All | Java Heap Size of ZooKeeper Server in Bytes | 8 GB / 8 GB / 8 GB |
| ZooKeeper-Kafka | All | Tick Time (tickTime) | 4000 / 2000 / 4000 |
| ZooKeeper-Kafka | Cloudera | Jute Max Buffer | 50 MB / 50 MB / 50 MB |
| ZooKeeper-Kafka | All | Maximum client connections | 6000 / 6000 / 6000 |
| ZooKeeper-Kafka | All | minSessionTimeout | 4000 / 4000 / 4000 |
| ZooKeeper-Kafka | All | maxSessionTimeout | 90000 / 60000 / 90000 |
| YARN | All | yarn.resourcemanager.am.max-retries, yarn.resourcemanager.am.max-attempts | 20 / 20 / 20 |
| Impala | All | Impala daemon safety valve | --enable_partitioned_aggregation=true --enable_partitioned_hash_join=true |
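The ZooKeeper timeout settings in the table above are interrelated: ZooKeeper derives its default session-timeout bounds from tickTime (minSessionTimeout defaults to 2 x tickTime and maxSessionTimeout to 20 x tickTime). The sketch below shows why tickTime = 4000 pairs naturally with a minimum session timeout of 8000, and that the table's maximum of 90000 is an explicit override of the 20x default:

```python
# ZooKeeper's default session-timeout bounds as a function of tickTime:
# min = 2 x tickTime, max = 20 x tickTime (per the ZooKeeper admin guide).
def default_session_bounds(tick_ms):
    return 2 * tick_ms, 20 * tick_ms

min_t, max_t = default_session_bounds(4000)
print(min_t, max_t)  # 8000 80000 -- the table overrides max to 90000
```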
Hadoop Cluster Log Configuration

| Service | Level | Property |
|---|---|---|
| HBase | ERROR | Gateway Logging Threshold |
| HBase | ERROR | HBase REST Server Logging Threshold |
| HDFS | ERROR | DataNode Logging Threshold |
| HDFS | ERROR | Failover Controller Logging Threshold |
| HDFS | ERROR | Gateway Logging Threshold |
| HDFS | ERROR | HttpFS Logging Threshold |
| HDFS | ERROR | JournalNode Logging Threshold |
| HDFS | ERROR | NFS Gateway Logging Threshold |
| HDFS | ERROR | NameNode Block State Change Logging Threshold |
| HDFS | ERROR | NameNode Logging Threshold |
| HDFS | ERROR | SecondaryNameNode Logging Threshold |
| Hive | ERROR | Gateway Logging Threshold |
| Hive | ERROR | Hive Metastore Server Logging Threshold |
| Hive | ERROR | HiveServer2 Logging Threshold |
| Hive | ERROR | WebHCat Server Logging Threshold |
| Impala | ERROR | Impala Catalog Server Logging Threshold |
| Impala | ERROR | Impala Daemon Logging Threshold |
| Impala | ERROR | Impala Llama ApplicationMaster Logging Threshold |
| Impala | ERROR | Impala StateStore Logging Threshold |
| Kafka | ERROR | Gateway Logging Threshold |
| Kafka | ERROR | Kafka Broker Logging Threshold |
| Kafka | ERROR | Kafka MirrorMaker Logging Threshold |
| Key Value Store | ERROR | Lily HBase Indexer Logging Threshold |
| Oozie | ERROR | Oozie Server Logging Threshold |
| Spark | ERROR | Shell Logging Threshold |
| Spark | ERROR | Gateway Logging Threshold |
| YARN | ERROR | History Server Logging Threshold |
| YARN | ERROR | Gateway Logging Threshold |
| YARN | ERROR | JobHistory Server Logging Threshold |
| YARN | ERROR | NodeManager Logging Threshold |
| YARN | ERROR | ResourceManager Logging Threshold |
| ZooKeeper | ERROR | Server Logging Threshold |
| Cloudera Manager | ERROR | Activity Monitor Logging Threshold |
| Cloudera Manager | ERROR | Alert Publisher Logging Threshold |
| Cloudera Manager | ERROR | Event Server Logging Threshold |
| Cloudera Manager | ERROR | Host Monitor Logging Threshold |
| Cloudera Manager | ERROR | Service Monitor Logging Threshold |
Network Tuning Recommendations

The network configuration can have a dramatic performance impact on the environment. The network tuning guidance in this section can be used to optimize the network configuration for the Linux servers in the environment.

Modify Network Kernel Settings

Edit the network tuning parameters in the /etc/sysctl.conf file:
# vi /etc/sysctl.conf
Edit the following values:
# allow testing with buffers up to 128MB
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
# increase Linux autotuning TCP buffer limit to 64MB
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
# recommended default congestion control is htcp
net.ipv4.tcp_congestion_control=htcp
# recommended for hosts with jumbo frames enabled (only relevant for systems with 10GbE interfaces)
net.ipv4.tcp_mtu_probing=1
# recommended for CentOS7
net.core.default_qdisc = fq
In order for the above changes to take effect, reboot the server.
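To confirm the tuning above was applied, the values in the file can be compared against the running kernel (for example, the output of `sysctl -a`). The sketch below is a minimal parser for sysctl.conf-style text; it ignores comment and blank lines and tolerates both `key = value` and `key=value` spacing:

```python
# Minimal sketch: parse sysctl.conf-style text into a dict, suitable
# for comparing the file's intent against `sysctl -a` output.
def parse_sysctl(text):
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip comments and blank lines
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings

conf = """
# allow testing with buffers up to 128MB
net.core.rmem_max = 134217728
net.ipv4.tcp_congestion_control=htcp
"""
print(parse_sysctl(conf)["net.core.rmem_max"])  # 134217728
```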
Increase the Transmit Queue Length

Set the txqueuelen permanently:
vi /etc/rc.local
Add the following (this is the interface where you will receive data):
/sbin/ifconfig em1 txqueuelen 10000
To validate:
# ifconfig em1 | grep txque
ether 90:b1:1c:1f:e6:1b txqueuelen 10000 (Ethernet)
Location Value
/etc/sysctl.conf vm.swappiness = 10
/etc/security/limits.conf hdfs - nofile 32768
/etc/security/limits.conf mapred - nofile 32768
/etc/security/limits.conf hbase - nofile 32768
/etc/security/limits.conf yarn - nofile 32768
/etc/security/limits.conf solr - nofile 32768
/etc/security/limits.conf sqoop2 - nofile 32768
/etc/security/limits.conf spark - nofile 32768
/etc/security/limits.conf hive - nofile 32768
/etc/security/limits.conf impala - nofile 32768
/etc/security/limits.conf hue - nofile 32768
/etc/security/limits.conf kafka - nofile 32768
/etc/security/limits.conf hdfs - nproc 32768
/etc/security/limits.conf mapred - nproc 32768
/etc/security/limits.conf hbase - nproc 32768
/etc/security/limits.conf yarn - nproc 32768
/etc/security/limits.conf solr - nproc 32768
/etc/security/limits.conf sqoop2 - nproc 32768
/etc/security/limits.conf spark - nproc 32768
/etc/security/limits.conf hive - nproc 32768
/etc/security/limits.conf impala - nproc 32768
/etc/security/limits.conf hue - nproc 32768
/etc/security/limits.conf kafka - nproc 32768
/etc/security/limits.d/20-nproc.conf hdfs - nproc 32768
/etc/security/limits.d/20-nproc.conf mapred - nproc 32768
/etc/security/limits.d/20-nproc.conf hbase - nproc 32768
/etc/security/limits.d/20-nproc.conf yarn - nproc 32768
/etc/security/limits.d/20-nproc.conf solr - nproc 32768
/etc/security/limits.d/20-nproc.conf sqoop2 - nproc 32768
/etc/security/limits.d/20-nproc.conf spark - nproc 32768
/etc/security/limits.d/20-nproc.conf hive - nproc 32768
/etc/security/limits.d/20-nproc.conf impala - nproc 32768
/etc/security/limits.d/20-nproc.conf hue - nproc 32768
/etc/security/limits.d/20-nproc.conf kafka - nproc 32768
/sys/kernel/mm/transparent_hugepage/defrag echo never
/sys/kernel/mm/transparent_hugepage/enabled echo never
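The limits.conf entries above follow one pattern: every Hadoop service account gets the same nofile and nproc limits. A small generator, as sketched below, keeps the account list and the limit value in one place instead of maintaining 22 hand-written lines (the user list is taken from the table above; the output format is the standard `user - item value` limits.conf syntax):

```python
# Generate /etc/security/limits.conf entries for the service accounts
# listed in the table above: one nofile and one nproc line per user.
users = ["hdfs", "mapred", "hbase", "yarn", "solr", "sqoop2",
         "spark", "hive", "impala", "hue", "kafka"]

def limits_entries(users, limit=32768):
    lines = []
    for user in users:
        for item in ("nofile", "nproc"):
            lines.append(f"{user} - {item} {limit}")
    return lines

entries = limits_entries(users)
print(len(entries))   # 22
print(entries[0])     # hdfs - nofile 32768
```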
Proposed Configuration Tuning

jetty.conf for SNYPR Search: change the default timeout from 50,000 ms to 180,000 ms.
/etc/sysctl.conf
# --------------------------------------------------------------------
# The following allow the server to handle lots of connection requests
# --------------------------------------------------------------------
# Increase number of incoming connections that can queue up
# before dropping
net.core.somaxconn = 50000
# Handle SYN floods and large numbers of valid HTTPS connections
net.ipv4.tcp_max_syn_backlog = 30000
# Increase the length of the network device input queue
net.core.netdev_max_backlog = 20000
# Increase system file descriptor limit so we will (probably)
# never run out under lots of concurrent requests.
# (Per-process limit is set in /etc/security/limits.conf)
fs.file-max = 100000
# Widen the port range used for outgoing connections
net.ipv4.ip_local_port_range = 10000 65000
# If your servers talk UDP, also up these limits
net.ipv4.udp_rmem_min = 8192
net.ipv4.udp_wmem_min = 8192
# --------------------------------------------------------------------
# The following help the server efficiently pipe large amounts of data
# --------------------------------------------------------------------
# Disable source routing and redirects
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
# Disable packet forwarding.
net.ipv4.ip_forward = 0
net.ipv6.conf.all.forwarding = 0
# Disable TCP slow start on idle connections
net.ipv4.tcp_slow_start_after_idle = 0
# Turn on the tcp_window_scaling
net.ipv4.tcp_window_scaling = 1
# Turn on the tcp_timestamps
net.ipv4.tcp_timestamps = 1
# Turn on the tcp_sack
net.ipv4.tcp_sack = 1
# Change Congestion Control (default: reno)
net.ipv4.tcp_congestion_control=htcp
# Increase Linux autotuning TCP buffer limits
# Set max to 16MB for 1GE and 32M (33554432) or 54M (56623104) for 10GE
# Don't set tcp_mem itself! Let the kernel scale it based on RAM.
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.core.optmem_max = 40960
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216
# --------------------------------------------------------------------
# The following allow the server to handle lots of connection churn
# --------------------------------------------------------------------
# Disconnect dead TCP connections after 1 minute
net.ipv4.tcp_keepalive_time = 60
# Wait a maximum of 5 * 2 = 10 seconds in the TIME_WAIT state after a FIN, to handle
# any remaining packets in the network.
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 10
# How long to keep ESTABLISHED connections in conntrack table
# Should be higher than tcp_keepalive_time + tcp_keepalive_probes * tcp_keepalive_intvl
net.netfilter.nf_conntrack_tcp_timeout_established = 300
net.netfilter.nf_conntrack_generic_timeout = 300
# Allow a high number of timewait sockets
net.ipv4.tcp_max_tw_buckets = 2000000
# Timeout broken connections faster (amount of time to wait for FIN)
net.ipv4.tcp_fin_timeout = 10
# Let the networking stack reuse TIME_WAIT connections when it thinks it's safe to do so
net.ipv4.tcp_tw_reuse = 1
# Interval between keepalive probes (reduced from the default 75 sec to 15 sec)
net.ipv4.tcp_keepalive_intvl = 15
# Number of unanswered probes before the connection is dropped (reduced from 9 to 5)
net.ipv4.tcp_keepalive_probes = 5
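With the keepalive values above, the guidance on nf_conntrack_tcp_timeout_established can be checked directly: a dead peer is detected after tcp_keepalive_time + tcp_keepalive_probes * tcp_keepalive_intvl seconds, which must stay below the 300-second conntrack timeout. A quick sanity check:

```shell
# Dead-peer detection time implied by the keepalive settings above.
keepalive_time=60
keepalive_probes=5
keepalive_intvl=15
detect=$((keepalive_time + keepalive_probes * keepalive_intvl))
echo "dead-peer detection after ${detect}s"   # 60 + 5*15 = 135s, below the 300s conntrack timeout
```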
RIN Syslog Configuration
When the NDB (Data Broker) is used in the design, the parameters below must be set to 0:
net.ipv4.conf.enp94s0f1.rp_filter = 0
The NDB is a one-way device and will not acknowledge packets that the OS may send. For this reason, the kernel will drop packets if the parameter above is set to 1.
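A sketch of making this setting persistent follows; enp94s0f1 is the interface name from the example above and should be replaced with the NIC that actually receives NDB traffic. The conf.all line is included because the kernel applies the higher of the per-interface and all values:

```shell
# Generate the sysctl lines for the NDB-facing interface (interface name is an example).
IFACE=enp94s0f1
printf 'net.ipv4.conf.%s.rp_filter = 0\nnet.ipv4.conf.all.rp_filter = 0\n' "$IFACE"
# Append the output to /etc/sysctl.conf and apply with: sysctl -p
```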
Google Cloud
The table below shows an example configuration for a Google Cloud SNYPR architecture with 10,000 EPS and 30 days of search index storage.
Type                       Quantity  Instance Type   CPU      Memory  Storage
Master Servers             3         N1-Highmem-16   16 vCPU  104 GB  (Quantity 1) /root 500 GB (SSD); (Quantity 1) /zookeeper 250 GB (SSD)
SNYPR Console Servers      1         N1-standard-16  16 vCPU  60 GB   (Quantity 1) /root 500 GB (SSD); (Quantity 8) /data 500 GB (standard)
Compute / Storage Servers  6         N1-standard-64  64 vCPU  240 GB  (Quantity 1) /root 128 GB (SSD); (Quantity 5) /search[1-10] 1000 GB (standard)
Search / Storage Servers   1         N1-Highmem-64   64 vCPU  416 GB  (Quantity 1) /root 128 GB (SSD); (Quantity 10) /search[1-10] 5500 GB (SSD)
Kafka Ingestion Servers    3         N1-standard-8   8 vCPU   30 GB   (Quantity 1) /root 128 GB (SSD); (Quantity 1) /zookeeper 256 GB (SSD); (Quantity 3) /data 1024 GB (standard)
Remote Ingestion Nodes     1         N1-standard-8   8 vCPU   30 GB   (Quantity 1) /root 128 GB (SSD); (Quantity 3) /data 2000 GB (standard)
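As an illustrative cross-check of the sizing above (not part of the original guide), the raw storage implied by the compute and search tiers of the table can be totaled:

```shell
# Raw storage implied by the table (illustrative arithmetic only).
compute_gb=$((6 * 5 * 1000))    # 6 compute/storage nodes x 5 data volumes x 1000 GB
search_gb=$((1 * 10 * 5500))    # 1 search/storage node x 10 volumes x 5500 GB
echo "compute tier: ${compute_gb} GB, search tier: ${search_gb} GB"
```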
Deployment Architecture
The deployment of SNYPR includes a Hadoop cluster as well as servers for the user interface and for event ingestion. When SNYPR is deployed in a cloud environment, there are two primary deployment alternatives. The first is a Securonix Cloud deployment where all servers in the cluster are hosted in the cloud.
The second is a Securonix Cloud / On-Premise deployment where the console nodes are deployed in the cloud and the ingestion nodes are deployed on-premise. See the diagram (Figure 9) for optional on-premise ingestion nodes.
Spark Jobs Configuration for Kerberized Kafka
When running the SNYPR Spark applications in a Kerberized cluster, add the parameters below to the Spark job scripts so that the jobs can connect to secure Kafka.
--driver-java-options "-Djava.security.auth.login.config=/opt/keytabs/jaas.conf -Djute.maxbuffer=50000000 -Dspark.driver.userClassPathFirst=true -Dspark.executor.userClassPathFirst=true" \
--conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/opt/keytabs/jaas.conf -XX:+UseConcMarkSweepGC -Dlog4j.configuration=./conf/log4j.properties -Djute.maxbuffer=50000000 -Xss1G" \
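The jaas.conf referenced by these options typically contains a KafkaClient login section using the standard Krb5LoginModule. The keytab path, principal, and realm below are placeholders, not values from this guide:

```
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  keyTab="/opt/keytabs/snypr.keytab"
  principal="snypr@EXAMPLE.COM";
};
```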
Disaster Recovery Alternatives
SNYPR can be deployed to meet several disaster recovery objectives. Because of the size of the solution and the costs associated with disaster recovery, several DR strategies are available. Since SNYPR can be deployed with an existing Hadoop environment, the disaster recovery strategy must align with the DR strategy for the Hadoop infrastructure being used for SNYPR. The alternatives in this document assume a dedicated Hadoop infrastructure for SNYPR and describe the disaster recovery considerations for the entire solution, including Hadoop. If an existing Hadoop environment is used, the same considerations are relevant, but the actual configuration of Hadoop disaster recovery is assumed to be part of the existing Hadoop infrastructure.
Alternatives
The SNYPR disaster recovery alternatives include:
1. Advanced DR with Full Infrastructure - identical infrastructure with data replication from the primary site to the DR site, with the ability to continue processing in-flight messages from the Kafka brokers at the DR site.
2. Full DR with Full Infrastructure - identical infrastructure with select data replication from the primary site to the DR site, with the ability to rebuild search indexes after a DR event from the historical enriched event data, and the ability to process new activity events at the DR site.
3. Limited DR with Limited Infrastructure - limited infrastructure with violation, summary, and configuration data only, and the ability to process new activity events.
Considerations
Disaster recovery must be considered for each service included in the solution. The primary considerations for each of the node types are described as follows:
- SNYPR Console Nodes: The SNYPR Console Nodes include the SNYPR user interface and the SNYPR configuration database.
- SNYPR Search Servers: Dedicated search nodes that include a local event indexer and multiple search instances for distributed searches. These servers are edge nodes in a Hadoop cluster that read data from Kafka and index the data to local storage on the search servers. The SNYPR Search servers are optimized for maximum search performance and density on a physical server. Apache Solr is used as the underlying search server.
- SNYPR-EYE Server: A SNYPR monitoring and alerting server used for configuration and operational health monitoring of all SNYPR services, including all of the servers in the Hadoop cluster, the processes on the SNYPR Console, the SNYPR Spark Streaming applications running in the YARN cluster (including the performance of data ingestion for all resources), and the performance and health of the SNYPR Search processes. The SNYPR-EYE solution installs and manages SNYPR-EYE agents on the servers in the environment for local monitoring.
- SNYPR Remote Ingestion Nodes: Include the ingestion servers with the connectors, the incoming activity log files, and the Kafka brokers with the in-flight messages.
- Hadoop Master Nodes: These nodes include the Hadoop administration services such as Cloudera Manager and ZooKeeper when Hadoop is deployed as part of the solution. The considerations for disaster recovery at this tier include file system replication with rsync, or a backup and restore strategy, as well as MySQL database replication for the SNYPR configuration database and the Hive metastore.
- Compute / Storage Nodes: The SNYPR Compute / Storage Nodes include HDFS and all the files stored by the system in HDFS for Hive / Impala table access, Solr indexes, and HBase tables. The considerations for disaster recovery at this tier include replication (using distcp) or backup and recovery of the HDFS data, HBase replication (using the WALs), and replication of the Solr collection schema data.
- Kafka Brokers: The considerations for disaster recovery at this tier include Kafka MirrorMaker for the in-flight Kafka messages.
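For the Kafka tier, a MirrorMaker replication setup might look like the following sketch; the broker hostnames, group id, and topic pattern are assumptions for illustration, not values from this guide:

```
# consumer.properties (reads from the primary cluster)
bootstrap.servers=primary-kafka1:9092,primary-kafka2:9092
group.id=snypr-dr-mirror
auto.offset.reset=earliest

# producer.properties (writes to the DR cluster)
bootstrap.servers=dr-kafka1:9092,dr-kafka2:9092

# Run MirrorMaker, mirroring all topics from primary to DR:
# kafka-mirror-maker.sh --consumer.config consumer.properties \
#     --producer.config producer.properties --whitelist '.*'
```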
The exact disaster recovery strategy implemented should align with the business continuity requirements for each deployment. The table below shows the alternatives for disaster recovery configuration and their impact on business continuity.
                            Advanced DR with      Full DR with            Limited DR with
                            Full Infrastructure   Full Infrastructure     Limited Infrastructure
DR Target                   1 day                 1 week                  1 week (violations, behavior, and data only)
Configuration Data          X                     X                       X
Violation Data              X                     X                       X
Case Management             X                     X                       X
Behavior Summaries          X                     X                       X
Historical Enriched Events  X                     X                       X
Search Indexes              X                     rebuild search indexes  X
                                                  after DR initiation
Kafka In-Flight Messages    X                     X                       X
Unprocessed Event Files     X                     X                       X
The availability of the data that SNYPR needs at the disaster site, as well as network failover and end-user access to the disaster recovery infrastructure, must also be considered. The typical services that are needed at the disaster site to continue processing are shown in the diagram below. This includes user and access data, as well as event logs that are ingested by the solution. For details, refer to Cloudera Backup and Disaster Recovery at: https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cm_bdr_about.html.
Recommended