Security with NSX
Greater security in the digital business age
With Scott South, VCP6-DCV, VCP-Cloud, VCP6-NV, VCI, CCNP-Data Center, CCSI
August 16th and 17th 2017
© 2017 VMware Inc. All rights reserved.
“By 2020, 60% of digital businesses
will suffer major service failures
due to the inability of IT security
teams to manage digital risk.”
Gartner, “Special Report: Cybersecurity at the Speed of
Digital Business,” May 2016.
Business demands
Control costs and reduce complexity
Deliver applications faster to improve time to market
Decrease business risk in an environment of advanced persistent threats
So what’s getting in the way?
Increasing cost and complexity
Lack of internal network security controls
Proliferation of devices without consistent security
Q: How do you deliver applications that meet the demands of the business while ensuring greater levels of security?
A: Decouple applications from the infrastructure through the Software-Defined Data Center and Network Virtualization.
What is Software-Defined Data Center (SDDC)?
Hardware
Software
Data center virtualization layer
Pooled compute, network, and storage capacity
Vendor independent, best price/performance/service
Simplified configuration and management
Intelligence in software
Operational model of VM for data center
Automated provisioning and configuration
Compute
Storage
Network
Compute
Storage
Network
NSX value proposition
“Network platform”
Virtual networks
Network,
storage,
compute
Virtualization layer
Network virtualization is at the core of the Software-Defined Data Center approach
Network and security services now in the hypervisor
The problem
Cost and complexity
Lack of internal security
Mobile proliferation
The solution: network virtualization
Simpler security management
Ubiquitous enforcement
Granular threat containment
Security with NSX
Micro-segmentation
Our security realities
When threats breach the perimeter, it’s hard to stop lateral spread
INTERNET
DATA CENTER
DATA CENTER PERIMETER
Low priority systems are
often targeted first.
Attackers can move freely
around the data center.
Attackers then gather and
exfiltrate the valuable data.
MICRO-SEGMENTATION
Our security realities
Security is a top priority, but investments are not aligned for success
20% of investment focused on internal controls, resulting in lack of visibility and control
80% of investment focused on preventing perimeter intrusion
DATA CENTER PERIMETER
INTERNET
Typically, 80% of network traffic comes from inside the data center
Only 20% comes from outside the perimeter (from the Internet)
YET MOST INVESTMENT STRATEGIES FOCUS ON THE REVERSE
DATA CENTER
We need a new strategy for security
TODAY
What if you could…
Enforce security at the most granular level of the data center?
DATA CENTER
Every VM can have:
Individual security policies
Individual firewalls
INTERNET
DATA CENTER PERIMETER
What if you could…
Define this level of security repeatedly and predictably
DB
Web
App
Granular threat containment
Logical policy grouping
Simplified security policy
Better security, simplified policy
Define a policy using workload characteristics, not IPs and ports
An NSX security policy can be based on things like:
• Operating system
• Machine name
• Services
• Application tier
• Regulatory requirements
• Security posture
Creating and managing policies becomes a whole lot easier
DATA CENTER PERIMETER
PCI Scope
Next-generation security
“There just wasn’t a great way to insert
security in order to address east-west traffic
between VMs, nor have the security tied to the
applications as they moved around
dynamically.”
JOHN SPIEGEL
NETWORK MANAGER
COLUMBIA SPORTSWEAR
Micro-segmentation
Operationalizing Micro-segmentation
Where do I start with my journey towards micro-segmentation?
• Trace all east/west network traffic
• Map the location of applications and identify which ports they use
• Identify new, robust security policies
NSX Interest NSX Deployed & Realizing Value
Pre-Deployment & Planning
Deployment Day-2 Operations
NSX Deployment Lifecycle
How do I know if my NSX configuration is robust?
• Adhere to industry-standard best practices
• Ensure NSX configuration is ideal prior to supporting production VMs
• Trace reported issues directly back to misconfigurations
How do I efficiently manage NSX, as I’m introducing new technology?
• Manage traffic across physical and virtual infrastructure
• Increase visibility into security group policies & VM membership
• Track security posture over time
Micro-segmentation in Action
Firewall Placement Evolution
[Diagram: three stages of firewall placement on ESXi hosts]
• Legacy: physical FW connected to hypervisors
• Yesterday’s Virtual Infrastructure: FW software installed directly into VM
• NSX’s Virtual Infrastructure: FW implemented in the hypervisor (vSwitch), distributed FW model
NSX Distributed Firewall Overview (1)
Hypervisor Kernel Embedded Firewall:
• Is built directly in to the Hypervisor
• “Line Rate” Performance (15Gbps+ per host)
• No VM can circumvent the firewall (egress and ingress packets are always processed by the firewall).
– In case of extreme load (CPU saturated or memory completely full), the Distributed Firewall behaves as fail-close: no packet will pass the FW.
Distributed Virtual Firewall:
• No “Choke Point”
• Scale Out
• Enforcement closest to VM at vNIC level
• SpoofGuard functionality
• Protects against IP and MAC spoofing
NSX Distributed Firewall Overview (2)
• NSX Distributed Firewall provides security filtering functions on every host, inside the hypervisor and at kernel level
– DFW is an East-West stateful L2-L4 firewall
– Distributed enforcement of policy rules
• DFW offers centralized configuration
– Using the vSphere Web Client
NSX Distributed Firewall Overview (3)
• VM Name/Attribute Based and vCenter container based
• Policy independent of VM location:
– DFW can enforce security rules between two VMs even if they are on the same L2 segment (VXLAN or VLAN)
– Policy rules always follow the VM, even if vMotion occurs
DFW Policy Enforcement
• DFW enforces rules at the vNIC layer before encapsulation (or after decapsulation)
– Independent of transport network (VXLAN or VLAN)
– Requires a Distributed Virtual Switch (DVS)
[Diagram: two vSphere hosts (VTEP IPs 10.20.10.10 and 10.20.10.11) on a vSphere Distributed Switch carrying VXLAN 5001, with VM1 (MAC1, IP1), VM2 (MAC2, IP2), and VM3 (MAC3, IP3) attached]
Policy Rules:
Source | Destination | Service | Action
VM1 | VM2, VM3 | TCP port 123 | Allow
VM1 | VM2, VM3 | any | Block
Note: there is no relationship between DVS ACL/Security capabilities and DFW
NSX Distributed Firewall Components - Communication
[Diagram: communication among the SSH client, REST API client, vSphere Client, vCenter Server, NSX Manager, and ESXi host (VXLAN, DR, DFW, and Security modules). Ports labeled in the diagram: TCP/22, TCP/443, TCP/5671]
Distributed Data Path
[Diagram: source and destination VMs on hosts that each run a vSwitch with its own VSFW instance, connected to the external network]
Policy Rule Objects
• The DFW supports security rules, called policy rules, at layer 2 and at layers 3 and 4
– Layer 2 rules are created under the Ethernet tab
– Layer 3 and layer 4 rules are created under the General tab
• General rules are enforced after Ethernet policy rules are applied
L2 rules control traffic at the data link layer. Use L2 rules to filter specific source or destination MAC addresses or L2 protocols. Some examples of L2 protocols are ARP, RARP, LLDP, …
L3/L4 rules control traffic at the network and transport layers. Use L3/L4 rules to filter specific source or destination IP addresses or L4 protocols. Some examples of L4 protocols are SSH (TCP port 22), HTTP (TCP port 80), …
DFW Sections
• DFW Sections segment policy rules for easier manageability and better performance
• Sections do not impact the overall security policies
Engineering SECTION
HR SECTION
Default SECTION
To merge two sections
together, click the symbol…
…and select the section to
merge with.
Policy Rule Object Fields
Rule fields: Rule ID, Rule Name, Source, Destination, Service, Action, Applied To

Source or Destination Field: Description
• IPv4 or IPv6: IPv4 or IPv6 address; can be a host address, subnet, or range of addresses
• Datacenter: vCenter Datacenter attribute. Rule will apply to all VMs/vNICs existing in the Datacenter
• Cluster: vCenter Cluster attribute. Rule will apply to all VMs/vNICs existing in the Cluster
• Network: vCenter Network (vSwitch) attribute. Rule will apply to all VMs/vNICs connected to this network
• Virtual App: vCenter vApp attribute. Rule will apply to all VMs/vNICs that are part of the vApp
• Resource Pool: vCenter Resource Pool attribute. Rule will apply to all VMs/vNICs that are part of the Resource Pool
• Virtual Machine: VM name attribute
• vNIC: VM vNIC attribute
• Logical Switch: NSX logical switch attribute (VNI, or VXLAN Network Identifier). Rule will apply to all VMs/vNICs connected to this logical switch
• Security Group: NSX security group attribute (defined through the Service Composer tab). Rule will apply to all VMs/vNICs that are part of the Security Group
• IP sets: List of IPv4 or IPv6 addresses

Service Field: Description
• Protocol: TCP, UDP, Oracle_TNS, FTP, SUN_RPC_TCP, SUN_RPC_UDP, NBNS_BROADCAST, NBDG_BROADCAST, ICMP, IGMP, IPCOMP, IPV6ROUTE, IPV6FRAG, IPV6ICMP, IPV6NONXT, IPV6OPTS, RSVP, GRE, ESP, AH, L2TP, SCTP, IPv4, ARP, X25, LLC, FR_ARP, BPQ, DEC, DNA_DL, DNA_RC, DNA_RT, LAT, DIAG, CUST, SCA, TEB, RAW_FR, RARP, AARP, ATALK, IEEE_802_1Q, IPX, NETBEUI, IPv6, PPP, ATMMPOA, PPP_DISC, PPP_SES, ATMFATE, LOOP
Note: the user can define a custom protocol as well by clicking on New -> Service
• Ports (Destination L4 ports): Unique port or range of ports
• Advanced options: Source Ports (Source L4 ports): Unique port or range of ports

Action Field: Description
• Block: Block the traffic
• Allow: Enable the traffic
• Log: Log the traffic information
• Do not log: Don’t log the traffic information

Applied To Field: Description
• DataCenter: VMware Datacenter attribute
• Cluster: VMware Cluster attribute
• Distributed Virtual Port Group: Port Group of a dVS
• Network: Network attribute
• Virtual Machine: VM attribute
• vNIC: vNIC attribute
• Logical Switch: VXLAN logical switch
DFW Centralized Management - Example
VC containers
- Clusters
- datacenters
- Portgroups
- VXLAN
VM containers
- VM names
- VM tags
- VM attributes
Identity
- User identity
- Groups
IPv6 compliant
- IPv6 address
- IPv6 sets
Services
- Protocol
- Ports
- Custom
IPv6 Services
Choice of PEP
- Clusters
- VXLAN
- vNICs
Applied To Example
[Diagram: vCenter Server, NSX Manager, and NSX controllers managing two compute racks (VXLAN, DR, DFW, and Security modules on a vDS) that host VM1, VM2, VM3, and VM4]
Source Destination Service Action Applied To
VM1 VM2, VM3 TCP port 123 Allow VM1,VM2,VM3
VM1 VM4 UDP port 321 Allow VM1, VM4
Source Destination Service Action
VM1 VM2, VM3 TCP port 123 Allow
VM1 VM4 UDP port 321 Allow
Source Destination Service Action
VM1 VM2, VM3 TCP port 123 Allow
Source Destination Service Action
VM1 VM2, VM3 TCP port 123 Allow
Source Destination Service Action
VM1 VM4 UDP port 321 Allow
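
The effect of the Applied To field can be sketched in a few lines of Python. This is an illustration only, not NSX code: the host names and the VM placement are hypothetical, and the sketch simply assumes that a host receives a rule when at least one of its VMs appears in that rule's Applied To set.

```python
# Rules copied from the Applied To example above.
RULES = [
    {"id": 1, "source": {"VM1"}, "destination": {"VM2", "VM3"},
     "service": "TCP/123", "action": "Allow",
     "applied_to": {"VM1", "VM2", "VM3"}},
    {"id": 2, "source": {"VM1"}, "destination": {"VM4"},
     "service": "UDP/321", "action": "Allow",
     "applied_to": {"VM1", "VM4"}},
]

# Hypothetical placement of VMs on hosts (not taken from the slides).
HOSTS = {
    "host-a": {"VM1", "VM2"},
    "host-b": {"VM3"},
    "host-c": {"VM4"},
}


def rules_for_host(rules, vms_on_host):
    """Return the IDs of rules whose Applied To set touches a VM on the host."""
    return [rule["id"] for rule in rules if rule["applied_to"] & vms_on_host]


for host, vms in HOSTS.items():
    print(host, rules_for_host(RULES, vms))
```

With this placement, only the host running VM1 needs both rules; the others carry a single rule each, which is the reduction in per-host rule count that the slide illustrates.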
Logical Switch Rule Based Example
[Diagram: Router Instance 1 connects WEB logical-switch-1 (VXLAN 5001, Web VMs) and APP logical-switch-2 (VXLAN 5002, App VMs), which carry VM1 through VM4]
Source Dest Action
VM1/VM2 VM3/VM4 Block
VM1 VM2 Allow
VM2 VM1 Block
(assuming
default rule is
set to block)
VM3 VM4
VM4 VM3
NSX Service Composer
• Service Composer is an orchestration tool used to apply security policies to security groups.
• Security groups define the assets that you wish to protect:
– Security groups are built with static and dynamic membership.
– Membership criteria are flexible and configurable.
• Security policies list the security services and rules that will be assigned to the members of the security group, such as:
– Firewall rules
– Network introspection services
– Guest introspection services
Security Group
• A security group is a construct that allows dynamic grouping of objects
– Based on inclusion and exclusion of objects defined under vCenter
• Done internally under NSX manager
– Network and Security -> Service Composer -> Security Groups tab
• Dynamic membership criteria can be defined to include objects into the security group:
– match ANY or ALL of the criteria below:
• Computer OS name, Computer Name, VM Name, Security Tag, Entity
Security Group Components
• A security group has three parts
1. Expression
• Defines the dynamic membership criteria of vCenter objects
• Configured in the Define dynamic membership tab of the New Security Group wizard
2. Inclusions
• Static membership selection of vCenter objects
• Configured in the Select objects to include tab of the New Security Group wizard
3. Exclusions
• Static membership rejection of vCenter objects
• Configured in the Select objects to exclude tab of the New Security Group wizard
• Objects identified in the inclusions part are added to the objects identified in the expression
– Any objects identified in the exclusions part are removed from the security group
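
The way the three parts combine can be written as set arithmetic: (expression matches plus inclusions) minus exclusions. The Python sketch below only illustrates that logic; the inventory, VM names, and the `resolve_members` helper are hypothetical and are not part of any NSX API.

```python
def resolve_members(inventory, expression, inclusions, exclusions):
    """Compute security group membership.

    inventory:  dict mapping object name -> attribute dict
    expression: predicate over an attribute dict (dynamic membership)
    inclusions: names added statically
    exclusions: names removed, regardless of how they were added
    """
    dynamic = {name for name, attrs in inventory.items() if expression(attrs)}
    return (dynamic | set(inclusions)) - set(exclusions)


# Hypothetical inventory for illustration.
inventory = {
    "QA-web-01": {"os": "Windows"},
    "QA-db-01": {"os": "Linux"},
    "prod-web-01": {"os": "Windows"},
    "legacy-01": {"os": "Windows"},
}

windows_group = resolve_members(
    inventory,
    expression=lambda attrs: attrs["os"] == "Windows",
    inclusions={"QA-db-01"},
    exclusions={"legacy-01"},
)
print(sorted(windows_group))
```

Because exclusions are applied last, an excluded object stays out of the group even when it matches the dynamic expression.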
Security Group Creation Wizard (1)
1. Enter the name and description of the security group
2. Define dynamic membership
Security Group Creation Wizard (2)
3. Select objects to include
Security Group Creation Wizard (3)
4. Select objects to exclude
5. Review settings and click Finish
Security Group Use Cases
• Security groups are used to assist in the aggregation of vCenter Server objects to help enforce business policies:
– Create a group of all virtual machines running an unsupported version of Windows, such as Windows 95
– Create a group of all port groups that are dedicated for the Finance business unit
– Create a group of all the virtual machines that start with the name QA
Security Group Rule Based Example
• SECURITY-GROUP-WINDOWS: Includes VMs that have a Computer OS name of Windows
• SECURITY-GROUP-LINUX: Includes VMs that have a Computer OS name of Linux
[Diagram: VM1 through VM4, each labeled Windows or Linux, behind DFW Instance 1]
Source Dest Action
VM1 VM2 Block
VM1 VM4 Block
VM3 VM2 Block
VM3 VM4 Block
VM1 VM3 Allow
VM3 VM1 Allow
VM2 VM4 Allow
VM4 VM2 Allow
Security Policies
• Security policies define a collection of network and security services to be applied to a security group.
• A security policy can contain services:
– Guest introspection services:
• Antivirus
• Vulnerability management
• Data security and data loss prevention solutions
– Network introspection services:
• Intrusion detection/prevention systems
– Firewall rules
Security Groups
Members (VM, vNIC) and
Context (user identity, security
posture)
Security Policies
Services (firewall, antivirus)
Profiles (labels representing
specific policies)
APPLY
Security Policy Configuration
• To create a new security policy:
– Start the security policy creation wizard.
– Enter a name and description for the security policy.
– Select guest introspection services.
– Select firewall rules.
– Select network introspection services.
Achieving Micro-Segmentation in the Real World
Prepare Security Fabric
• Prepare hosts for security
• Optional: Deploy security vendor management consoles for advanced services
• Optional: Deploy security vendor appliances.
Monitor Flows
• Brownfield: Leverage existing knowledge from perimeter firewalls
• Use NSX built-in Application Rule Manager (ARM)
Determine Policy Model
• Analyze application profiles using ARM
• Determine a policy model based on the patterns.
Apply Policy Model
• Determine approach: Firewall Rule Table or Service Composer Policy Model
• Based on the Policy Model – Create grouping models
• Write Security Policy
Most Commonly Deployed Firewall Policy Model
• Emergency Rules: Used for Quarantine and/or Allow Rules
• Infrastructure Rules: Global Rules – AD, DNS, NTP, DHCP, Backup, Mgmt Servers
• Environment Rules: Rules between Zones – Prod vs Dev, PCI vs Non PCI, Inter BU rules
• Inter-Application Rules: Rules between Applications
• Intra-Application Rules: Rules between the app tiers or between micro-services
• Default Rule = Deny: Whitelisting / Zero Trust
Using Application Rule Manager (ARM)
Application Rule Manager
On-Boarding new applications
Application Rule Manager
▪ Simplifies the process of creating Security Groups and Whitelisting Firewall Rules for existing applications
▪ How does it work?
Leverages Flow Monitoring to monitor all incoming/outgoing flows for select VNICs
After flow monitoring, flow tables are generated and user can filter irrelevant flows
User can replace individual VMs with SG, and replace ports with services
Firewall rules are generated and can be published to the Firewall Rule table after audit
▪ Scaling
Up to 5 simultaneous sessions
Up to 20 VNICs per session
Can co-exist with Flow Monitoring
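
The "replace VMs with security groups, replace ports with services" step can be pictured as collapsing many observed flows into a few whitelist rules. The Python sketch below is a conceptual illustration only, not how ARM is implemented; the flow records, the group and service mappings, and the `flows_to_rules` helper are all hypothetical.

```python
def flows_to_rules(flows, vm_to_group, port_to_service):
    """Collapse (src, dst, protocol, port) flows into unique Allow rules.

    Endpoints are replaced by their security group when one is known,
    and (protocol, port) pairs by a named service when one is known.
    """
    rules = []
    seen = set()
    for src, dst, proto, port in flows:
        key = (
            vm_to_group.get(src, src),
            vm_to_group.get(dst, dst),
            port_to_service.get((proto, port), f"{proto}/{port}"),
        )
        if key not in seen:  # keep the first occurrence, drop duplicates
            seen.add(key)
            rules.append({"source": key[0], "destination": key[1],
                          "service": key[2], "action": "Allow"})
    return rules


# Hypothetical flows captured during a monitoring session.
flows = [
    ("web-01", "app-01", "TCP", 8080),
    ("web-02", "app-01", "TCP", 8080),
    ("web-01", "app-02", "TCP", 8080),
    ("10.0.0.5", "web-01", "TCP", 443),
]
vm_to_group = {"web-01": "SG_WEB", "web-02": "SG_WEB",
               "app-01": "SG_APP", "app-02": "SG_APP"}
port_to_service = {("TCP", 443): "HTTPS"}

rules = flows_to_rules(flows, vm_to_group, port_to_service)
for rule in rules:
    print(rule)
```

Four raw flows become two candidate rules, which would then be reviewed before being published to the firewall rule table.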
Application Rule Manager
• Application Flow Data monitoring
➢ 5-tuple, Session Initiation Direction
• Analysis engine
➢ Intelligent duplicate/multicast/broadcast flow purging
➢ ALG
➢ Create/Update Security Groups, IP Sets and Services
• Manual Intervention
➢ Manual Flow Optimization
➢ Create/Update Security Groups, IP Sets and Services
Application Flow Data Monitoring
Analysis Engine
Manual Intervention
Security Validation and Audit
Firewall Rule Creation
• Security Validation and Audit
➢ Correlation between flows and recommended rules
➢ Recommendations for overlap/inconsistencies with existing rule set
• Firewall Rule Creation
➢ Publish new rules into a new section
➢ Choose position and section parameters
Application Rule Manager
Session List
• Start New Session
• Retrieve Existing Session
Start/Stop flow
collection
Selected vNICs
Application Rule Engine
Collection can be
stopped by user
Flow List
• Raw Flows
• Displayed Live
Start Analysis
Application Rule Manager
Direction
• INTRA: Between VMs selected
• IN: Into one of the VMs selected
• OUT: From one of the VMs selected
Source/Destination
• VMs and IP addresses
• Can be replaced with existing Security Group / IP Set / Any
• Can be added to a New Security Group / IP Set
Service
• One port or one or more (known) services
• Can be added to a New Service / Service Group
• Can be replaced with Any
Flows
• Irrelevant flows can be hidden (removed from analysis)
• E.g. IPv6
Application Rule Manager
Indicates Modified
Flows
Application Rule Manager
Selected Flows
• Selected flows will be used to create a single rule
• Select groups of flows that belong in one rule and repeat
Application Rule Manager
Published Section
Policy vs Rule Table Approach
Which one to select?
Determining Policy Creation Approach
Application Based
• Policies are application centric.
• Data Center environments are dynamic.
• Advanced security services are used.
Infrastructure Based
• Policies are SDDC infrastructure centric.
• Uses logical constructs.
• Data Center environments are static.
Network Based
• Policies are network centric.
• Uses only IP addresses and MAC addresses.
Determining Policy Creation Approach
Application Based
• Service Composer based Policy Creation
Infrastructure Based
• Either Firewall Rule Table or Service Composer based Policy Creation.
Network Based
• Firewall Rule Table based Policy Creation.
Firewall Rule Creation
[Diagram: a Distributed Logical Router connects WEB_LS (logical-switch-1, VXLAN 5001) and APP_LS (logical-switch-2, VXLAN 5002); VM1 through VM4 use the addresses 1.1.1.1, 1.1.1.2, 2.2.2.1, and 2.2.2.2]
Firewall Objectives:
1. Allow only HTTPS traffic to talk to the pool of Web-Servers.
2. Block traffic between Web-Servers.
3. Block traffic between Application Servers.
4. Allow Web-Servers to talk to Application servers on HTTP 8080
Step 1: Security Group Creation
Security Groups | Selection Criteria | Resultant VMs
SG_WEB | Static: Logical Switch – WEB_LS | VM1, VM2
SG_APP | Static: Logical Switch – APP_LS | VM3, VM4
Firewall Rule Creation – Using Firewall Rule Table
Using Firewall Rule Table:
Source | Destination | Service | Action
SG_APP | SG_APP | Any | Block
SG_WEB | SG_WEB | Any | Block
Any | SG_WEB | HTTPS | Allow
Any | SG_WEB | Any | Block
SG_WEB | SG_APP | Tomcat (HTTP 8080) | Allow
SG_WEB | SG_APP | Any | Block
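
To see how this table meets the four objectives, it helps to walk a few flows through it. The Python sketch below assumes top-down, first-match evaluation with a default of Block, and uses the group membership from the security group table (SG_WEB = VM1, VM2; SG_APP = VM3, VM4). The service label "HTTP-8080" and the `evaluate` helper are illustrative names, not NSX identifiers.

```python
from dataclasses import dataclass

# Group membership from the security group table.
GROUPS = {"SG_WEB": {"VM1", "VM2"}, "SG_APP": {"VM3", "VM4"}}


@dataclass(frozen=True)
class Rule:
    source: str       # security group name or "Any"
    destination: str  # security group name or "Any"
    service: str      # service label or "Any"
    action: str       # "Allow" or "Block"


# Same order as the firewall rule table.
RULES = [
    Rule("SG_APP", "SG_APP", "Any", "Block"),
    Rule("SG_WEB", "SG_WEB", "Any", "Block"),
    Rule("Any", "SG_WEB", "HTTPS", "Allow"),
    Rule("Any", "SG_WEB", "Any", "Block"),
    Rule("SG_WEB", "SG_APP", "HTTP-8080", "Allow"),
    Rule("SG_WEB", "SG_APP", "Any", "Block"),
]


def in_group(name, endpoint):
    """True if the rule field is "Any" or the endpoint is in the named group."""
    return name == "Any" or endpoint in GROUPS.get(name, set())


def evaluate(src, dst, service, default="Block"):
    """Return the action of the first matching rule, scanning top to bottom."""
    for rule in RULES:
        if (in_group(rule.source, src)
                and in_group(rule.destination, dst)
                and rule.service in ("Any", service)):
            return rule.action
    return default


print(evaluate("client", "VM1", "HTTPS"))   # external client to a web server
print(evaluate("VM1", "VM2", "HTTPS"))      # web server to web server
print(evaluate("VM1", "VM3", "HTTP-8080"))  # web server to app server
print(evaluate("VM3", "VM4", "HTTP-8080"))  # app server to app server
```

Rule order matters under this assumption: the two intra-group Block rules sit above the HTTPS Allow rule so that web servers cannot reach each other even over HTTPS.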
Firewall Rule Creation – Using Security Policies
Using Security Policies:
Source | Destination | Service | Action
PSG | PSG | Any | Block

Source | Destination | Service | Action
any | PSG | HTTP | Allow

Source | Destination | Service | Action
PSG | SG_APP | TOMCAT (HTTP/8080) | Allow
PSG | SG_APP | Any | Block

Security policies: SP0, SP1, SP2
vRealize Network Insight
Network Insight Architecture & Deployment
VMware Delivers: Intelligent Operations for Software-Defined Datacenter
[Diagram: vRealize Operations (1), vRealize Network Insight, vRealize Log Insight (2), and vRealize Business for Cloud (1), covering Application; Compute, Storage, Network & Security, and Hybrid Cloud; and the Physical/Virtual/Cloud Environment]
(1) vRealize Suite components
(2) Included with vRealize Suite and ships with NSX
Network Insight Architecture Components
Network Insight Platform VM
• UI access: HTTPS 443 (Google Chrome only)
Network Insight Proxy VM (Collector)
• Connects to the Platform VM: HTTPS 443
Additional port access needed for appliance debugging:
• Port 22 (SSH)
• Port 5480 (HTTP)
Data Sources (read-only credentials)
• VMware vCenter, VMware vSphere (ESX), VMware NSX Manager, VMware NSX Controller, VMware NSX Edges
• Cisco UCS / rackmount
• Various network vendor switches / routers
• Ports shown toward the data sources: HTTPS 443, SSH 22, SNMP 161, IPFIX 2055
Network Insight Architecture – Platform VM
• Analytics Grid
– Store data from Proxy VMs
– Processes in real-time, batch
– VXLANs graphs, paths, MTU events…
• Storage & Search Engine
– Stores configurations, changes, performance stats
– Indexes configurations, events
– Supports data retention policy
• UI, REST API, Search Engine
– UI on top of REST API (private)
– Search Engine
• SDDC models awareness
• Combines configuration, flows, performance data
– Flow analytics components (high performance)
• Access flows at large scale
• Analyze flows, rules, micro-segmentation graphs
Objects Time series store (POSTGRES)
Metrics Time series store (HBASE)
Configuration Time series Index (ELASTIC SEARCH)
UI, REST API
Network Insight Search Engine
Flows-Rules Analytics Engine
Flows Store
Computation Grid (YARN, KAFKA )
SDDC models/Graphs/Event processors
UI, REST API,
Network
Insight
Search Engine
Data Stores
Search Engine
Analytics Grid
Arkin Confidential 67
Network Insight Architecture – Proxy VM (aka Collector)
• Proxy
– Collects data from Data sources using appropriate protocol(s)
– Receives IPFix (Netflow) data from ESX on port UDP:2055
– Securely pairs with Platform before uploading data or getting instructions
– Reduces/batches data significantly before upload
• Collector process
– Only way to upload data, receive instructions from Platform
• Platform not available? Store in Offline Message Store
– Has specific adapters for Data sources, get data messages from them
– Receives data from Flow Processor
– Adapters may use Postgres to keep some state
• Offline Message Store
– Stores latest data here temporarily if Platform is unavailable
– Restricted to % of disk space
– Good for few hrs to days of data depending on size of environment
• Flow Processor (high performance)
– Processes raw flow records files (nfcapd), generates 5-tuples, 4-tuples and aggregate statistics
– Applies algorithms and heuristics to stitch records, deduplicate, and avoid negative scenarios (port scan, ..)
– Supports up to 600K unique 4-tuples at any point of time. Good for all deployments so far.
• Flows of servers with millions of clients on internet are collapsed to smaller number of 4-tuples
• Postgres, mainly keep state for functioning of components on Proxy VM, not for datacenter data
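
The collapse from 5-tuples to 4-tuples mentioned above amounts to dropping the source port and aggregating. The Python sketch below illustrates only that idea, using the 4-tuple definition given later in the deck (source IP, destination IP, destination port, protocol); the record layout, the sample data, and the `collapse_to_4_tuples` helper are hypothetical and do not describe the actual Flow Processor.

```python
from collections import defaultdict


def collapse_to_4_tuples(records):
    """Aggregate 5-tuple flow records into 4-tuples by dropping the source port.

    Each record is (src_ip, src_port, dst_ip, dst_port, protocol, byte_count).
    """
    totals = defaultdict(lambda: {"flows": 0, "bytes": 0})
    for src_ip, _src_port, dst_ip, dst_port, proto, nbytes in records:
        key = (src_ip, dst_ip, dst_port, proto)
        totals[key]["flows"] += 1
        totals[key]["bytes"] += nbytes
    return dict(totals)


# Hypothetical records: the same client opens two connections to one service.
records = [
    ("10.0.0.5", 50001, "10.0.1.10", 443, "TCP", 1200),
    ("10.0.0.5", 50002, "10.0.1.10", 443, "TCP", 800),
    ("10.0.0.6", 40000, "10.0.1.10", 443, "TCP", 500),
]

summary = collapse_to_4_tuples(records)
for key, stats in summary.items():
    print(key, stats)
```

Since ephemeral source ports change with every connection, dropping them keeps the number of unique keys bounded by the number of client/service pairs rather than the number of connections.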
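[Diagram: inside the Proxy VM, the Collector gathers configuration, metrics, and events from data sources such as vCenter, NSX, UCS, and Arista (using REST, SOAP, CLI, SSH, SNMP, …); nfcapd feeds the Flows Processor; Postgres and the Offline Msg Store sit alongside the Collector]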
Platform to Proxy VM Relationship
• Proxy talks to Platform over HTTPS:443
– Uploads data to Platform
– Long polls for instructions (add/stop data source, …)
– Platform is not available?
• Latest data is stored locally on Proxy per disk space limit
• Stored data is uploaded to Platform once available
• Platform does NOT connect to Proxy (one-way)
• Platform buffers data from Proxy, then processes it
– Latest data is stored in buffer, per disk space limit
– Platform is too busy to take data?
• Proxy slows down the upload
• UI shows the Proxy status, its last heartbeat (Notification)
– Proxy fails?
• No failover of its Data sources to a new proxy
• UI status will reflect its last heart beat
Network Insight Proxy VM (Collector)
Network Insight Platform VM
Proxy VM to Data sources Relationship
• Proxy owns the Data source credentials (encrypted)
• One Data source can feed exactly one Proxy
– E.g. Can’t split a vCenter across multiple proxies
• Multiple Data sources can feed one Proxy VM
• Proxy fails?
– No failover of Data sources to a new proxy
– Delete/add Data source to new Proxy in UI
• Proxy cannot connect to the Data source?
– Data source status (UI) will reflect the error
• Proxy too busy?
– Slows the data fetch & upload
– Data source status (UI) will reflect the lag in some cases
[Diagram: one vCenter feeding one Proxy VM (OK); one vCenter split across two Proxy VMs (NOT POSSIBLE); two vCenters feeding one Proxy VM (OK)]
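Brick Size | VMs | Metrics | Flow Records per Second | VM Type | Cores | RAM | Disk (Thin Provisioned) | IOPS
LARGE | 6K | 5M | 100k/s | Platform | 12 | 60GB | 750GB | 250
LARGE | 6K | 5M | 100k/s | Proxy | 6 | 12GB | 150GB | 75
MEDIUM | 3K | 3M | 50k/s | Platform | 8 | 32GB | 750GB | 150
MEDIUM | 3K | 3M | 50k/s | Proxy | 4 | 10GB | 150GB | 50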
• If not collecting IPFIX (flow) data, the number of supported VMs increases
– Large Brick – 10K VMs
– Medium Brick – 6K VMs
• VM limits are for Platform and Proxy VMs
• General guidelines – results may vary
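
The sizing guidance above can be turned into a small lookup. The Python sketch below encodes only the VM limits listed for the medium and large bricks, with and without flow collection; the `pick_brick` helper and its return convention are hypothetical, and, as the slide says, real results may vary.

```python
# VM limits per brick size, taken from the sizing guidance above.
BRICKS = {
    "MEDIUM": {"with_flows": 3000, "without_flows": 6000},
    "LARGE": {"with_flows": 6000, "without_flows": 10000},
}


def pick_brick(vm_count, collect_flows=True):
    """Return the smallest brick that fits, or None if a single brick is too small."""
    key = "with_flows" if collect_flows else "without_flows"
    for name in ("MEDIUM", "LARGE"):  # smallest first
        if vm_count <= BRICKS[name][key]:
            return name
    return None


print(pick_brick(2500))                       # fits a medium brick
print(pick_brick(5000))                       # needs a large brick
print(pick_brick(5000, collect_flows=False))  # medium is enough without flows
print(pick_brick(8000))                       # exceeds a single brick
```

A `None` result signals that the environment is beyond a single-instance deployment.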
Network Insight Deployment – Single Instance (aka Brick model)
• Data retention – 45 days by default
– Cannot be changed via the UI
– File a support ticket to change this setting
• When the system hits 90% disk full in the Platform VM
– Data will no longer be accepted by the Platform VM
– UI continues to work correctly and display “old data”
– Data sources in UI will show processing lag (halted) error messages
Network Insight Deployment – Disk Space
Network Insight Deployment – Proxy VM Flexibility
• Multiple Proxy VMs can connect to a Platform VM
• Why would you do this?
– Security zones
– Geo distributed collection sites
– Scale Out – More to come on that!
[Diagram: a single Proxy VM or multiple Proxy VMs connected to one Platform VM; in the geo-distributed case, Proxy VMs at Remote Sites A, B, and C connect to the Platform VM in the main data center]
Network Insight Deployment – Scale Out
[Diagram: a single Platform VM (UI, REST API, search engine, flows-rules analytics engine, flows store, analytics grid, and the objects, metrics, and configuration time series stores) next to a Platform VM cluster of 2N+1 nodes]
• How do we scale out to larger environments (>6K VMs)?
• Answer: Platform Cluster
– Large Brick only
– Requires 3 Platform VMs minimum
• Turns on HA Automatically (single node failure)
• Linear scale-out
– For ex: 18K VMs == 3 large bricks
– Limits tested
• 50K VMs real datacenter
• 100K VMs simulated datacenter
• This is not OOTB today
– Scale Out is possible but requires manual configuration not exposed in the product UI
– Contact CMBU for opportunities requiring scale out deployments ([email protected])
– Will be OOTB in Q4
Network Insight Deployment – Scale Out
• Platform VMs are placed into a Platform Cluster
– Virtual IPs “frame” the cluster
• Customer access UI using either:
– Direct IP of any Platform VM
– Virtual IP
• Proxy VM talks to Platform cluster using Virtual IP
• Contact CMBU PM for opportunities requiring scale out deployments ([email protected])
[Diagram: Platform VMs (UI, API) at Site A and Site B, fronted by VIPs that the Proxy VMs connect to]
Example architecture scenarios
• Customer Environment - Base Case
– 3000 VMs
– Cisco Nexus gear
– Flows enabled
• Architecture
– 1 Platform/Proxy
– OOTB deployment
[Diagram: a vCenter (3K VMs) and Cisco Nexus feed one Proxy VM, which uploads to one Platform VM]
Example architecture scenarios
• Customer Environment
– 3K VMs across 3 Geo locations
– Each location has its own vCenter containing 1K VMs
– Flows enabled
• Architecture
– 1 Platform/3 Proxy
– OOTB deployment
[Diagram: three vCenters (1K VMs each), each with its own Proxy VM, all uploading to one Platform VM]
Example architecture scenarios
• Customer Environment
– 10K VMs across 2 vCenters
– 4K in one vCenter
– 6K in the second vCenter
– Flows enabled
• Architecture
– 3 Platform Cluster/2 Proxy
– Not OOTB solution (Q4’16)
[Diagram: two vCenters (4K VMs and 6K VMs), each with its own Proxy VM, uploading to a Platform Cluster** of three Platform VMs]
** Remember: Contact CMBU PM for opportunities requiring scale out deployments
Example architecture scenarios
• Customer Environment
– Single vCenter with 15K VMs
– Flows enabled
• Architecture
– Remember…
• Platform cluster can handle 15K (and more) VMs
• A single vCenter can only connect to one Proxy VM
• A Proxy VM can handle 6K VMs (10K VMs without flows)
– Therefore…
• Deployment >6K VMs (10K VMs without flows) in one vCenter requires custom solution
• Call for help
– Contact [email protected]
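
The four scenarios follow from a handful of limits stated in the deck, which the Python sketch below encodes: a large brick handles 6K VMs with flows (10K without), a cluster needs at least three Platform VMs, scale-out is linear, and a vCenter cannot be split across proxies. The `plan_deployment` helper is hypothetical; it assumes one Proxy VM per vCenter and applies the without-flows limit to the cluster as well, which are simplifications rather than product rules.

```python
import math

MIN_CLUSTER_SIZE = 3  # a Platform cluster requires at least 3 Platform VMs


def plan_deployment(vcenters, flows_enabled=True):
    """Sketch a deployment for a dict of {vcenter_name: vm_count}.

    Simplifying assumption: one Proxy VM per vCenter.
    """
    # Large-brick limit: 6K VMs with flows, 10K VMs without.
    limit = 6000 if flows_enabled else 10000
    total = sum(vcenters.values())
    if total <= limit:
        platform_vms = 1
    else:
        platform_vms = max(MIN_CLUSTER_SIZE, math.ceil(total / limit))
    # A vCenter cannot be split across proxies, so one that exceeds the
    # per-proxy limit needs a custom solution.
    oversized = sorted(n for n, count in vcenters.items() if count > limit)
    return {
        "platform_vms": platform_vms,
        "proxy_vms": len(vcenters),
        "needs_custom_solution": oversized,
    }


print(plan_deployment({"vc1": 3000}))
print(plan_deployment({"site-a": 1000, "site-b": 1000, "site-c": 1000}))
print(plan_deployment({"vc1": 4000, "vc2": 6000}))
print(plan_deployment({"vc1": 15000}))
```

The last call reproduces the final scenario: the cluster itself can absorb 15K VMs, but the single vCenter exceeds what one Proxy VM can handle, so it is flagged.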
vCenter
15K VMs
Other Architecture Notes
• Physical topologies
– Switch vendors: Supports top five leading vendors including Cisco, Arista, Brocade, Juniper, Dell
– Firewall/security vendors: Palo Alto Networks with others on the roadmap
– Customers need to configure vendor/device specific Data sources in product UI
• Netflow (aka IPFIX)
– Works with vSphere 5.5 and 6.0 (ESXi 5.5 update 2+ and 6.0 update 1b+)
• Please ensure customer is using the revision levels listed
– No importing or exporting to or from other Netflow tools is offered (on roadmap)
– Limits
• A Proxy VM supports up to 600K unique 4-tuple flows (src IP, dst IP, dst port, protocol). Good for all deployments so far
• Micro-segmentation Analytics Dashboard (in UI) supports analyzing up to 600K 4-tuple flows. Good for all deployments so far
– Customer can narrow the scope using selections on dashboard UI
Other Architecture Notes
• Internet Access
– Network Insight appliance health telemetry data is sent to a VMware managed collection server
• No flow information or customer confidential information is collected
– Log samples are sent to a VMware managed collection server for troubleshooting. Can opt out
– Can be upgraded, patched over internet from VMware managed server
– Remote troubleshooting
• Customer can turn ON the support tunnel feature to enable remote troubleshooting by VMware operations team. OFF by default
– No internet access or firewall? The product will function fine; however, patches, upgrades, and troubleshooting log data will not function
Network Insight Data Sources
Network Insight Architecture Components: Data Sources
[Same architecture diagram as under "Network Insight Architecture Components" earlier in the deck]
Adding Data sources
• Go to Settings
• Choose Data sources
Adding Data sources
• Each Data source type has its own requirements and validation
Data Sources: Virtual (vSphere/vCenter)
• Use cases enabled
- Microsegmentation Analytics – YES
- Overlay/Underlay – NO
- NSX operations – NO
• This data source is all you need for a pre-assessment!
Data Sources: Virtual (vSphere/vCenter)
Vendor & System: VMware vSphere
Version & Model:
• vSphere 5.5 and above
• VDS 5.5 and above
• For IPFIX data:
– VMware ESXi 5.5 Update 2 (Build 2068190) and above
– VMware ESXi 6.0 Update 1b (Build 3380124) and above
Connection Details and Data Collected:
• Connect to vCenter APIs over HTTPS (pull data via API)
– Authenticated
– Read-only permission required
– 10 minute poll interval
– Data collected: Inventory, config and metrics (e.g. VM, Host, Datastore, Folder, Cluster, Standard Switch, Distributed Virtual Switch, Port Groups, DVPG, Resource Pool, VMKNIC, PNIC, vNIC, etc.)
• For IPFIX: hosts push IPFIX data to the Network Insight Proxy on UDP port 2055
– Enabling IPFIX via the Network Insight UI requires Manage permission on specific entities in vCenter: Virtual Distributed Switch, Distributed Port Group
– Data collected: Flow records
• Fetch host-level information via vCLI
– Data collected: VIBs, modules, NSX Controller-Host connection status, VMKNIC metrics (e.g. Rx, Tx, Dropped)
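The ESXi build requirements for IPFIX lend themselves to a quick pre-check. The following is a minimal sketch, assuming host version strings and build numbers have already been gathered by some other means; the function name and the handling of unlisted releases are assumptions, and only the two minimum builds come from the slide.

```python
# Minimum ESXi builds for IPFIX, as listed on the slide.
MIN_IPFIX_BUILD = {
    "5.5": 2068190,  # ESXi 5.5 Update 2
    "6.0": 3380124,  # ESXi 6.0 Update 1b
}


def supports_ipfix(version: str, build: int) -> bool:
    """Return True if the host meets the listed minimum build for its release.

    Releases that are not listed on the slide return False here, which
    should be read as "not confirmed" rather than "unsupported".
    """
    major_minor = ".".join(version.split(".")[:2])
    minimum = MIN_IPFIX_BUILD.get(major_minor)
    return minimum is not None and build >= minimum


# Hypothetical hosts for illustration only.
print(supports_ipfix("5.5.0", 2068190))  # True
print(supports_ipfix("6.0.0", 3029758))  # False
```

Build numbers are compared only within the same release, since a later 5.5 build can carry a higher number than an earlier 6.0 build.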
Data Sources: Virtual (NSX)
• Requires vCenter Data source added first!
• Use cases enabled
- Microsegmentation Analytics – YES
- Overlay/Underlay – Partial (you will get the virtual path only; no underlay components are shown until physical data sources are added)
- NSX operations – YES
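The relationship between data sources and enabled use cases can be sketched as a small lookup. This is an illustration only: the source labels ("vcenter", "nsx", "physical") are hypothetical, the "YES" for overlay/underlay once physical sources are added is inferred from the note above, and the vCenter-plus-physical case without NSX is not covered by the slides.

```python
from typing import Dict, Set


def enabled_use_cases(sources: Set[str]) -> Dict[str, str]:
    """Map the set of added data sources to the use cases listed on the slides."""
    if "nsx" in sources and "vcenter" not in sources:
        raise ValueError("NSX requires the vCenter data source to be added first")

    has_vcenter = "vcenter" in sources
    has_nsx = "nsx" in sources
    has_physical = "physical" in sources

    if has_nsx and has_physical:
        overlay_underlay = "YES"      # inferred: underlay appears once physical sources exist
    elif has_nsx:
        overlay_underlay = "Partial"  # virtual path only
    elif has_physical:
        overlay_underlay = "unknown"  # combination not described on the slides
    else:
        overlay_underlay = "NO"

    return {
        "Microsegmentation Analytics": "YES" if has_vcenter else "NO",
        "Overlay/Underlay": overlay_underlay,
        "NSX operations": "YES" if has_nsx else "NO",
    }


print(enabled_use_cases({"vcenter", "nsx"}))
```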
Data Sources: Virtual (NSX)
Vendor & System: VMware NSX
Version & Model:
• NSX 6.0, 6.1, 6.2
• Does not support:
– NSX Universal Constructs
– NSX IPFIX Netflow
Connection Details and Data Collected:
• VMware NSX Manager
– APIs over HTTPS
– Central CLI
– Polling at 10 min interval
– Auto back off if API response time crosses a threshold
– Requires auditor role user
– Data collected: Controllers, Edges, Logical Switches, Logical Routers, Security Group, IPSet, Transport Zones, Translation, Firewall Rules, Redirect Rules, Service Groups, Exclusions, Security Tags, NSX System Events, etc.
• VMware NSX Edge
– Central CLI (6.2)
– Directly over SSH (pre-6.2)
– NSX Edge only supports admin user account
– Data collected: IP Routes
• VMware NSX Controller over SSH
– NSX Controller only supports admin user account
– Data collected: More info on Logical Router, Logical Switch
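The slide mentions that polling backs off automatically when the NSX Manager API response time crosses a threshold, but it does not say how. The sketch below shows one plausible scheme (doubling the interval up to a cap, and returning to the base interval once responses are fast again). The threshold, factor, and cap are assumptions for illustration, not the product's actual values; only the 10 minute base interval comes from the slide.

```python
BASE_INTERVAL_S = 600.0  # 10 min polling interval, as stated on the slide


def next_poll_interval(
    current_s: float,
    response_time_s: float,
    threshold_s: float = 5.0,   # assumed threshold
    factor: float = 2.0,        # assumed back-off multiplier
    max_s: float = 3600.0,      # assumed upper bound
) -> float:
    """Return the delay before the next poll.

    Slow responses stretch the interval (capped at max_s); a response
    under the threshold resets polling to the base interval.
    """
    if response_time_s > threshold_s:
        return min(current_s * factor, max_s)
    return BASE_INTERVAL_S


interval = BASE_INTERVAL_S
for observed in (1.0, 8.0, 9.0, 2.0):  # hypothetical response times in seconds
    interval = next_poll_interval(interval, observed)
    print(interval)
```

The intent of any such scheme is the same: a struggling NSX Manager gets polled less often rather than more.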
Data Sources: Physical Gear (Network & Compute)
Vendor & System: Cisco Nexus
• Version & Model: N5K, N7K, N9K, VSM N1K
• Connection Details: Connect to devices over SSH (to execute CLI commands) and SNMP. 5 min poll interval
• Data Collected:
– SSH: Device Info, Ports, Interfaces, VRFs, Routes, CDP, GLBP/VRRP/HSRP, MAC, etc.
– SNMP: Interface Metrics
Vendor & System: Cisco UCS (Chassis, Fabric Interconnect (FI), FEX)
• Version & Model: Series B Blade Servers, Series C Rack Servers
• Connection Details: Connect to UCS Manager
– HTTPS for APIs
– SSH for UCS Fabric Interconnects
– SNMP service on FIs
– 5 min poll interval
• Data Collected:
– HTTPS: Chassis, Blade, FEX, Fabric Interconnect, Path, Ports, Blade NIC, etc.
– SSH: FI CDP & Device Info
– SNMP: Interface Metrics
Vendor & System: Cisco Catalyst Switches
• Version & Model: 3000, 3750, 4500, 6000, 6500
• Connection Details: Connect to devices over SSH (to execute CLI commands) and SNMP. 5 min poll interval
• Data Collected:
– SSH: Device Info, Ports, Virtual Switch Domain, Interfaces, VRFs, Routes, CDP, HSRP, MAC, etc.
– SNMP: Interface Metrics
Data Sources: Physical Gear (Network)
Vendor & System: Dell
• Version & Model: FORCE10 MXL 10, FORCE10 S6K
• Connection Details: Connects to devices over SSH (execute CLI commands) and SNMP. 5 min poll interval
• Data Collected:
– SSH: Device Info, Ports, Interfaces, VRFs, Routes, LLDP, VRRP, MAC, etc.
– SNMP: Interface Metrics
Vendor & System: Arista
• Version & Model: 7050TX, 7250QX
• Connection Details: Connects to devices over SSH (execute CLI commands) and SNMP. 5 min poll interval
• Data Collected:
– SSH: Device Info, Ports, Interfaces, VRFs, Routes, LLDP, VRRP, MAC, etc.
– SNMP: Interface Metrics
Vendor & System: Brocade
• Version & Model: VDX 6740, VDX 6940
• Connection Details: Connects to devices over SSH (execute CLI commands) and SNMP. 5 min poll interval
• Data Collected:
– SSH: Device Info, Ports, Interfaces, VRFs, Routes, LLDP, MAC, etc.
– SNMP: Interface Metrics
Data Sources: Physical Gear (Network)
Vendor & System: Juniper
• Version & Model: EX3300
• Connection Details: Connects to devices over SSH (execute NETCONF commands) and SNMP. 5 min poll interval
• Data Collected:
– NETCONF: Device Info, Ports, Interfaces, VRFs, Routes, LLDP, MAC, etc.
– SNMP: Interface Metrics
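When planning firewall rules between the Proxy VM and the physical gear, the connection details above can be condensed into a lookup. This is a minimal sketch: the dictionary keys and the `ports_to_open` helper are hypothetical, while the protocols per vendor and the port numbers (HTTPS 443, SSH 22, SNMP 161) are taken from the data source slides and the architecture diagram.

```python
from typing import Dict, Iterable, List, Tuple

# Port numbers as labelled on the architecture diagram.
PORTS: Dict[str, int] = {"HTTPS": 443, "SSH": 22, "SNMP": 161}

# Protocols per physical data source, as listed on the slides.
PHYSICAL_SOURCES: Dict[str, Tuple[str, ...]] = {
    "Cisco Nexus": ("SSH", "SNMP"),
    "Cisco UCS": ("HTTPS", "SSH", "SNMP"),
    "Cisco Catalyst": ("SSH", "SNMP"),
    "Dell": ("SSH", "SNMP"),
    "Arista": ("SSH", "SNMP"),
    "Brocade": ("SSH", "SNMP"),
    "Juniper": ("SSH", "SNMP"),  # NETCONF commands are executed over SSH
}


def ports_to_open(vendors: Iterable[str]) -> List[Tuple[str, int]]:
    """Return the sorted (protocol, port) pairs needed for the given vendors.

    Raises KeyError for a vendor that is not in the table above.
    """
    protocols = set()
    for vendor in vendors:
        protocols.update(PHYSICAL_SOURCES[vendor])
    return sorted((name, PORTS[name]) for name in protocols)


print(ports_to_open(["Arista", "Cisco UCS"]))
# [('HTTPS', 443), ('SNMP', 161), ('SSH', 22)]
```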
Performance Impact of Data Collection
• Netflow (aka IPFIX)
– Most resource intensive data source.
– VMware states impact on the ESXi host is “less than ½ of 1 percent”
• Performance impact of enabling SSH/CLI/SNMP collection from physical compute and network gear
– These are industry standard ways of collecting data
Intelligent Operations for Software-Defined Data Center
[Diagram: management products spanning the Physical / Virtual / Cloud Environment]
• Application
• vRealize Business for Cloud (1)
• vRealize Operations (1)
• vRealize Network Insight
• vRealize Log Insight (2)
• Domains: Compute, Storage, Network & Security, Hybrid Cloud
(1) vRealize Suite components
(2) Included with vRealize Suite and ships with NSX
Physical and Virtual Visibility
Visibility into Overlay and Underlay
vRealize Network Insight includes advanced analytics that can collect and display
configuration data from all the components involved in the overlay and underlay of the
network. Data is collected in real time. vRealize Network Insight presents this in a smart user
interface, which makes problem determination and visibility of the firewall and network
configurations very easy.
Lab 2: Physical and Virtual Visibility (45 minutes)
1. Log in to http://hol.vmware.com
2. Enroll in Lab 1729-SDC-1
3. Perform the lab steps in pages 44 through 86
CONFIDENTIAL
Advanced NSX Management and Operations
Operational Guidance and Troubleshooting
Lab 3: Advanced NSX Management and Operations (45 minutes)
1. Log in to http://hol.vmware.com
2. Enroll in Lab 1729-SDC-1
3. Perform the lab steps in pages 87 through 102
NSX operations maturity model
• People
– Beginning state: Siloed, specialized
– Ending state: Blended, cross-functional
• Processes
– Beginning state: Slow, error prone, inconsistent
– Ending state: Speed, agility, standardization
• Tooling
– Beginning state: Domain-specific, hardware focused
– Ending state: Converged, correlated