OpenDaylight YANG Data Store: A High-Performance Data Store for SDN and IoT Applications
Jan Medved, with contributions from Robert Varga, John Burns, Kun Chen, Branislav Janosik
OpenDaylight is an SDN Controller, Right?
SAL/Core
Protocol Plugin
NetconfClient
Protocol Plugin
...
Application NetconfServer
Northbound RESTCONF ... Application
Protocol Plugins/Adapters
Controller Core
Controller Apps/Services REST
...
... OSS/BSS, External Apps
Network Devices
Protocol Plugin
ODL Architecture: an Application Development Platform
Model-Driven SAL (MD-SAL)
NetconfClient
Protocol Plugin
... NetconfServer RESTCONF Application Application
REST
DataStore
RPCs
Notifications Conceptual Data Tree - Operational
“Kernel”
Apps/Services
Data Change Notifications
YANG Model
OSS, BSS, External Apps Network Devices
Conceptual Data Tree - Config
YANG is ….
• A NETCONF modeling language – think SMI for NETCONF
• Models semantics and data organization – syntax falls out of semantics
• Able to model config data, state data, RPCs, and notifications
• Based on SMIng syntax – text-based
  – Email, patch, and RFC friendly
• Also used as an Interface Description Language in OpenDaylight
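To make these bullets concrete, a small hypothetical YANG module (all names invented for illustration) showing config data, state data, an RPC, and a notification might look like this:

```yang
module example-interfaces {
  namespace "urn:example:interfaces";
  prefix exif;

  // Config data: a list of interfaces keyed by name
  container interfaces {
    list interface {
      key "name";
      leaf name { type string; }
      leaf enabled { type boolean; default true; }
    }
  }

  // State (operational) data
  container interfaces-state {
    config false;
    leaf total-packets { type uint64; }
  }

  // RPC and notification
  rpc restart-interface {
    input { leaf name { type string; } }
  }
  notification link-down {
    leaf name { type string; }
  }
}
```

The same statements that define the data layout also fix its semantics, which is what lets tooling derive APIs and wire encodings from the model.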
YANG Concepts
leafs
Config/Oper data
types
RPCs Notifications
StandardModels
ProprietaryModels
containers
YANG ....
• Directly maps to XML or JSON content on the wire
• Extensible – add new content to existing data models without changing the original model; add new statements to the YANG language itself
  – Vendor extensions and future proofing
• Tools in OpenDaylight generate Java code from YANG models – at compile time and at runtime
• See tools at www.yang-central.org
YANG Data Store Challenge: Conceptual Data Tree
YANG defines a tree address space:
• Single inheritance for containers and leafs
[Diagram: root node, interior nodes, leaf nodes; children of a node stored in a HashMap (<100 children) or a TrieMap (ctrie) for larger fan-out]
The OpenDaylight Data Store…
• Implements the equivalent of a W3C DOM Document tree – no parent axis maintained -> efficient copy-on-write snapshots
• MVCC-based atomic updates in a single-writer, multiple-readers environment:
  1. Modifications prepared concurrently
  2. Single update thread: requested mods are applied and made visible to subsequent snapshots
  3. Efficient physical replication
• Enforcement of structural integrity according to YANG schemas
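The copy-on-write idea can be sketched in plain Java (an illustration of the technique, not the actual ODL implementation): because nodes carry no parent pointer, a write copies only the root-to-leaf path, and the old snapshot keeps working while sharing all unmodified subtrees.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch: an immutable tree node with no parent axis. "Modifying"
// a leaf only requires copying the nodes on the path from the root to that
// leaf; everything else is structurally shared between snapshots.
final class Node {
    final String value;
    final Map<String, Node> children;

    Node(String value, Map<String, Node> children) {
        this.value = value;
        this.children = Map.copyOf(children);
    }

    // Returns a NEW node with `child` written under `name`; the receiver
    // (part of the old snapshot) is left untouched.
    Node withChild(String name, Node child) {
        Map<String, Node> copy = new HashMap<>(children);
        copy.put(name, child);
        return new Node(value, copy);
    }
}

public class CowTreeDemo {
    public static void main(String[] args) {
        Node leaf = new Node("eth0", Map.of());
        Node root = new Node("root",
            Map.of("interfaces", new Node("interfaces", Map.of("if1", leaf))));

        // Take a snapshot (just keep the reference), then write a new leaf.
        Node snapshot = root;
        Node newIfaces = root.children.get("interfaces")
            .withChild("if2", new Node("eth1", Map.of()));
        Node newRoot = root.withChild("interfaces", newIfaces);

        // Old snapshot is unchanged; only the root-to-leaf path was copied.
        System.out.println(snapshot.children.get("interfaces").children.size()); // 1
        System.out.println(newRoot.children.get("interfaces").children.size());  // 2
    }
}
```

Readers never block: they simply hold on to whatever root they saw, which is why no parent axis can be maintained.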
Data Store Operation: Step 1 - Transaction Created
Initial snapshot when transaction created; modification tree (populated as mods in the transaction are performed)
Root node: Logical operation = None
Data Store Operation: Step 2 - Creating a New Node (Write)
Initial snapshot when transaction created Modification tree (populated as mods in the transaction are performed)
Root node Logical operation = Touch
Creating this node in the tree Logical operation = Write
Original node pointers
Data Store Operation: Step 3 - Transaction Seal (Ready)
Initial snapshot when transaction created Modification tree (populated as mods in the transaction are performed)
Creating this node in the tree Logical operation = Write
Original node pointers
Seal transaction:
• Check that touched nodes are not empty
• Check data written into write nodes
Steps 2 and 3 can run concurrently in multiple threads
Data Store Operation: Step 4a - Transaction Prepare (Walk Down)
Initial snapshot when transaction created Modification tree (populated as mods in the transaction are performed)
Creating this node in the tree
Prepare transaction:
• Check Subtree Version and Node Version
• Exception if conflict detected
Steps 4 and 5 must be executed in one and the same thread
Current state of the tree
Data Store Operation: Step 4b - Transaction Prepare (Walk Up)
Initial snapshot when transaction created Modification tree (populated as mods in the transaction are performed)
Creating this node in the tree: Logical operation = Write; Modification Type = Write
Prepare transaction:
• Create a copy of the current tree (only copy modified nodes)
• Retain old data in nodes (both new and old data available in current state)
• Exception if data does not conform to schema
Steps 4 and 5 must be executed in one and the same thread
Current state of the tree
Data Tree Candidate
Data Store Operation: Step 5 - Commit
Initial snapshot when transaction created:
• If commit successful, only used in transactions that were created on top of the just-committed transaction
• Will be garbage collected if not needed anymore
Modification tree (populated as mods in the transaction are performed)
Creating this node in the tree: Logical operation = Write; Modification Type = Write
Commit transaction:
• Atomic move of the root node from data tree candidate to current state
Steps 4 and 5 must be executed in one and the same thread
Current state of the tree -> before transaction
Data Tree Candidate
Root
Step 6: Notify Listeners
Modification tree (populated as mods in the transaction are performed)
Logical operation = Write; Modification Type = Write
Commit transaction:
• Atomic move of the root node from data tree candidate to current state
Current state of the tree before transaction (only visible from this transaction)
Current state of the tree – after transaction (visible for all subsequent transactions)
Listeners
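Steps 4 and 5 (prepare and commit, executed in one thread) boil down to a version check followed by an atomic swap of the root pointer. A simplified sketch of that idea in self-contained Java (not the actual ODL code):

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch: MVCC commit in a single-writer, multiple-reader setting.
// Readers take free snapshots by dereferencing `current`; the single
// commit thread validates the transaction's base version against the
// current root and then swaps the root atomically.
final class VersionedRoot {
    final long version;
    final Object data;
    VersionedRoot(long version, Object data) { this.version = version; this.data = data; }
}

public class MvccCommitDemo {
    static final AtomicReference<VersionedRoot> current =
        new AtomicReference<>(new VersionedRoot(0, "initial"));

    // Step 4 (prepare): conflict check against the version the transaction
    // was built on. Step 5 (commit): atomic swap of the root pointer.
    static boolean commit(long baseVersion, Object newData) {
        VersionedRoot cur = current.get();
        if (cur.version != baseVersion) {
            return false; // conflict: someone committed since our snapshot
        }
        // Only one writer thread exists, so no CAS loop is needed; the swap
        // itself is what makes the new tree visible to later snapshots.
        current.set(new VersionedRoot(cur.version + 1, newData));
        return true;
    }

    public static void main(String[] args) {
        long snapVersion = current.get().version;       // take snapshot
        System.out.println(commit(snapVersion, "tx1")); // true
        System.out.println(commit(snapVersion, "tx2")); // false: stale base
        System.out.println(current.get().data);         // tx1
    }
}
```

The losing transaction gets a conflict (an exception in the real store) rather than silently overwriting the winner's data.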
How to Improve Transaction Performance
• State compression – multiple fine-grained transactions compressed into one bigger transaction
  – "Fate sharing" -> faster success path, but longer recovery
  – Transaction chaining with ping-pong buffer (single writer)
• Sharding (parallelism)
Transaction Chaining with Ping-Pong Buffer
Application
TX1
TX2
TX3
TX4
Data Tree
API
1. App creates a TX Chain and issues a bunch of transactions
put();put();
…put();
submit();
TX1TX4
TX2
TX3
Transaction Chaining with Ping-Pong Buffer
Modification Tree 1
Application
TX2
TX3
TX4
Data Tree
TX1
2. First transaction creates a Modification Tree
• submit() starts Steps 4-5
Steps 4 & 5 (takes time!)
API
Transaction Chaining with Ping-Pong Buffer
Modification Tree 1
Application
TX2
TX3
TX4
Data Tree
TX1
3. TX2 issued while TX1 submit() being processed
Steps 4 & 5 (takes time!)
API
Modification Tree 2
Transaction Chaining with Ping-Pong Buffer
Modification Tree 1
Application
TX2
TX3
TX4
Data Tree
TX1
3. TX3 issued while TX1 submit() is STILL being processed
Steps 4 & 5 (takes time!)
API
Modification Tree 2
Transaction Chaining with Ping-Pong Buffer
Modification Tree 3
Application
TX2
TX3
Data Tree
TX4
4. TX4 issued while combined TX2 & TX3 being submitted
Steps 4 & 5 (takes time!)
API
Modification Tree 2
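The ping-pong behavior walked through above can be sketched in a few lines of Java (hypothetical names, not ODL's TransactionChain API): while the backend commits one buffer of modifications, the application keeps writing into the other, so several small transactions coalesce into one bigger backend transaction.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of transaction chaining with a ping-pong buffer.
public class PingPongDemo {
    private List<String> open = new ArrayList<>();          // app writes here
    private List<String> inFlight = null;                   // backend commits this
    private final List<List<String>> committed = new ArrayList<>();

    void put(String mod) { open.add(mod); }

    // If the backend is idle, the open buffer goes straight in flight;
    // otherwise the mods simply stay batched in `open`.
    void submit() {
        if (inFlight == null && !open.isEmpty()) {
            inFlight = open;
            open = new ArrayList<>();
        }
    }

    // Backend signals commit completion: record the batch and swap buffers.
    void backendDone() {
        committed.add(inFlight);
        inFlight = null;
        submit(); // ping-pong: whatever accumulated becomes the next batch
    }

    public static void main(String[] args) {
        PingPongDemo chain = new PingPongDemo();
        chain.put("tx1"); chain.submit();    // TX1 goes in flight alone
        chain.put("tx2"); chain.submit();    // backend busy -> batched
        chain.put("tx3"); chain.submit();
        chain.backendDone();                 // TX1 done; TX2+TX3 in flight
        chain.put("tx4"); chain.submit();
        chain.backendDone();                 // TX2+TX3 done; TX4 in flight
        chain.backendDone();                 // TX4 done
        System.out.println(chain.committed); // [[tx1], [tx2, tx3], [tx4]]
    }
}
```

This is the "fate sharing" trade-off: TX2 and TX3 succeed or fail together, buying a shorter success path at the cost of coarser recovery.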
Transaction Chain vs Single Transactions Performance
[Chart: transactions/sec (0 - 250,000) vs. outer-list/inner-list element counts (100000/1, 10000/10, 1000/100, 100/1000, 10/10000, 1/100000); series: TX-CHAINING/BI, TX-CHAINING/BA, SIMPLE-TX/BI, SIMPLE-TX/BA]
Test data: list of lists
Sharding
[Diagram: Conceptual Data Tree in Node 1 partitioned into Default Shard DS, Shard1 DS, Shard2 DS, Shard3 DS, and Shard4 DS, each anchored at a "shard anchor / shard root" in the tree; In-Memory DataStore (IMDS) and Clustered DataStore (CDS); shard leaders (L) replicated to Node 2 and Node 3; Raft decides Shard Leader placement; data replication across nodes]
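One way to picture the shard anchors is as a longest-prefix lookup over the conceptual data tree: a write goes to the shard whose anchor is the longest prefix of the write path, and everything else falls through to the default shard at the root. A sketch under those assumptions (invented paths and shard names, not ODL code):

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch: route a write path to the shard whose anchor (subtree root)
// is the longest matching prefix of the path.
public class ShardRouterDemo {
    // Anchor path -> shard name.
    static final Map<String, String> shards = new TreeMap<>();

    static String route(String path) {
        String best = "default-shard";
        int bestLen = -1;
        for (Map.Entry<String, String> e : shards.entrySet()) {
            String anchor = e.getKey();
            if (path.startsWith(anchor) && anchor.length() > bestLen) {
                best = e.getValue();
                bestLen = anchor.length();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        shards.put("/", "default-shard");
        shards.put("/inventory", "shard1");
        shards.put("/inventory/nodes", "shard2");
        System.out.println(route("/inventory/nodes/node1")); // shard2
        System.out.println(route("/inventory/links"));       // shard1
        System.out.println(route("/topology"));              // default-shard
    }
}
```

Because anchors partition the tree, writes to disjoint subtrees can proceed in parallel, which is the point of sharding.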
Shard Performance Testing: Test Setup
[Diagram: the test driver distributes writes round-robin or randomly across Shard1 … Shard n; one Shard Worker (producer) per shard performs the writes (multi-producer test)]
• List/array of x string elements divided into 1 … n shards
• The test driver writes data to Shard Workers (producers) in round-robin or random fashion
• Each worker handles tx responses asynchronously (Futures)
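The multi-producer setup above can be sketched as one single-threaded executor per shard (one writer per shard), with the driver spreading writes round-robin and collecting Futures. This is an illustrative harness, not the actual benchmark code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: round-robin distribution of writes across shard workers.
public class MultiProducerDemo {
    public static void main(String[] args) throws Exception {
        int shards = 4, writes = 100;
        List<ExecutorService> workers = new ArrayList<>();
        AtomicInteger[] counts = new AtomicInteger[shards];
        for (int i = 0; i < shards; i++) {
            // One single-threaded executor per shard: the single-writer rule.
            workers.add(Executors.newSingleThreadExecutor());
            counts[i] = new AtomicInteger();
        }
        List<Future<?>> futures = new ArrayList<>();
        for (int w = 0; w < writes; w++) {
            int shard = w % shards; // round-robin distribution
            futures.add(workers.get(shard).submit(() -> counts[shard].incrementAndGet()));
        }
        // The real test handles responses asynchronously; here we just join.
        for (Future<?> f : futures) f.get();
        for (ExecutorService e : workers) e.shutdown();
        StringBuilder sb = new StringBuilder();
        for (AtomicInteger c : counts) sb.append(c.get()).append(' ');
        System.out.println(sb.toString().trim()); // 25 25 25 25
    }
}
```

Round-robin guarantees an even load per shard; the random variant trades that for simpler, stateless producers.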
Performance Results: Multiple Producers
[Chart: multi-threaded performance (kData/sec, 0 - 1000) vs. number of shards (1, 2, 4, 8, 16); series: 0, 1, 10, and 100 listeners]
IoTDM
• What is IoTDM?
– A collection of projects for IoT data management
  • git clone https://git.opendaylight.org/gerrit/p/iotdm.git
  • https://wiki.opendaylight.org/view/IoTDM:Main
– The initial IoT standard chosen to implement is oneM2M (onem2m.org)
– Models IoT things and their data in the ODL data store
  • oneM2M resources and the resource tree are modeled generically in the onem2m.yang file
– Other standards (OIC, LWM2M, …) may follow; the initial strategy is to adapt/interwork them, normalized to oneM2M
IoTDM Architecture
[Diagram: IoT things speak oneM2M protocols (HTTP, CoAP, MQTT) or non-oneM2M protocols (OIC CoAP, LWM2M) into protocol plugins; the oneM2M core and oneM2M data store sit in ODL; oneM2M data export plugins feed Kafka, ELK, TSDR, and other ODL components; IoT apps consume via the oneM2M protocols]
oneM2M Tree for Performance
Base
Device1
Sensor1-1 Sensor1-2
DeviceN
SensorN-1
OpenDaylight Application Model
[Diagram: Nodes 1-3, each with a Data Broker API, apps (App 1, App 2, App 3), and shard replicas; SX.Y marks a Shard Leader replica or a Shard Follower replica; active vs. standby app/service instances; app groups]
Service/Application:
• Code + data (shards, subtrees of the Conceptual Data Tree)
• Service/Application instances SHOULD be co-located with shard replicas
• The active Service/Application instance SHOULD be co-located with all Shard Leaders it "owns"
• Apps can be grouped for "fate sharing"
Internal vs. External Data Store
IoTDM Application
Data Broker API
Internal Data Store
IoTDM Application
Data Broker API
External Data Store (Cassandra)
Data Store Performance
[Chart 1: creates/sec (0 - 1,000), Cassandra vs. internal data store, across test points 1 000/200 … 50 000/1000]
[Chart 2: retrieves/sec (0 - 70,000), Cassandra vs. internal data store, same test points]
Thank you
Performance Results: Impact of Clustering on CDS
Performance Results: Lessons Learned
• Small transactions are dominated by initialization cost, charged to the producer
  – Affects a single thread's ability to saturate the backend
• Batching effectiveness goes up with backend latency
  – Many listeners, complex transaction processing, messaging latency
• Listening across shards results in heavy backend contention
  – Increases latency in notification delivery
  – Results in more events being batched, hence observable spikes
• Dynamic sharding improves performance with multiple applications
  – Per-application shards result in better isolation and improved parallelism
  – Single-threaded applications are unable to fully saturate IMDS
Getting & Running the Performance Test Code
• Clone coretutorials:
  git clone https://git.opendaylight.org/gerrit/coretutorials.git
• Build a distribution with the sharding performance tests:
  cd coretutorials/clustering/shardingsimple
  mvn clean install
• Run the distribution:
  ./karaf/target/assembly/bin/karaf -Xmx4096m
• Run the test script:
  cd coretutorials/clustering/scripts/sharding
  ./shardingbenchmark.py --help   (prints all the parameters in the script)
• More info in coretutorials/clustering/scripts/sharding/site/asciidoc/scripts-user-manual.adoc
ODL YANG Resources
• YangTools main page:
  – https://wiki.opendaylight.org/view/YANG_Tools:Main
• Code Generation demo:
  – https://wiki.opendaylight.org/view/Yang_Tools:Code_Generation_Demo
• Java "Binding Specification":
  – https://wiki.opendaylight.org/view/YANG_Tools:YANG_to_Java_Mapping
• DLUX:
  – Main page: https://wiki.opendaylight.org/view/OpenDaylight_dlux:Main
  – YangUI: https://wiki.opendaylight.org/view/OpenDaylight_dlux:yangUI-user
• Controller:
  – Swagger UI Explorer: http://localhost:8181/apidoc/explorer/index.html
  – DLUX (Yangman): http://localhost:8181/dlux/index.html
Existing Clustered Data Store - Details
[Diagram: on each of Nodes 1-3, the front end (App, ConcurrentDataBroker) talks to two Distributed Data Stores (OPERATIONAL and CONFIG), each with its own Shard Manager; the back end holds Shard 1, Shard 2, …, replicated across Node 2 and Node 3 via Akka Clustering (supervision, member info); transactions (Tx) flow from front end to back end]
Data Store Evolution in Boron and Beyond
[Diagram: front end - App, DOM Data Tree Sharding Service, and a DAG P/C (Directed Acyclic Graph Producer/Consumer) API; back end - IMDS Shards 1 … n (each backed by a y.d.a. DataTree) and CDS Shards 1 … n, with Shard Managers for OPERATIONAL and CONFIG, replicated across Nodes 2 and 3 via Akka Clustering (supervision, member info); a CDS Node Agent creates the y.d.a. DataTree]
y.d.a. = YANG Data API; DAG P/C = Directed Acyclic Graph Producer/Consumer