Computational Resiliency
Steve J. Chapin, Susan Older
Center for Systems Assurance
Syracuse University
Gregg Irvin
Mobium Enterprises
24 July 2001 Not for Public Release
Computational Resiliency – CSA Not for Public Release
Recap: What is Computational Resiliency?
The ability to sustain application operation and dynamically restore the level of assurance during an attack.
Application-centric self defense, built on replication, migration, functionality mutation, and camouflage.
Computational Resiliency
[Figure: a Mission Critical Application comes under Attack. Result of Attack: a Degraded Application trying to perform its Mission Critical Function. Computational Resiliency: techniques applied to correct the situation. Outcome: a Degraded Application sufficiently improved by Resiliency to perform its Mission Critical Function.]
Multi-Faceted Approach
Theoretical framework: reason about conformance to policy
Computational resiliency library: dynamic application management
System software support: scheduling/policy frameworks
Computational Resiliency Library
Dynamic multithreading
Migration
Replication
Camouflage
Functionality reconfiguration
Policy-based management
Example of CRLib
[Figure: network diagram. Labeled components: three "16 2x Pentium", one "16 Alpha", two "Intel 8x SMP", one "SGI Origin", four "3Com Superstack 3300", a Firewall, and "The Net". Two regions are marked: the “Safe Zone” (OASIS protection) and “The Wild” (limited protection).]
The Benign State
[Figure: the example network with three jobs running: Dudley’s job (low priority), Bullwinkle’s job, and Rocky’s job.]
The Attacks
[Figure: the example network.]
Snidely attacks: blocked at firewall.
Dudley does nothing.
The Attacks (continued)
[Figure: the example network.]
Natasha attacks Rocky; caught by IDS.
The Attacks (continued)
[Figure: the example network.]
Rocky’s job migrates back into safe zone; Dudley must give up resources.
The Attacks (continued)
[Figure: the example network.]
Boris attacks Bullwinkle’s job. Some attacks succeed.
The Attacks (continued)
[Figure: the example network.]
Bullwinkle’s job employs camouflage, decoys, and migration.
Groups and Replication
[Figure: groups mapped onto processors.]
One group per computational task
User selects replication level, other policies
Group mapped across processors
Periodic liveness checks
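
The four points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Group` class, the round-robin mapping, and the least-loaded recovery rule are hypothetical and are not taken from the CRLib API.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """One replica group per computational task (hypothetical type, not CRLib's)."""
    task: str
    replication_level: int                          # chosen by the user
    placement: list = field(default_factory=list)   # processor hosting each replica


def map_group(group, processors):
    """Map the group's replicas across processors round-robin (assumed rule)."""
    group.placement = [processors[i % len(processors)]
                       for i in range(group.replication_level)]
    return group.placement


def liveness_check(group, alive):
    """Return indices of replicas whose processor failed the last liveness probe."""
    return [i for i, proc in enumerate(group.placement) if proc not in alive]


def restore(group, alive):
    """Re-home dead replicas on the least-loaded live processor."""
    survivors = sorted(alive)
    if not survivors:
        raise RuntimeError("no live processors left")
    for i in liveness_check(group, alive):
        live_load = [p for p in group.placement if p in alive]
        group.placement[i] = min(survivors, key=live_load.count)
    return group.placement
```

A periodic check would call `liveness_check` on a timer and invoke `restore` whenever it returns a non-empty list, which brings the group back to its selected replication level.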
Theory Framework: Goals
Understand the interplay among core aspects of CRLib
  Groups, locations, resources, schedules, …
Reason about effects of configuration and policy choices
Reason about applications’ conformance to desired behavior
Framework Basics
Build on existing mobile calculi
  π-Calculus, Mobile Ambients, Join-Calculus
Capture essential features of CRLib
  Replication
  Migration
  Reconfiguration
  Camouflage
A π-Calculus Primer
Collection of names
  Represent information: values, communication links (channels), code
  Have scope
Message-based communication
  $x(y)$: receipt of a value on x
  $\overline{x}\langle y \rangle$: transmission of y along x
Information mobility: information can be passed beyond original scope
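
Information mobility can be illustrated with a small Python sketch in which a channel is itself sent along another channel. The helper names are hypothetical, and `queue.Queue` is only an approximation: it buffers messages, whereas π-calculus communication is a synchronous handshake.

```python
import queue


def new_channel():
    """A fresh name: a FIFO queue standing in for a pi-calculus channel."""
    return queue.Queue()


def send(x, y):
    """Output: transmit y along x."""
    x.put(y)


def receive(x):
    """Input: wait for a value on x and return it."""
    return x.get()


x = new_channel()          # channel known to both parties
private = new_channel()    # channel initially known only to the sender

send(x, private)           # the private channel itself is the message
learned = receive(x)       # the receiver now holds a name from outside its scope
send(learned, "hello")     # and can communicate over it
reply = receive(private)   # the original owner sees the message arrive
```

After the exchange, `learned` and `private` refer to the same channel, which is the sense in which a name has moved beyond its original scope.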
Finding a Service Provider
Client wants to find a service provider:
1. Query the Service Directory, include a SASE.
2. Wait for response.
3. Upon receipt, submit request.

$$\overline{\mathit{query}}\langle \mathit{addr} \rangle \,.\, \mathit{addr}(\mathit{sp}) \,.\, \overline{\mathit{sp}}\langle \mathit{req} \rangle \,.\, 0$$
Handling Service Requests
Service Directory repeatedly responds to queries, arbitrarily choosing provider.

$$!\,\mathit{query}(\mathit{ra}) \,.\, (\overline{\mathit{ra}}\langle a \rangle + \overline{\mathit{ra}}\langle b \rangle + \overline{\mathit{ra}}\langle c \rangle)$$

Service providers wait for requests.

$$!\,a(\mathit{job}).\mathit{DO}(\mathit{job}) \qquad !\,b(\mathit{job}).\mathit{DO}(\mathit{job}) \qquad !\,c(\mathit{job}).\mathit{DO}(\mathit{job})$$
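
The directory, provider, and client roles can be mimicked with threads and queues. This is a rough sketch, not the calculus itself: `run_demo`, the `results` queue standing in for DO, and the seeded random choice are assumptions made for illustration.

```python
import queue
import random
import threading


def service_directory(query, providers, rng):
    """Repeatedly answer queries, arbitrarily choosing a provider each time."""
    while True:
        reply_addr = query.get()                 # receive the client's SASE
        reply_addr.put(rng.choice(providers))    # send back one provider channel


def service_provider(name, channel, results):
    """Repeatedly wait for a job on this provider's channel and 'do' it."""
    while True:
        job = channel.get()
        results.put((name, job))                 # stand-in for DO(job)


def client(query, req):
    """Query the directory with a SASE, wait for a provider, submit the request."""
    addr = queue.Queue()                         # the self-addressed envelope
    query.put(addr)                              # 1. query, including the SASE
    sp = addr.get()                              # 2. wait for response
    sp.put(req)                                  # 3. upon receipt, submit request


def run_demo(req="job-1", seed=0):
    """Start one directory and providers a, b, c; run one client request."""
    query, results = queue.Queue(), queue.Queue()
    channels = {name: queue.Queue() for name in "abc"}
    rng = random.Random(seed)
    workers = [threading.Thread(target=service_directory,
                                args=(query, list(channels.values()), rng),
                                daemon=True)]
    workers += [threading.Thread(target=service_provider,
                                 args=(name, ch, results), daemon=True)
                for name, ch in channels.items()]
    for w in workers:
        w.start()
    client(query, req)
    return results.get(timeout=5)                # (provider name, job)
```

Calling `run_demo("job-1")` returns a pair whose first element is whichever of `a`, `b`, or `c` the directory happened to pick. The daemon threads play the role of replicated processes that stay available for further requests.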
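Service Directory:
$$!\,\mathit{query}(\mathit{ra}) \,.\, (\overline{\mathit{ra}}\langle a \rangle + \overline{\mathit{ra}}\langle b \rangle + \overline{\mathit{ra}}\langle c \rangle)$$
Service providers:
$$!\,a(\mathit{job}).\mathit{DO}(\mathit{job}) \qquad !\,b(\mathit{job}).\mathit{DO}(\mathit{job}) \qquad !\,c(\mathit{job}).\mathit{DO}(\mathit{job})$$
Client:
$$\overline{\mathit{query}}\langle \mathit{addr} \rangle \,.\, \mathit{addr}(\mathit{sp}) \,.\, \overline{\mathit{sp}}\langle \mathit{req} \rangle \,.\, 0$$
Next communication: on query.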
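Service Directory:
$$!\,\mathit{query}(\mathit{ra}) \,.\, (\overline{\mathit{ra}}\langle a \rangle + \overline{\mathit{ra}}\langle b \rangle + \overline{\mathit{ra}}\langle c \rangle)$$
Pending reply:
$$\overline{\mathit{addr}}\langle a \rangle + \overline{\mathit{addr}}\langle b \rangle + \overline{\mathit{addr}}\langle c \rangle$$
Service providers:
$$!\,a(\mathit{job}).\mathit{DO}(\mathit{job}) \qquad !\,b(\mathit{job}).\mathit{DO}(\mathit{job}) \qquad !\,c(\mathit{job}).\mathit{DO}(\mathit{job})$$
Client:
$$\mathit{addr}(\mathit{sp}) \,.\, \overline{\mathit{sp}}\langle \mathit{req} \rangle \,.\, 0$$
Next communication: on addr, carrying one of a, b, c.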
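Service Directory:
$$!\,\mathit{query}(\mathit{ra}) \,.\, (\overline{\mathit{ra}}\langle a \rangle + \overline{\mathit{ra}}\langle b \rangle + \overline{\mathit{ra}}\langle c \rangle)$$
Pending reply (finished):
$$0$$
Service providers:
$$!\,a(\mathit{job}).\mathit{DO}(\mathit{job}) \qquad !\,b(\mathit{job}).\mathit{DO}(\mathit{job}) \qquad !\,c(\mathit{job}).\mathit{DO}(\mathit{job})$$
Client:
$$\overline{b}\langle \mathit{req} \rangle \,.\, 0$$
Next communication: on b.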
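Service Directory:
$$!\,\mathit{query}(\mathit{ra}) \,.\, (\overline{\mathit{ra}}\langle a \rangle + \overline{\mathit{ra}}\langle b \rangle + \overline{\mathit{ra}}\langle c \rangle)$$
Service providers:
$$!\,a(\mathit{job}).\mathit{DO}(\mathit{job}) \qquad !\,b(\mathit{job}).\mathit{DO}(\mathit{job}) \qquad !\,c(\mathit{job}).\mathit{DO}(\mathit{job})$$
Provider b, handling the request:
$$\mathit{DO}(\mathit{req})$$
Client (finished):
$$0$$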
Initial Questions
What are the primary entities, as well as the relationships among them?
  Groups, locations, failures
  External events: DEFCON changes
  Scheduling policies
  Application policies
What is the most appropriate way to integrate those components?
  And at what abstraction level?
In Progress: Two Calculi
Higher-level calculus that incorporates the CRLib API
  Captures groups, policies, etc.
Lower-level calculus that provides semantics for higher-level calculus
  Captures abstract implementation details.
Soundness of the translation will provide validation.
A Thought Experiment
Suppose there are two tasks, A and B, working in parallel:
  A’s replication level: 4
  B’s replication level: 2
  Three processors: P1 P2 P3
Resulting behavior (modulo robustness) should be similar to system with single copies of A and B.
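
One way to make the thought experiment concrete is to place the six replicas on the three processors and ask which tasks survive a given failure. The round-robin placement below is an assumption chosen for simplicity, not a description of how CRLib schedules replicas.

```python
from collections import Counter


def place_replicas(levels, processors):
    """Deal replicas out round-robin; returns {task: [processor, ...]}."""
    placement, slot = {}, 0
    for task, level in levels.items():
        placement[task] = []
        for _ in range(level):
            placement[task].append(processors[slot % len(processors)])
            slot += 1
    return placement


def surviving_tasks(placement, failed):
    """Tasks that keep at least one replica after the given processors fail."""
    return {task for task, procs in placement.items()
            if any(p not in failed for p in procs)}


placement = place_replicas({"A": 4, "B": 2}, ["P1", "P2", "P3"])
load = Counter(p for procs in placement.values() for p in procs)
```

With four replicas of A on three processors, at least one processor necessarily hosts two copies of A, whatever the placement rule. Under this particular placement both tasks survive any single processor failure, but B is lost if the two processors holding its replicas fail together, which is the kind of robustness difference that "modulo robustness" sets aside.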
Open Questions
How do we define “similar”, much less prove it?
  Correctness
  Performance
  Robustness
What are sufficiently high-level yet informative performance measures?
How to model camouflage?
Back to CRLib: Status
Multiple platforms
  Windows NT/2000, Linux, SGI IRIX, Solaris
Heterogeneous resource management methods
  Load-balancing across heterogeneous networks
  Performance improvement by factor of 3
Demo this evening
In Progress
Adding support for Byzantine failures
  User-level option for authenticated messages
  Based on Lamport-Shostak-Pease algorithms
  Greater resiliency needed for nonauthenticated messages
Evaluating cost of replication
  Compare to standard checkpointing
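
The gap between authenticated and nonauthenticated messages can be quantified with the classical Lamport-Shostak-Pease bounds: with unsigned (oral) messages, agreement among n participants tolerates m faulty ones only if n ≥ 3m + 1, while with signed messages any number of faults can be tolerated, the problem being non-trivial once n ≥ m + 2. The function names below are hypothetical, and the slides do not say how CRLib sizes its groups, so treat this as a sketch of the bounds only.

```python
def min_group_size(traitors, authenticated):
    """Smallest group for which agreement with `traitors` faulty members is
    achievable (oral messages) or non-vacuous (signed messages)."""
    if traitors < 1:
        raise ValueError("expects at least one faulty member")
    return traitors + 2 if authenticated else 3 * traitors + 1


def max_traitors(group_size, authenticated):
    """Largest number of faulty members a group of this size can tolerate."""
    if authenticated:
        return max(group_size - 2, 0)
    return max((group_size - 1) // 3, 0)
```

For a single Byzantine fault this gives a group of 4 without authentication against 3 with it, and the difference widens as the number of faults grows (7 against 4 for two faults).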
Next Steps for Project
Tool for user policy expression
  Choices for replication/recovery methods, agreement protocols, message-passing schemes
  State-dependent policy specified via “chinese menu” approach
Scheduling framework
  Schedulers that understand CR policies, resulting resource demands, user/process priorities
  Build on previous MESSIAHS and Legion work
Finalize core CR calculi; turn to analysis techniques
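
A state-dependent policy of the pick-one-per-column kind might look like the sketch below. Every name here (the enum members, the `Policy` fields, the threat states, and the values chosen) is hypothetical and serves only to show the shape of such a specification; none of it comes from the planned tool.

```python
from dataclasses import dataclass
from enum import Enum


class Recovery(Enum):
    CHECKPOINT = "checkpointing"
    ACTIVE_REPLICATION = "active replication"


class Agreement(Enum):
    NONE = "none"
    SIGNED = "signed messages"
    ORAL = "oral messages"


@dataclass(frozen=True)
class Policy:
    """One choice from each column of the menu (hypothetical columns)."""
    replication_level: int
    recovery: Recovery
    agreement: Agreement


# One menu selection per system state; the state names are assumptions.
MENU = {
    "benign": Policy(1, Recovery.CHECKPOINT, Agreement.NONE),
    "elevated": Policy(2, Recovery.ACTIVE_REPLICATION, Agreement.SIGNED),
    "under_attack": Policy(4, Recovery.ACTIVE_REPLICATION, Agreement.ORAL),
}


def policy_for(state):
    """Look up the policy in force for the current state."""
    return MENU[state]
```

A scheduler that understands such policies could read `policy_for(state).replication_level` to estimate the resource demand a state change will create.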
Open Issues
Cost/benefit analysis of CR
  How much protection do we provide if the attacker knows what we’re trying to do?
  How much is performance affected by message load, active replication, etc.?
Potential integration with other OASIS projects