Recovery-Oriented Computing

Discovering Correctness Constraints for Self-Management of System Configuration

Emre Kıcıman and Yi-Min Wang

emrek@cs.stanford.edu, [email protected]

Software Infrastructures Group, Stanford University; Microsoft Research

ROC Retreat, January 12, 2004, Emre Kıcıman

Introduction

● Managing configuration systems is really hard
  - Config errors are the largest category of operator mistakes [Oppenheimer]
  - Network problems: BGP routing, DNS overload [Mahajan, Brownlee]
  - Windows registry: -> instability, unexpected behavior [Ganapathi]
● End goal is a “self-managing” configuration system
  - Keep system from entering illegal state; fix problems that sneak past.
● Part of solution: Monitor for misconfigurations
  - But what do we monitor for?
    - Configuration that matches known problems
    - Configuration that does not match known good state
● Glean: learning rules that describe good configuration state

Background: Windows Registry

● Windows registry is a configuration database
● Settings for hardware, operating system, installed applications, user account info, user preferences
● Even temporary info for Internet cache files, etc.
● Hierarchical set of keys and values
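
To make the "hierarchical set of keys and values" concrete, here is a minimal sketch of a snapshot as nested Python dicts. The representation, the `walk_keys` helper, and the sample values are assumptions for illustration, not the format Glean uses.

```python
from typing import Any, Dict, Iterator, Tuple

# Assumed model: a dict is a registry key, any non-dict leaf is a value.
Registry = Dict[str, Any]

def walk_keys(node: Registry, path: str = "") -> Iterator[Tuple[str, Registry]]:
    """Yield (path, key_node) for every key in the hierarchy, depth first."""
    for name, child in node.items():
        if isinstance(child, dict):
            child_path = f"{path}\\{name}"
            yield child_path, child
            yield from walk_keys(child, child_path)

# Tiny illustrative snapshot (the values are made up).
snapshot: Registry = {
    "hklm": {
        "Software": {
            "classes": {
                ".jpeg": {"content type": "image/jpeg", "perceivedtype": "image"},
                ".gif": {"content type": "image/gif", "perceivedtype": "image"},
            }
        }
    }
}

key_paths = [path for path, _ in walk_keys(snapshot)]
```

Flattening the tree into full key paths like this is convenient for the later steps, which compare keys regardless of where they sit in the hierarchy.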

Glean Approach

Goal: Discover correctness constraints for monitoring system.
Analyze snapshots of “believed-good” registries.

1. Find configuration classes
  ● Def'n: repeated groups of settings that share common structure
  ● E.g., file type information, ActiveX registrations, ...

2. Look for invariants on, and dependencies on, each class
  ● Size constraints
  ● Enumeration constraint -> subkey ∈ {x_1, x_2, x_3, ...}
  ● Reference constraint -> Key ∈ {identifies instance of C}

Configuration Classes

● Def'n: Repeated groups of settings that share common structure
● Ex: Both \hklm\Software\classes\.jpeg and \hklm\Software\classes\.gif have the subkeys {perceivedtype, persistenthandler, content type, openwithprogids}
  - ... same structure is shared by other registered image types.
● Measure similarity based on substructure of a key
● Ignore hierarchical structure during class discovery
  - Allows finer-granularity config classes
  - Allows discovery of classes across user accounts, backups, ...
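
One way to read "similarity based on substructure" is to compare only the names of a key's immediate children and ignore the key's own location. The sketch below does that with a Jaccard overlap; the metric, the helper names, and the sample data are assumptions, since the slides do not specify them.

```python
from typing import Dict, FrozenSet

def signature(key_node: Dict[str, object]) -> FrozenSet[str]:
    """Substructure of a key: the lower-cased names of its immediate children."""
    return frozenset(name.lower() for name in key_node)

def shared_fraction(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard overlap of two signatures (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Illustrative keys; only the child names matter, not the values or paths.
jpeg = {"PerceivedType": "image", "PersistentHandler": {},
        "Content Type": "image/jpeg", "OpenWithProgids": {}}
gif = {"PerceivedType": "image", "PersistentHandler": {},
       "Content Type": "image/gif", "OpenWithProgids": {}}
service = {"Type": 16, "Start": 2, "ImagePath": "example.exe"}  # hypothetical

same_class = shared_fraction(signature(jpeg), signature(gif))
unrelated = shared_fraction(signature(jpeg), signature(service))
```

Because the signature never looks at the key's path, two keys under different user accounts or backups would compare equal as long as their children match.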

Discovering Config Classes

1. Filter out keys with little substructure
  ● They won't be useful during comparison

2. Use data clustering to group similar keys together
  ● Measure distance by # of common subkeys
  ● Stop when we hit some threshold

3. Each cluster with >N keys is one config class

4. Look at the common subkeys that define each cluster
  ● Use these common subkeys as the “name” of the class
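
The four steps above can be sketched as a simple greedy agglomerative clustering. This is an illustration under stated assumptions, not the Glean implementation: the function name, the default thresholds, and the `.png` and `example` keys are hypothetical, and the quadratic pair search would not scale to a full registry.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

ConfigClass = Tuple[FrozenSet[str], List[str]]  # (common subkeys, member key paths)

def discover_classes(keys: Dict[str, Set[str]], min_subkeys: int = 3,
                     min_common: int = 3, n: int = 2) -> List[ConfigClass]:
    # Step 1: drop keys with too little substructure to compare.
    clusters: List[ConfigClass] = [
        (frozenset(subs), [path])
        for path, subs in sorted(keys.items())
        if len(subs) >= min_subkeys
    ]
    # Step 2: repeatedly merge the pair sharing the most subkeys; stop once
    # no pair shares at least `min_common` subkeys (the assumed threshold).
    while True:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                shared = len(clusters[i][0] & clusters[j][0])
                if shared >= min_common and (best is None or shared > best[0]):
                    best = (shared, i, j)
        if best is None:
            break
        _, i, j = best
        merged = (clusters[i][0] & clusters[j][0], clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    # Steps 3 and 4: keep clusters with more than n keys; the shared subkeys
    # act as the "name" of the class.
    return [(common, sorted(members)) for common, members in clusters
            if len(members) > n]

image = {"perceivedtype", "persistenthandler", "content type", "openwithprogids"}
keys = {
    r"\hklm\Software\classes\.jpeg": set(image),
    r"\hklm\Software\classes\.gif": set(image),
    r"\hklm\Software\classes\.png": set(image),    # hypothetical third image type
    r"\hklm\Software\example\app": {"a", "b", "c"},  # clusters with nothing
    r"\hklm\Software\example\tiny": {"a"},           # filtered out in step 1
}
classes = discover_classes(keys)
```

With this toy input the three image-type keys form the only class, named by their four shared subkeys.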

Naming Instances of Classes

Use differing strings in the hierarchy as identifiers

\hklm\Software\classes\interface\{F08400BB-0960-47F4-9E12-591DBF370546}
\hklm\Software\classes\interface\{D93A191C-525A-43BC-ACFD-7EF494143CF4}
\hklm\Software\classes\interface\{BF955013-A875-439D-A4E7-A3BBDF12AA4F}
...
\hklm\Software\classes\interface\*

... can be more complicated than just a single id per key ...

\hku\*\software\microsoft\windows\*\bags\*\shell
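
As a rough sketch of the naming idea, the path components that differ between instances can be replaced by `*`, and the differing strings kept as each instance's identifier. The `generalize` helper is hypothetical and assumes all member paths have the same depth; the `\hku\...` sample paths are made up to reproduce the multi-identifier pattern above.

```python
from typing import List, Tuple

def generalize(paths: List[str]) -> Tuple[str, List[Tuple[str, ...]]]:
    """Return (wildcard pattern, one identifier tuple per path)."""
    split = [p.strip("\\").split("\\") for p in paths]
    depth = len(split[0])
    if any(len(parts) != depth for parts in split):
        raise ValueError("this sketch assumes paths of equal depth")
    # Positions where the members disagree become wildcards.
    varying = [i for i in range(depth)
               if len({parts[i].lower() for parts in split}) > 1]
    pattern = "\\" + "\\".join(
        "*" if i in varying else split[0][i] for i in range(depth))
    ids = [tuple(parts[i] for i in varying) for parts in split]
    return pattern, ids

interfaces = [
    r"\hklm\Software\classes\interface\{F08400BB-0960-47F4-9E12-591DBF370546}",
    r"\hklm\Software\classes\interface\{D93A191C-525A-43BC-ACFD-7EF494143CF4}",
    r"\hklm\Software\classes\interface\{BF955013-A875-439D-A4E7-A3BBDF12AA4F}",
]
pattern, ids = generalize(interfaces)

# Hypothetical paths with three differing components per key.
bags = [
    r"\hku\S-1-5-21-1\software\microsoft\windows\shell\bags\1\shell",
    r"\hku\S-1-5-21-2\software\microsoft\windows\shellnoroam\bags\7\shell",
]
bag_pattern, bag_ids = generalize(bags)
```

In the second case each instance is identified by a tuple of three strings rather than a single id.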

Configuration Classes in a Registry

● Start with a typical registry snapshot
  - 179,776 keys
● Filter out all keys with < 3 subkeys
  - 24,907 keys
● Data clustering
  - 1287 keys don't cluster at all, 657 clusters w/ 2 keys
● 909 Configuration Classes
  - largest class has 5657 keys (\HKLM\Software\CLASSES\INTERFACE\*)
  - mean size of class = 24 keys; median = 6

Correctness Constraints

● Internal constraints describe structure within a config class
● External constraints describe dependencies between config classes
  - ... and between any registry key and a class

Internal Constraints

● Size constraint
  - Template: sizeof( i.subkey ) = X | i ∈ C
  - need to discover values for C, subkey and X
● Enumeration constraint
  - Template: i.subkey ∈ {x_1, x_2, x_3, ...} | i ∈ C
  - need to discover values for C, subkey and {x_1, x_2, x_3, ...}

1. For each constraint, we start w/ a template rule.

2. Iterate through configuration classes

3. Fill in the template based on snapshot data -> hypothesis

4. Test hypothesis and either accept or reject.
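
A minimal sketch of filling in the two templates for one class and one subkey follows. The slides do not say how a hypothesis is tested, so the acceptance rules here (every instance must agree on size; an enumeration must be small relative to the number of instances) are assumptions, as are the function names and sample data.

```python
from typing import Any, Dict, List, Optional, Set

Instance = Dict[str, Any]  # subkey name -> value, for one instance of a class

def infer_size(instances: List[Instance], subkey: str) -> Optional[int]:
    """Hypothesis: sizeof(i.subkey) = X for all i in C. Accept only if unanimous."""
    if not instances or any(subkey not in inst for inst in instances):
        return None
    sizes = {len(inst[subkey]) for inst in instances}
    return sizes.pop() if len(sizes) == 1 else None

def infer_enum(instances: List[Instance], subkey: str,
               max_values: int = 3) -> Optional[Set[Any]]:
    """Hypothesis: i.subkey is drawn from a small set {x_1, x_2, ...}."""
    if not instances or any(subkey not in inst for inst in instances):
        return None
    values = {inst[subkey] for inst in instances}
    # Assumed acceptance test: few distinct values, each seen often enough.
    if len(values) <= max_values and len(instances) >= 2 * len(values):
        return values
    return None

# Made-up instances of a "services" class and a "certificate template" class.
services = [
    {"type": 16, "imagepath": b"a.exe"},
    {"type": 32, "imagepath": b"bb.exe"},
    {"type": 16, "imagepath": b"ccc.exe"},
    {"type": 32, "imagepath": b"dddd.exe"},
]
templates = [{"renewaloverlap": bytes(8)}, {"renewaloverlap": bytes(8)}]

type_rule = infer_enum(services, "type")
path_size_rule = infer_size(services, "imagepath")
overlap_rule = infer_size(templates, "renewaloverlap")
```

Here `type` yields an enumeration rule, `imagepath` yields no size rule because the sizes differ, and `renewaloverlap` yields a size rule of 8.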

Internal Constraints Summary Results

● 2646 SizeConstraint rules
  - 212 size=0 rules
  - Ex., sizeof(...\certificatetemplatecache\*\RENEWALOVERLAP) = 8
● 2250 EnumConstraint rules
  - Ex. \hklm\System\controlset*\services\*\TYPE ∈ {16, 32}
● Next step: Cross-registry validation
  1. Generate rules from each of N registry snapshots
  2. Reject or generalize rules which aren't true across snapshots
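
Since cross-registry validation is listed only as a next step, the following is a speculative sketch of one possible policy for enumeration rules: reject a rule that is missing from any snapshot, and otherwise generalize it to the union of observed values if that union stays small. The function name, the rule representation, and the `\hklm\example\*` rule are hypothetical.

```python
from typing import Any, Dict, FrozenSet, List, Tuple

RuleKey = Tuple[str, str]                   # (class pattern, subkey)
EnumRules = Dict[RuleKey, FrozenSet[Any]]   # allowed values per rule

def cross_validate(per_snapshot: List[EnumRules],
                   max_values: int = 4) -> EnumRules:
    """Merge enumeration rules learned independently from each snapshot."""
    if not per_snapshot:
        return {}
    # Reject rules that were not learned from every snapshot.
    common = set.intersection(*(set(rules) for rules in per_snapshot))
    merged: EnumRules = {}
    for key in common:
        # Generalize by taking the union of the allowed values.
        union = frozenset().union(*(rules[key] for rules in per_snapshot))
        if len(union) <= max_values:  # assumed cap before giving up on the rule
            merged[key] = union
    return merged

svc = (r"\hklm\System\controlset*\services\*", "type")
snapshot_a = {svc: frozenset({16, 32}),
              (r"\hklm\example\*", "mode"): frozenset({"on"})}  # hypothetical rule
snapshot_b = {svc: frozenset({1, 16, 32})}

validated = cross_validate([snapshot_a, snapshot_b])
```

With these two snapshots the services rule is generalized to three values and the rule seen in only one snapshot is dropped.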

Reference Constraints

● We want to discover any keys that refer to an instance of a configuration class
  - E.g., “default printer” keys must match “printer config” name
● Template: k ∈ ID(i) | i ∈ C

1. Take all values in registry and put into HashTable

2. For each configuration class C

3. Pull out all values from HT that equal an ID of instance(C)

(no summary results yet, but “default printer” ex. is real)
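
The three steps above can be sketched with an ordinary dictionary as the hash table. The key paths, printer names, and the `find_references` helper are all hypothetical; only the overall lookup strategy comes from the slide.

```python
from collections import defaultdict
from typing import Dict, List, Set

def find_references(values: Dict[str, str],
                    class_instances: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each class to the key paths whose value equals one of its instance IDs."""
    # Step 1: hash table from value data to the keys that hold it.
    table: Dict[str, List[str]] = defaultdict(list)
    for path, data in values.items():
        table[data].append(path)
    # Steps 2 and 3: for each class, look up every instance ID in the table.
    refs: Dict[str, List[str]] = {}
    for cls, ids in class_instances.items():
        hits = sorted(path for ident in ids for path in table.get(ident, []))
        if hits:
            refs[cls] = hits
    return refs

# Made-up data in the spirit of the "default printer" example.
values = {
    r"\hkcu\example\printing\default": "LaserJet-2F",
    r"\hkcu\example\editor\font": "Courier",
}
class_instances = {r"\hklm\example\printers\*": {"LaserJet-2F", "InkJet-1A"}}

refs = find_references(values, class_instances)
```

Each hit is a candidate reference constraint: the value at that key should always name an existing instance of the class.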

Summary

● Infer likely correctness constraints
  - ... based on good snapshots of registries
  - Take advantage of extra structure of configuration classes
  - Fill in simple template rules to generate likely constraints
● Easily generate 1000s of rules
  - Much easier than writing by hand
  - Spot-checked rules make intuitive sense
● Next steps
  - Validate rules across many registries
  - Q: Will these rules help detect real problems? Analyze problem reports from Product Support.