IV&V Facility
Using Fractal Analysis to Monitor and Model Software Aging
Mark Shereshevsky, Bojan Cukic, Jonathan Crowell,
Vijai Gandikota
West Virginia University
(WVU UI: Fractal Study of Resource Dynamics in Real Time Operating Systems)
Overview
• Introduction and motivation
• Fractality of resource utilization measures in operating systems
• Modeling software aging
• Experimental results
• Summary
Introduction
• The “software aging” phenomenon: the state of a software system degrades with time.
• The degradation manifests itself in performance decline (excessive paging and swapping activity, etc.), possibly leading to crash failures, hang failures, or both.
• Degradation is caused, in particular, by the exhaustion of the operating system resources, such as the number of unused memory pages, the number of disk blocks available for page swapping, etc.
Earlier Studies of Resource Exhaustion
• Vaidyanathan and Trivedi describe the behavior of operating system resources as a function of time.
– Slope (trend) depends on the workload state of the system.
– Workload dynamics is modeled as a semi-Markov process.
• In many workload states the dynamics of the resources demonstrates very high variance, resulting in very broad confidence intervals.
– The highly irregular and oscillatory behavior of the data makes most trend models insufficient.
Our Research Objectives
• Investigate correlation between fractal properties of the resource data and the system’s workload.
• Develop a fractal-based model of the resource exhaustion process.
– Apply it to real-time operating systems.
• Investigate the possibility of using such a model for predicting system outages and for preventive maintenance planning.
Goal of the Study
• Can resource exhaustion be predicted?
– Interested in monitoring approaches suitable for NASA deep space probes.
• Can fractal theory help?
– Does system usage dynamics display fractal behavior over time?
– Analyze patterns of fractality in OS resources and establish their connection with resource exhaustion.
Initial Data Collection: Memory Resources
• sml_mem - mem reserved for small requests
• lg_mem - mem reserved for large requests
• sml_alloc - mem allocated for small requests
• lg_alloc - mem allocated for large requests
• freemem - pages of free memory
• freeswap - swap space on disk
• First data collected from a department’s Sun server, Sept. 15 to Sept. 22, 2001.
Fractality of Memory Resources
• Can this be used to predict a system crash?
Hölder Exponent of a Function
\[ H_f(t) = \liminf_{h \to 0} \frac{\log |f(t+h) - f(t)|}{\log |h|} \]
• HE characterizes the degree of local “burstiness” (fractality) of the function.
• The lower (closer to 0) the HE, the “wilder” the local oscillations.
• For a smooth function HE = 1 (or higher).
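On sampled data the limit h → 0 cannot be taken, so the HE has to be estimated from a few small lags. The slides do not say which estimator was used; the sketch below assumes a common oscillation-based one: fit the slope of log(oscillation) against log(h) by least squares. The function name, `max_lag`, and `eps` are hypothetical.

```python
import math

def local_holder_exponent(f, t, max_lag=8, eps=1e-12):
    """Estimate the Hölder exponent of the sampled series f at index t.

    Assumed estimator (not taken from the slides): least-squares slope of
    log(oscillation at scale h) versus log(h), for h = 1..max_lag, where the
    oscillation is max |f[s] - f[t]| over all s with |s - t| <= h.
    """
    xs, ys = [], []
    for h in range(1, max_lag + 1):
        lo, hi = max(0, t - h), min(len(f) - 1, t + h)
        osc = max(abs(f[s] - f[t]) for s in range(lo, hi + 1))
        xs.append(math.log(h))
        # Clamp so that log() is defined; a locally constant series then
        # yields a slope of 0, which is an artifact, not real "burstiness".
        ys.append(math.log(max(osc, eps)))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# A square-root cusp is "wilder" than a straight ramp.
cusp = [abs(i - 50) ** 0.5 for i in range(101)]
ramp = [float(i) for i in range(101)]
print(local_holder_exponent(cusp, 50), local_holder_exponent(ramp, 50))
```

At the cusp the estimate comes out close to 0.5, and on the ramp close to 1, in line with the two bullets above.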
Plots of Data With Hölder Exponent
realMemoryFree data from SUN server (high workload);
Hölder exponents for the data sets.
Hölder Exponent Histogram: An Example
The histogram of Hölder exponent for realMemoryFree (high workload).
Recent Data Collection
• Windows 2000 system stress tool used.
• Two computers networked together.
– One barraged the other with workload.
• The stress load was increased until a crash occurred.
Selecting Parameters for Monitoring
• Over a hundred OS parameters monitored.
• We selected the three which:
– Do not have smooth or locally constant behavior;
– Do not represent a “per-unit-of-time” quantity (such as system_calls_per_sec);
– Do not have very high (over 0.9) mutual correlations.
• Selected parameters (resources):
– Available_bytes;
– Pool_paged-allocs;
– System_cache_resident_bytes.
• We combine the parameters into a 3-dimensional “resource vector” and monitor its fractal dynamics.
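The selection criteria and the “resource vector” can be sketched as follows. The slides do not define the multi-dimensional Hölder exponent; one natural reading, assumed here, replaces |f(t+h) - f(t)| with the Euclidean norm of the vector increment. All function names are hypothetical, and the components are assumed to have been rescaled to comparable units beforehand.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def screen(candidates, max_corr=0.9):
    """Greedy filter: drop constant series and any series whose absolute
    correlation with an already kept series exceeds max_corr."""
    kept = {}
    for name, values in candidates.items():
        if max(values) == min(values):
            continue  # constant: no dynamics to analyze
        if all(abs(pearson(values, v)) <= max_corr for v in kept.values()):
            kept[name] = values
    return kept

def vector_holder_exponent(components, t, max_lag=8, eps=1e-12):
    """Log-log slope estimate of the Hölder exponent at index t, using the
    Euclidean norm of the increment of the resource vector (an assumption)."""
    n = len(components[0])
    xs, ys = [], []
    for h in range(1, max_lag + 1):
        lo, hi = max(0, t - h), min(n - 1, t + h)
        osc = max(
            math.sqrt(sum((c[s] - c[t]) ** 2 for c in components))
            for s in range(lo, hi + 1)
        )
        xs.append(math.log(h))
        ys.append(math.log(max(osc, eps)))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx
```

With the Euclidean norm, the roughest component dominates the vector increment at small lags, so the combined exponent tends to follow the least regular resource.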
Recent Experiments: Some Plots
Available Bytes, Pool Paged Allocs, Sys Cache Resident Bytes, and Multi-dimensional Hölder exponent
Observations and Hypotheses
• As the stress increases, the Hölder exponent decreases (fractality increases).
• The decrease of the Hölder exponent may be viewed as a quantitative measure of resource exhaustion.
• Fractality tends to change in jumps.
– Most of our experiments show two noticeable drops in the Hölder exponent before a crash occurs.
Multidimensional Hölder Exponents
[Figure: four panels of the multidimensional Hölder exponent (vertical axis, 0 to 2) against sample number, one per experiment: Series 03041141, Series 02281510, Series 02161112, Series 02281631.]
Can Crashes Be Anticipated?
• Conjecture: the second “fractal jump” observed during the system’s operation signals a dangerous level of resource exhaustion which may lead to a crash. However, there is still enough time for a graceful shutdown of the system.
• Problems:
– Detection of “jumps” in the noisy HE signal.
– What is the optimal shutdown strategy (shut down immediately? Let the system run? For how long?).
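The first problem can be illustrated with a simple offline detector. The slides do not describe the detection method, so this is only a sketch under assumptions: a jump is declared where the mean HE over the `window` samples before an index exceeds the mean over the `window` samples after it by at least `min_drop`. Both parameters and the function name are hypothetical.

```python
def detect_downward_jumps(he, window=50, min_drop=0.2):
    """Return the indices where the windowed mean of he drops most steeply.

    Averaging over a window suppresses sample-to-sample noise; after a
    detection the search skips ahead so one drop is reported only once.
    """
    last = len(he) - window  # largest index with a full window after it

    def drop(i):
        before = sum(he[i - window:i]) / window
        after = sum(he[i:i + window]) / window
        return before - after

    jumps = []
    i = window
    while i <= last:
        if drop(i) >= min_drop:
            # The threshold is crossed a little before the true change
            # point, so look ahead for the index of the steepest drop.
            stop = min(i + window, last)
            peak = max(range(i, stop + 1), key=drop)
            jumps.append(peak)
            i = peak + window
        else:
            i += 1
    return jumps

# Synthetic HE trace with two downward steps.
trace = [1.5] * 200 + [1.0] * 200 + [0.5] * 200
print(detect_downward_jumps(trace))  # [200, 400]
```

A larger `window` makes the detector more robust to noise but delays the decision, which bears directly on how much time remains for a graceful shutdown.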
Automatic Detection of “Fractal Jumps”
The HE plots with pink lines indicating “fractal jumps”.
[Figure: four panels of the Hölder exponent (vertical axis, 0 to 2) against sample number, one per experiment: Series 03041141, Series 02281510, Series 02161112, Series 02281631.]
Summary
• Is the “theory of the 2nd fractal jump” viable?
– How long does the system have to live after the 2nd jump?
– Develop a strategy for automatic preventive shutdown of the system based on detection of the “fractal jumps”.
• Collect more and “better” data.
– Allow load increases and decreases.
• Explore the possibility of incorporating other parameters into the analysis framework.
• Port the analysis into a real-time environment.
– NASA simulated testbeds, ARTS II processor (ISR).
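For a real-time port, the jump test would have to run on a stream of HE samples rather than on a stored trace. The following is a minimal sketch of such a monitor, not a description of the planned implementation: the class name, the two-window mean comparison, and the default of advising shutdown at the second jump are all assumptions built on the conjecture stated earlier.

```python
from collections import deque

class JumpMonitor:
    """Streaming detector: counts downward jumps in the HE signal and
    advises shutdown once a configured number of jumps has been seen."""

    def __init__(self, window=50, min_drop=0.2, jumps_before_shutdown=2):
        self.window = window
        self.min_drop = min_drop
        self.jumps_before_shutdown = jumps_before_shutdown
        self.buf = deque(maxlen=2 * window)  # most recent 2 * window samples
        self.jumps = 0

    def update(self, he_value):
        """Feed one HE sample; return True once shutdown is advised."""
        self.buf.append(he_value)
        if len(self.buf) == self.buf.maxlen:
            samples = list(self.buf)
            older = sum(samples[:self.window]) / self.window
            newer = sum(samples[self.window:]) / self.window
            if older - newer >= self.min_drop:
                self.jumps += 1
                # Start over with fresh samples so the same drop is not
                # counted again while it slides through the buffer.
                self.buf.clear()
        return self.jumps >= self.jumps_before_shutdown

monitor = JumpMonitor()
trace = [1.5] * 200 + [1.0] * 200 + [0.5] * 200
flags = [monitor.update(v) for v in trace]
print(monitor.jumps, flags.index(True))
```

The monitor needs a number of post-jump samples before it can confirm a drop, so the advice always lags the actual change; that lag is part of the "how long does the system have to live" question.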