Certification of Computational Results
Greg Bronevetsky
Background
• Technique proposed by
  • Gregory F. Sullivan
  • Dwight S. Wilson
  • Gerald B. Masson
• All from the Johns Hopkins CS Department.
Overview
• Trying to do fault detection without the severe overhead of replication.
• Certification Trails are a manual approach in which the programmer provides additional code so that the program can check itself.
• A program generates a certification trail that details its work.
• A checker program can use this trail to verify that the output is correct in asymptotically less time.
• Several examples provided. No automation.
Roadmap
• We will cover some algorithms to which the Certification Trails technique has been applied:
  • Sorting
  • Convex Hull
  • Heap Data Structures
• The addition of Certification Trails and the creation of the Checker is done manually by the programmer in all cases.
Trail for Sorting
• In order to verify the output of a sorting algorithm we must check that:
  • The sorted items are a permutation of the original input items.
  • The sorted items appear in non-decreasing order in the sorter's output.
• Thus, the trail should contain all the items in their original order, each labeled with its location in the sorted list.
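A minimal sketch of a sorter that emits such a trail, assuming Python's built-in sort stands in for the sorter; the function name `sort_with_trail` and the trail layout (one position label per input item) are illustrative assumptions, not taken from the slides.

```python
def sort_with_trail(items):
    """Sort items and return (sorted_items, trail).

    trail[i] is the position that items[i] occupies in the sorted output,
    i.e. the input items stay in their original order and each one is
    labeled with its location in the sorted list.
    """
    # Indices of the input, ordered by the value they point to.
    order = sorted(range(len(items)), key=lambda i: items[i])
    trail = [0] * len(items)
    for pos, i in enumerate(order):
        trail[i] = pos
    return [items[i] for i in order], trail


data = [30, 10, 20, 10]
sorted_items, trail = sort_with_trail(data)
```

The extra work is a single O(n) pass over the sort result, so producing the trail does not change the sorter's asymptotic cost.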
Sorting Checker
• A Sorting Checker must:
  • Use the labels to place all elements into their sorted spots and verify that this results in a non-decreasing order.
  • Verify that no two elements are placed in the same location in the ordered list.
• The Sorter takes O(n^2) or O(n log n) time.
• The Checker takes O(n) time.
• The Checker is asymptotically faster than the Sorter.
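The checker's two duties can be sketched as a single linear pass. This is an illustration under assumptions: the trail is taken to be a list of position labels, one per input item, and the name `check_sort` and the comparison against a claimed output list are hypothetical details.

```python
def check_sort(items, trail, claimed):
    """Return True iff `claimed` is a correctly sorted version of `items`,
    using trail[i] = claimed position of items[i]. Runs in O(n) time."""
    n = len(items)
    if len(trail) != n or len(claimed) != n:
        return False
    placed = [False] * n
    rebuilt = [None] * n
    for item, pos in zip(items, trail):
        # Each label must be a valid slot, and no slot may be used twice.
        if not isinstance(pos, int) or not 0 <= pos < n or placed[pos]:
            return False
        placed[pos] = True
        rebuilt[pos] = item
    # The placement must reproduce the claimed output ...
    if rebuilt != claimed:
        return False
    # ... and the output must be in non-decreasing order.
    return all(rebuilt[i] <= rebuilt[i + 1] for i in range(n - 1))


ok = check_sort([30, 10, 20, 10], [3, 0, 2, 1], [10, 10, 20, 30])
```

Because n labels fill n distinct slots, the placement is a permutation of the input, which is the first property the checker needs; the final scan covers the second.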
Convex Hull Problem
Given a set of points on a 2D plane, find a subset of points that forms a convex hull around all the points.
Convex Hull: Step 1
[Figure: points P1 to P8]
P1 is the point with the least x-coordinate.
Points sorted in order of increasing slope relative to P1.
Convex Hull: Invariant
[Figure: points P1 to P8]
All the points not on the Hull are inside a triangle formed by P1 and two successive points on the Hull.
Convex Hull: Invariant
[Figure: points P1 to P8, angle at P3 marked ≥ 180º]
We know that P3 is not a Hull point because the clockwise angle between lines P2P3 and P3P4 is ≥ 180º.
Convex Hull: Invariant
[Figure: points P1 to P8, angle at P3 marked < 180º]
Note that if the clockwise angle between lines P2P3 and P3P4 is < 180º, then P3 is a Hull point.
Convex Hull Algorithm
• Add P1, P2 and P3 to the Hull. (Note: P1, P2 and Pn must be on the Hull.)
• For Pk = P4 to Pn (trying to add Pk to the Hull):
  • Let QA and QB be the two points most recently added to the Hull.
  • While the angle formed by QA, QB and Pk is ≥ 180º:
    • Remove QB from the Hull, since it is inside the triangle P1, QA, Pk.
  • Add Pk to the Hull.
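The loop above can be sketched as follows. The sketch assumes at least three distinct points, a unique point with the least x-coordinate, and no three collinear points; it uses a cross-product sign as the angle test (a non-left turn is treated as "≥ 180º"). The names `cross` and `hull_with_trail` are hypothetical. Each removed point is recorded together with the triangle P1, QA, Pk that contains it.

```python
def cross(o, a, b):
    """Positive when the path o -> a -> b turns left (counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_with_trail(points):
    """Return (hull, triangles): hull points in scan order, plus a dict
    mapping each non-hull point to a triangle of input points containing it."""
    p1 = min(points)  # least x-coordinate (assumed unique)
    rest = sorted((p for p in points if p != p1),
                  key=lambda p: (p[1] - p1[1]) / (p[0] - p1[0]))  # slope w.r.t. P1
    ordered = [p1] + rest
    hull = ordered[:3]
    triangles = {}
    for pk in ordered[3:]:
        # QA = hull[-2], QB = hull[-1]; drop QB while the turn is not a left turn.
        while len(hull) > 2 and cross(hull[-2], hull[-1], pk) <= 0:
            qb = hull.pop()
            triangles[qb] = (p1, hull[-1], pk)  # QB lies inside P1, QA, Pk
        hull.append(pk)
    return hull, triangles


pts = [(0, 0), (4, -4), (3, -1), (6, 0), (3, 1), (4, 4)]
hull, triangles = hull_with_trail(pts)
```

Each point is pushed once and popped at most once, so the scan after sorting is linear in the number of points.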
Trail for Convex Hull
• Augment the program to:
  • Output {q1, q2, ..., qm} = the indexes of the points on the Hull.
  • Output a proof of correctness for {x1, x2, ..., xr} = all points not on the Hull, in the form of the triangle that contains each point.

Point not on Convex Hull    3 Surrounding Points
P3                          P1, P2, P4
P7                          P1, P6, P8
Convex Hull Checker
Checker must check that:
• There is a 1-1 correspondence between the input points and {q1, q2, ..., qm} U {x1, x2, ..., xr}.
• All points in the triangle proofs correspond to input points.
• Each point in the triangle proofs actually lies in the given triangle.
• Every triple of successive supposed Hull points forms a convex angle.
• There is a unique locally maximal point on the Hull.
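A sketch of these checks, under several assumptions that are not in the slides: points are (x, y) tuples, the trail gives the Hull as a list of indexes in counterclockwise order and the triangle proofs as a dict from a non-hull index to three indexes, a point may lie on its triangle's boundary but may not be one of its vertices, and "locally maximal" is taken to mean greater than both neighbors when compared by (y, x). The names `in_triangle` and `check_hull` are hypothetical.

```python
def cross(o, a, b):
    """Positive when the path o -> a -> b turns left (counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_triangle(p, a, b, c):
    """True if p lies inside or on the non-degenerate triangle a, b, c."""
    if cross(a, b, c) == 0:
        return False
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    return (d1 >= 0 and d2 >= 0 and d3 >= 0) or (d1 <= 0 and d2 <= 0 and d3 <= 0)


def check_hull(points, hull, proofs):
    n = len(points)
    valid = lambda i: isinstance(i, int) and 0 <= i < n
    # 1-1 correspondence: every input index appears exactly once.
    seen = [False] * n
    for i in list(hull) + list(proofs):
        if not valid(i) or seen[i]:
            return False
        seen[i] = True
    if not all(seen) or len(hull) < 3:
        return False
    # Triangle proofs name input points and really contain their point.
    for x, tri in proofs.items():
        if len(tri) != 3 or not all(valid(t) for t in tri) or x in tri:
            return False
        if not in_triangle(points[x], *(points[t] for t in tri)):
            return False
    # Successive hull triples are convex; exactly one local maximum.
    m, maxima = len(hull), 0
    for j in range(m):
        prev, cur, nxt = (points[hull[j - 1]], points[hull[j]],
                          points[hull[(j + 1) % m]])
        if cross(prev, cur, nxt) <= 0:
            return False
        rank = lambda p: (p[1], p[0])
        if rank(cur) > rank(prev) and rank(cur) > rank(nxt):
            maxima += 1
    return maxima == 1


pts = [(0, 0), (4, -4), (3, -1), (6, 0), (3, 1), (4, 4)]
ok = check_hull(pts, [0, 1, 3, 5], {2: (0, 1, 3), 4: (0, 3, 5)})
```

Every check is one pass over the points or the Hull, which is consistent with a linear-time checker.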
Asymptotic Runtimes
• The Original Convex Hull Algorithm takes O(n log n) time to sort, and the Hull construction loop takes only O(n) time: O(n log n) time in total.
• The Convex Hull Checker runs through the set of points once for each check: O(n) time in total.
• The Checker is asymptotically faster than the Original.
Certification Trails for Data Structures
• Let's have a data structure for storing value/key pairs, ordered lexicographically:
  (key, val) < (key', val') iff val < val' or (val = val' and key < key')
• Operations:
  • member(key): returns whether key is mapped to some val.
  • insert(key, val): inserts a pair (key, val) into the data structure.
  • delete(key): deletes the pair that contains key.
Data Structure Specs
• Data Structure Operations:
  • changekey(key, newval): executed when the pair (key, oldval) exists in the data structure. Removes this pair and inserts the pair (key, newval).
  • deletemin(): deletes the smallest pair (according to the ordering). Returns “empty” if the data structure contains no pairs.
  • predecessor(key): returns the key of the pair that immediately precedes key's pair, or “smallest” if there is no such pair.
  • empty(): returns whether the data structure is empty.
Data Structure Implementation
• Such a Data Structure can be implemented via an AVL tree, a red-black tree or a B-tree.
• Most operations will take O(log n) time.
• We can augment implementations to generate a certification trail:
  • insert(key, val): output the key of the predecessor of the newly inserted pair (key, val). If there is no predecessor, output “smallest”.
  • changekey(key, newval): output the key of the predecessor of the new pair (key, newval). If there is no predecessor, output “smallest”.
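A minimal sketch of the trail-generating side. It is not a balanced tree: a sorted Python list with `bisect` stands in for the AVL/red-black/B-tree, so inserts here cost O(n) rather than O(log n). The class name `TrailingStore` and the choice to collect the trail in a list are illustrative assumptions.

```python
import bisect


class TrailingStore:
    """Ordered (key, val) store that records a certification trail."""

    def __init__(self):
        self.pairs = []    # sorted list of (val, key): the pair ordering
        self.val_of = {}   # key -> val
        self.trail = []    # predecessor keys, or "smallest"

    def insert(self, key, val):
        bisect.insort(self.pairs, (val, key))
        self.val_of[key] = val
        # Trail entry: key of the pair just before the new one.
        i = bisect.bisect_left(self.pairs, (val, key))
        self.trail.append(self.pairs[i - 1][1] if i > 0 else "smallest")

    def delete(self, key):
        val = self.val_of.pop(key)
        self.pairs.pop(bisect.bisect_left(self.pairs, (val, key)))

    def changekey(self, key, newval):
        # Delete then insert; the insert emits the trail entry for the new pair.
        self.delete(key)
        self.insert(key, newval)


store = TrailingStore()
store.insert(3, 50)
store.insert(5, 62)
store.insert(7, 10)
store.changekey(7, 55)
```

Only insert and changekey write to the trail, matching the two augmented operations listed above; the other operations need no trail entry.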
Data Structure Checker
• A Checker for any program using the above data structure can use the certification trail to implement a much faster data structure.
• All operations can be done in O(1) time.
• Resulting program will be faster than original program. Maybe asymptotically faster.
Optimized Data Structure
• A doubly linked list of (key, val) pairs, sorted according to the pair ordering relation.
• An array indexed by keys, containing pointers to (key, val) pairs corresponding to the indexes.
• The first pair (with key=0) contains value=sm, which is defined to be smaller than any other possible value.
Optimized Data Structure
• Optimized data structure operations:
  • insert(key, val):
    • Read from the trail prec_key = the key of the pair preceding the new (key, val) pair.
      • Check that it is a valid index.
    • Look at the pair pointed to by array[prec_key].
      • Verify that it is ≠ null.
    • Place the (key, val) pair at index key, following the (prec_key, prec_val) pair.
      • Check that before the insert() array[key] was = null.
      • Ensure that (key, val) is greater than its predecessor and less than its successor.
Optimized Insert Example
Result of the call insert(5, 62)
Optimized Data Structure
• Optimized data structure operations:
  • delete(key): Remove the pair pointed to by array[key].
    • Ensure that array[key] ≠ null.
  • changekey(key, newval): Call delete(key), followed by insert(key, newval). These calls will check all necessary conditions.
  • deletemin(): Look at the pair that follows the pair (0,sm) (pointed to by array[0]).
    • If there is no such pair, return “empty”.
    • Else, if there exists a pair (key, val), then remove it and set array[key] to null.
  • empty(): Return whether there is no pair following the pair (0,sm).
Optimized Data Structure
• Optimized data structure operations:
  • member(key): return whether array[key] ≠ null.
  • predecessor(key):
    • Look at the pair pointed to by array[key].
    • Follow its backward link to its predecessor pair.
    • If the predecessor pair is (0,sm) then return “smallest”.
    • Else, return the key field of that pair.
• Note that all the operations can be done in O(1) time.
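The optimized structure can be sketched as a doubly linked list plus an array of node pointers. Assumptions made for this illustration: keys are integers from 1 to a known `max_key`, the trail is an iterable whose entries are a predecessor key or the string "smallest" (mapped to the (0, sm) pair), a failed check raises an exception, and deletemin() returns the removed pair. The names `CheckerStore`, `Node` and `CertificationError` are hypothetical.

```python
class CertificationError(Exception):
    """Raised when the trail or an operation fails a check."""


class Node:
    def __init__(self, key, val):
        self.key, self.val = key, val
        self.prev = self.next = None


class CheckerStore:
    def __init__(self, max_key, trail):
        self.head = Node(0, None)              # the (0, sm) pair
        self.array = [None] * (max_key + 1)    # key -> node
        self.array[0] = self.head
        self.trail = iter(trail)

    def _require(self, cond):
        if not cond:
            raise CertificationError

    def _valid(self, key):
        return isinstance(key, int) and 0 <= key < len(self.array)

    def _less(self, a, b):
        # Pair ordering by (val, key); the (0, sm) pair is below everything.
        if a is self.head or b is self.head:
            return a is self.head and b is not self.head
        return (a.val, a.key) < (b.val, b.key)

    def insert(self, key, val):
        entry = next(self.trail, None)
        prec_key = 0 if entry == "smallest" else entry
        self._require(self._valid(prec_key) and self.array[prec_key] is not None)
        self._require(self._valid(key) and self.array[key] is None)
        prec, node = self.array[prec_key], Node(key, val)
        succ = prec.next
        self._require(self._less(prec, node))
        self._require(succ is None or self._less(node, succ))
        node.prev, node.next, prec.next = prec, succ, node
        if succ is not None:
            succ.prev = node
        self.array[key] = node

    def delete(self, key):
        self._require(self._valid(key) and key != 0 and self.array[key] is not None)
        node = self.array[key]
        node.prev.next = node.next
        if node.next is not None:
            node.next.prev = node.prev
        self.array[key] = None

    def changekey(self, key, newval):
        self.delete(key)
        self.insert(key, newval)

    def deletemin(self):
        node = self.head.next
        if node is None:
            return "empty"
        self.delete(node.key)
        return (node.key, node.val)

    def empty(self):
        return self.head.next is None

    def member(self, key):
        return self._valid(key) and key != 0 and self.array[key] is not None

    def predecessor(self, key):
        self._require(self.member(key))
        prev = self.array[key].prev
        return "smallest" if prev is self.head else prev.key


store = CheckerStore(max_key=10, trail=["smallest", 3, "smallest", 3])
store.insert(3, 50)
store.insert(5, 62)
store.insert(7, 10)
store.changekey(7, 55)
```

No operation searches the list: the trail supplies the insertion point and the array supplies every other lookup, so each call does a constant amount of work, and a wrong trail entry is caught by the neighbor comparison in insert().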
Shortest Path
• A Shortest Path algorithm was implemented using the above data structure.
• The original program used the original data structure that produced a certification trail.
• The checker version was identical to the original except that its data structure was the optimized version that used the trail.
• Original runtime = O(m•log n)
• Checker runtime = O(m)
• (m = number of edges, n = number of nodes)
Performance: Sort
• Basic Algorithm: sorting algorithm with no certification trail.
• 1st Execution: sorter that produces the certification trail.
• 2nd Execution: checking algorithm that uses the trail.
• Speedup: factor of improvement of the 2nd Execution over Basic.
• % Savings: savings of running the 1st + 2nd Executions over running Basic twice.

Size      Basic Algorithm   1st Execution (Generates Trail)   2nd Execution (Uses Trail)   Speedup   % Savings
10000     0.28              0.30                              0.04                         7.00      39.29%
50000     1.80              1.90                              0.19                         9.47      41.94%
100000    3.96              4.08                              0.41                         9.66      43.31%
500000    23.95             24.69                             2.14                         11.19     43.99%
1000000   50.23             51.57                             4.38                         11.47     44.31%
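As a worked check of the two derived columns, the formulas below are inferred from the definitions on this slide (they are not stated explicitly in the source) and reproduce the first row of the sort table. The function names are hypothetical. Rows in the other tables may differ in the last digit, since the published timings are rounded.

```python
def speedup(basic, second):
    """Factor of improvement of the 2nd Execution over the Basic Algorithm."""
    return basic / second


def pct_savings(basic, first, second):
    """Savings of (1st + 2nd Execution) relative to running Basic twice, in %."""
    return 100 * (1 - (first + second) / (2 * basic))


# First row of the sort table: Basic 0.28, 1st 0.30, 2nd 0.04.
row_speedup = round(speedup(0.28, 0.04), 2)
row_savings = round(pct_savings(0.28, 0.30, 0.04), 2)
```

The comparison against "Basic twice" reflects the replication baseline: without a trail, detecting a fault would mean running the full algorithm a second time.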
Performance: Convex Hull
Size      Basic Algorithm   1st Execution (Generates Trail)   2nd Execution (Uses Trail)   Speedup   % Savings
5000      0.61              0.62                              0.07                         8.73      43.62%
10000     1.33              1.34                              0.14                         9.56      44.54%
25000     3.68              3.68                              0.36                         10.22     45.12%
50000     7.68              7.74                              0.71                         10.75     44.94%
100000    16.23             16.30                             1.43                         11.35     45.39%
200000    33.93             34.37                             2.84                         11.94     45.16%
Performance: Shortest Path
Size (n,m)   Basic Algorithm   1st Execution (Generates Trail)   2nd Execution (Uses Trail)   Speedup   % Savings
100,1000     0.04              0.05                              0.02                         2.00      12.50%
250,2500     0.15              0.16                              0.06                         2.50      26.67%
500,5000     0.31              0.33                              0.11                         2.82      29.03%
100,10000    0.70              0.76                              0.23                         3.04      29.29%
2000,20000   1.58              1.67                              0.45                         3.51      32.91%
2500,25000   2.06              2.15                              0.55                         3.75      34.47%
Summary of Experiments
• The overhead of generating a certification trail is about 2%.
• The checker run is much faster than the original. It can be run on much slower hardware or use a formally verified language.
Application to Byzantine Failures
• Current technique is completely manual. No known way to automatically convert a program to generate a trail.
• We may develop libraries that use the Certification Trails technique, allowing us to catch errors in a large fraction of a program.
• Door open to Failure Recovery: when an error is detected the checker goes back to using original code to redo the work.