Static Analysis
Wesner Moise
Lang.NET 2009 Symposium
Tuesday, April 14, 2009
Who am I?
• Ex-developer in Microsoft Excel for PivotTables
• “Smart Software” blog
• Founder of SoftPerson, LLC
develops desktop applications enabled with symbolic AI
• NStatic - static analysis software
Static Analysis
• What is static analysis?
Software that identifies errors in source code prior to execution at runtime.
• Why use static analysis?
Bugs found early, when cost to fix is low
No performance overhead of runtime checking
Covers all code uniformly, including rarely executed code paths
NStatic
Innovative .NET static software analysis
Friendlier
Debugger-like experience
Smarter & Faster
Human-like analysis for errors, using smarts rather than brute-force search
NStatic Debugger-like Experience
Error List
Solution Explorer
Locals
Debugger-like Experience
• Richer error information than provided by other static analysis tools
Example: Simple command line output: “file.cs(100): null pointer referenced”
• Familiar debugger windows
Locals, Call Stack, Immediate, Watch, Breakpoints
Simulated exception object displayed in Locals window
• Live symbolic execution
Property values in Locals and Watch windows
Method calls in Immediate window
Debugger Experience
Assumptions Window
Locals Window
Source
Debugger Experience
Properties
Virtual call
Execution Highlighting
• Error path highlighted with orange arrows.
• Skipped code is dimmed.
• Important conditions highlighted with green circles and red strikeouts for truth and falsity.
Execution Highlighting
Non-linear and other complicated execution highlighting.
Debugger Experience
• Call Stack for Interprocedural analysis
• Handles various types of function calls
Recursion
Virtual methods
Interface methods
Closures
Iterators
IL Interpretation
• Interprets system & 3rd-party libraries
• Parameter validation
Smarter & Faster Analysis
Product is driven by the belief that
• “Anything a human can do a computer can do!”
Emulate the human.
• Don’t try to create fancy data structures and algorithms.
• Represent real world ideas faithfully and approach the problem as a human would.
Automating Human Thought
One of the major themes of the past century has been the growing replacement of human thought by computer programs. Whole areas of business, scientific, medical, and governmental activities are now computerized, including sectors that we humans had thought belonged exclusively to us. … Computers can fly airplanes; they can supervise and execute manufacturing processes, diagnose illnesses, play music, publish journals, etc.
The frontiers of human thought are being pushed back by automated processes, forcing people, in many cases, to relinquish what they had previously been doing, and what they had previously regarded as their safe territory, but hopefully at the same time encouraging them to find new spheres of contemplation that are in no way threatened by computers.
A=B, Zeilberger (200x)
Automating Human Thought
Extended Static Checker
“The horizontal line in Figure 1 labeled the “decidability ceiling” reflects the well-known fact that the static detection of many errors of engineering importance (including array bounds errors, null dereferences, etc.) is undecidable. Nevertheless, we aim to catch these errors, since in our engineering experience, they are targets of choice after type errors have been corrected, and the kinds of programs that occur in undecidability proofs rarely occur in practice. To be of value, all a checker needs to do is handle enough simple cases and call attention to the remaining hard cases, which can then be the focus of a manual code review.”
Automating Human Thought
Diminishing Returns of Brute Force Search
• Longer analysis finds fewer bugs per buck
• Bugs found through deep chains of reasoning harder for humans to understand
• Search is like optimizing for random problems
Figure: number of simulation paths sampled by Prefix (X) against scan time and bug count (Y)
Automating Human Thought
“It is well known that for many NP-complete problems, such as K-Sat, etc., typical cases are easy to solve; so that computationally hard cases must be rare”
Order parameter & Phase Transition
• Hard instances occur around the critical value of the order parameter.
• The critical value separates the mostly solvable and mostly unsolvable regions, both of which are easy.
• At the critical value, a mix of ‘water and ice’
“Where the Really Hard Problems Are” (Cheeseman, Kanefsky & Taylor, 1991)
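The solvable/unsolvable transition can be observed on small random instances. The sketch below is illustrative only: it assumes random 3-SAT with the clause-to-variable ratio as the order parameter (a common choice in the literature, where the critical ratio for large instances is reported at roughly 4.3), and it uses brute-force enumeration, so it shows where instances flip from satisfiable to unsatisfiable rather than how hard they are to solve. The function names and parameters are hypothetical.

```python
import itertools
import random

def random_3sat(num_vars, num_clauses, rng):
    """Random 3-SAT instance: each clause has 3 distinct variables, random signs."""
    clauses = []
    for _ in range(num_clauses):
        variables = rng.sample(range(1, num_vars + 1), 3)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses

def satisfiable(clauses, num_vars):
    """Brute-force check over all 2**num_vars assignments (small instances only)."""
    for bits in itertools.product((False, True), repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

def sat_fraction(ratio, num_vars=10, trials=20, seed=0):
    """Fraction of random instances that are satisfiable at a given ratio."""
    rng = random.Random(seed)
    num_clauses = round(ratio * num_vars)
    hits = sum(satisfiable(random_3sat(num_vars, num_clauses, rng), num_vars)
               for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    for ratio in (1.0, 3.0, 4.3, 6.0, 8.0):
        print(ratio, sat_fraction(ratio))
```

Well below the critical ratio almost every instance is satisfiable, well above it almost none are, and the mixed cases cluster in between.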
Automating Human Thought
• “Exploring the Computational Tradeoff of more Reasoning and Less Searching” (Bacchus, 2002)
• A successful deterministic preprocessor used in SAT competition
Based on Binary Hyper-Resolution & Equality Reduction
Look-ahead technique (similar to path consistency in constraint solving)
Powerful enough to solve a number of SAT problems without search
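To make "reasoning instead of searching" concrete, here is a minimal sketch of a deterministic look-ahead on CNF formulas. It is not the preprocessor described above: in place of binary hyper-resolution and equality reduction it uses plain unit propagation plus failed-literal probing, a simpler technique from the same family. Clauses are lists of non-zero integers (a negative number is a negated variable); the function names are hypothetical.

```python
def unit_propagate(clauses, assignment):
    """Extend `assignment` (a set of true literals) by unit propagation.
    Returns the extended set, or None if some clause is falsified."""
    assignment = set(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue                      # clause already satisfied
            open_lits = [lit for lit in clause if -lit not in assignment]
            if not open_lits:
                return None                   # every literal is false: conflict
            if len(open_lits) == 1:
                assignment.add(open_lits[0])  # unit clause forces its literal
                changed = True
    return assignment

def probe(clauses):
    """Failed-literal look-ahead: if assuming a literal leads to a conflict,
    its negation is forced. Repeats until nothing new is learned.
    Returns the set of forced literals, or None if the formula is refuted."""
    forced = unit_propagate(clauses, set())
    if forced is None:
        return None
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    learned = True
    while learned:
        learned = False
        for v in variables:
            for lit in (v, -v):
                if lit in forced or -lit in forced:
                    continue
                if unit_propagate(clauses, forced | {lit}) is None:
                    forced = unit_propagate(clauses, forced | {-lit})
                    if forced is None:
                        return None           # both polarities fail
                    learned = True
    return forced

# (a or b) and (not a or b) and (not b or c): b and c are forced, a stays free.
print(probe([[1, 2], [-1, 2], [-2, 3]]))
```

No branching or backtracking is involved: every literal in the result follows deterministically from the clauses.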
Smarter & Faster Analysis
Based primarily on symbolic computation applied to programs -- computer algebraic techniques aided by deterministic reasoning.
Smarter & Faster Analysis
Computer Algebra
• Discrete Calculus
• Recurrences
• Term Rewriting
• Generating Functions
• Various Techniques
Automated Reasoning
• Paramodulation
• Binary Hyper-resolution
• Equality Reduction
• Conflict Analysis
• Modal Logic
Smarter & Faster Analysis
                    NStatic                       Other Software
Program Encoding    Functions                     Logic, Flow graphs
Primary Analysis    Computer Algebra              Theorem Proving, SAT Solving, Model Checking
Search              Term-rewriting, Look ahead    Nondeterministic Search
Symbolic Analysis   Symbolic                      Often not symbolic
NStatic Execution
1) Programs are first converted from imperative form into functions on state.
An exact expression is produced for any program variable instantly.
Expression is simplified into a normal form.
“Denotational Semantics: A Methodology for Language Development” (David Schmidt, 1986)
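The idea of reading an imperative program as a function on state can be sketched in a few lines. This is an illustration of the denotational style only, not NStatic's implementation: it runs on concrete integer states rather than symbolic expressions, and the helpers `assign`, `seq`, and `if_else` are hypothetical names.

```python
from typing import Callable, Dict

State = Dict[str, int]
Stmt = Callable[[State], State]

def assign(name: str, expr: Callable[[State], int]) -> Stmt:
    """Denotation of `name = expr`: maps a state to an updated state."""
    return lambda s: {**s, name: expr(s)}

def seq(*stmts: Stmt) -> Stmt:
    """Denotation of `s1; s2; ...`: function composition, left to right."""
    def run(s: State) -> State:
        for stmt in stmts:
            s = stmt(s)
        return s
    return run

def if_else(cond: Callable[[State], bool], then: Stmt, other: Stmt) -> Stmt:
    """Denotation of `if (cond) then else other`."""
    return lambda s: then(s) if cond(s) else other(s)

# x = x + 1; if (x > 10) y = 0; else y = x * 2;
program = seq(
    assign("x", lambda s: s["x"] + 1),
    if_else(lambda s: s["x"] > 10,
            assign("y", lambda s: 0),
            assign("y", lambda s: s["x"] * 2)),
)

print(program({"x": 3, "y": 0}))   # {'x': 4, 'y': 8}
```

Once the whole program is a single function, asking for the value of any variable is just a matter of applying that function, which is the property the slide relies on.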
NStatic Execution
Loop structures
converted to
equivalent
lambda
expressions.
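A loop can be given the same functional treatment by defining it as a recursive function on state, W(s) = W(body(s)) if cond(s) else s. The sketch below assumes concrete states and a hypothetical helper `while_loop`; it illustrates the conversion and is not taken from NStatic.

```python
def while_loop(cond, body):
    """Denotation of `while (cond) body` as a recursive function on state."""
    def w(state):
        return w(body(state)) if cond(state) else state
    return w

# i = 0; sum = 0; while (i < n) { sum += i; i += 1; }
loop = while_loop(
    lambda s: s["i"] < s["n"],
    lambda s: {**s, "sum": s["sum"] + s["i"], "i": s["i"] + 1},
)

result = loop({"i": 0, "sum": 0, "n": 5})
print(result["sum"])   # 10
```

The loop has disappeared as a control structure; what remains is an ordinary (recursive) lambda expression that can be manipulated like any other term.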
NStatic Execution
2) Expressions are then simplified algebraically to normal forms under the current set of assumptions
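A toy version of this simplification step is shown below. Expressions are nested tuples such as `("add", "x", 0)`, and assumptions are a set of facts taken to be true. The rule set is deliberately tiny and entirely hypothetical; it only shows the shape of bottom-up rewriting to a canonical form, not the rules NStatic actually applies.

```python
def is_num(e):
    """True for integer constants (booleans are excluded on purpose)."""
    return isinstance(e, int) and not isinstance(e, bool)

def simplify(expr, assumptions=frozenset()):
    """Rewrite `expr` bottom-up under `assumptions` (a set of facts held true)."""
    if isinstance(expr, tuple) and expr in assumptions:
        return True                       # an assumed fact simplifies to True
    if not isinstance(expr, tuple):
        return expr                       # constant or variable name
    op, *args = expr
    args = [simplify(a, assumptions) for a in args]
    if op == "add":
        a, b = args
        if is_num(a) and is_num(b):
            return a + b                  # constant folding
        if is_num(a) and a == 0:
            return b                      # 0 + x -> x
        if is_num(b) and b == 0:
            return a                      # x + 0 -> x
    if op == "mul":
        a, b = args
        if is_num(a) and is_num(b):
            return a * b
        if (is_num(a) and a == 0) or (is_num(b) and b == 0):
            return 0                      # x * 0 -> 0
        if is_num(a) and a == 1:
            return b                      # 1 * x -> x
        if is_num(b) and b == 1:
            return a                      # x * 1 -> x
    if op == "ite":                       # if-then-else
        cond, then, other = args
        if cond is True:
            return then
        if cond is False:
            return other
    if op in ("add", "mul"):
        args = sorted(args, key=repr)     # canonical order for commutative ops
    return (op, *args)

expr = ("ite", ("gt", "x", 0), ("add", ("mul", "y", 1), 0), 0)
print(simplify(expr))                                   # condition stays symbolic
print(simplify(expr, frozenset({("gt", "x", 0)})))      # 'y'
```

The same expression reduces further once `x > 0` is assumed, which is the sense in which the normal form depends on the current set of assumptions.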
Recursion Analysis
Recursive lambda expressions are…
… normalized to a single-argument recursive function taking an iteration argument (using the μ search operator from recursion theory)
… solved using combinatorics (generating functions)
… composed together rather than simply using beta reduction
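As a worked illustration, take the loop `i = 0; s = 0; while (i < n) { s += i; i += 1; }`. Indexed by the iteration count k, its variables satisfy i(k+1) = i(k) + 1 and s(k+1) = s(k) + i(k), whose closed forms are i(k) = k and s(k) = k(k-1)/2 (derived by hand here; the generating function of s is z^2/(1-z)^3). The exit iteration is the least k at which the guard fails, which is what the μ operator computes. The code below is a sketch under those assumptions, with hypothetical function names.

```python
def mu(pred):
    """Unbounded minimization: least k >= 0 with pred(k) true.
    Like the mu operator, it does not terminate if no such k exists."""
    k = 0
    while not pred(k):
        k += 1
    return k

def i_at(k):
    """Value of i after k iterations: solves i(k+1) = i(k) + 1, i(0) = 0."""
    return k

def s_at(k):
    """Value of s after k iterations: solves s(k+1) = s(k) + i(k), s(0) = 0."""
    return k * (k - 1) // 2

def loop_result(n):
    """Final value of s, computed without executing the loop body."""
    exit_k = mu(lambda k: not (i_at(k) < n))   # first k where the guard fails
    return s_at(exit_k)

print(loop_result(5))   # 10
```

In a symbolic setting the call to `mu` would itself be solved algebraically (here the answer is max(n, 0)), leaving a loop-free expression for the result.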
Equation Solving
NStatic uses techniques from theorem proving and constraint solving to solve equations, boolean expressions, and inductive proofs.
Specifications
• Specification Keywords
Preconditions
Postconditions
Asserts
• Interaction with “Code Contracts”
Utilize Microsoft namespace for “Code contracts”
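The three kinds of specification can be illustrated with a small sketch. It is written in Python rather than against the .NET Code Contracts API, the `contract` decorator is a hypothetical helper, and it checks the conditions at run time, whereas a static analyzer would try to establish or refute them without running the code.

```python
import functools

def contract(requires=None, ensures=None):
    """Attach a precondition and a postcondition to a function (runtime check)."""
    def wrap(fn):
        @functools.wraps(fn)
        def checked(*args):
            if requires is not None:
                assert requires(*args), f"precondition of {fn.__name__} violated"
            result = fn(*args)
            if ensures is not None:
                assert ensures(result, *args), f"postcondition of {fn.__name__} violated"
            return result
        return checked
    return wrap

@contract(requires=lambda xs: len(xs) > 0,
          ensures=lambda r, xs: r in xs and all(r >= x for x in xs))
def largest(xs):
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
        assert best >= x          # an inline assert inside the body
    return best

print(largest([3, 9, 2]))   # 9
```

Calling `largest([])` violates the precondition; a static tool would aim to report such a call site before the program is ever run.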
Potential Operators
• Modal operators: necessary, possible
based entirely on the modal axioms, not Kripke models
• Quantifiers
forall, exists
• Temporal operators
necessary to express liveness and safety properties
eventually (F), always (G), until
transformed to an equivalent expression, which could potentially be as large as the program itself.
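For readers unfamiliar with the temporal operators, the sketch below gives their meaning on a finite, already-recorded trace of states. This is a simplification chosen for illustration: the slide describes transforming the operators into equivalent expressions over the program, not checking traces, and the function names are hypothetical.

```python
def always(p, trace):
    """G p: p holds in every state of the trace (a safety property)."""
    return all(p(state) for state in trace)

def eventually(p, trace):
    """F p: p holds in at least one state of the trace (a liveness property)."""
    return any(p(state) for state in trace)

def until(p, q, trace):
    """p U q: q holds at some state, and p holds at every state before it."""
    for state in trace:
        if q(state):
            return True
        if not p(state):
            return False
    return False                  # q never happened on this trace

trace = [
    {"req": True, "ack": False},
    {"req": True, "ack": False},
    {"req": False, "ack": True},
]

print(always(lambda s: not (s["req"] and s["ack"]), trace))      # True
print(eventually(lambda s: s["ack"], trace))                     # True
print(until(lambda s: s["req"], lambda s: s["ack"], trace))      # True
```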
Halting Problem?
• halting e
transforms itself to an equivalent expression without the original halting operator
returns true, false, <nontermination>, or a symbolic expression evaluating to one of the three values
• Example: Collatz conjecture program
returns essentially the same function with a boolean result
• Example: Halting counterexample
returns <nontermination>
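For reference, a Collatz program looks like the sketch below. Whether its loop exits for every positive n is exactly the Collatz conjecture, which is one plausible reading of why a halting query on it can only hand back an equally hard boolean-valued function. The helper `halts_within` is a hypothetical, bounded stand-in for a halting query and is not how NStatic works.

```python
def collatz_step(n):
    return n // 2 if n % 2 == 0 else 3 * n + 1

def collatz_steps(n):
    """The program in question: loops until n reaches 1 (requires n >= 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        n = collatz_step(n)
        steps += 1
    return steps

def halts_within(n, fuel):
    """True if the loop exits within `fuel` iterations; otherwise None,
    meaning 'unknown' (it does NOT mean the program fails to halt)."""
    for _ in range(fuel):
        if n == 1:
            return True
        n = collatz_step(n)
    return True if n == 1 else None

print(collatz_steps(6))      # 8
print(halts_within(6, 7))    # None
```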
Benefits of Symbolic Computation
algorithm synthesis
algorithm verification
termination analysis of algorithms
timing analysis
complexity analysis of algorithms
extraction of specifications from algorithms
generation of inductive assertions for algorithms
algorithm transformations
query languages