Static and Dynamic Protection from Vulnerabilities for Web Applications
Benjamin Livshits
SUIF Compiler Group, Computer Science Lab
Stanford University
2
Real-Life Hacking Stories

• blogger.com cracked – Aug. 2005
• Firefox marketing site hacked – Jul. 2005
• MS UK defaced in hacking attack – Jul. 2005
• Hacker hits Duke system – Jun. 2005
• MSN site hacked in South Korea – Jun. 2005
• MSN site hacking went undetected for days – Jun. 2005
• Phishers manipulate SunTrust site to steal data – Sep. 2004
• Tower Records settles charges over hack attacks – Apr. 2004
• Western Union Web site hacked – Sep. 2000
• 75% of all security attacks today are at the application level*
• 97% of 300+ audited sites were vulnerable to Web application attacks*
• $300K average financial loss from unauthorized access or info theft**
• Average $100K/hour lost during downtime
* Source: Gartner Research
** Source: Computer Security Institute survey
3
Key Insight

• Bugs in application code lead to vulnerabilities
• Vulnerabilities lead to security breaches
  – Information leaks = stolen sensitive data
  – Write access to unauthorized data = fraud
4
Simple Web App
• Web form allows user to look up account details
• Underneath – a Java Web application serving the requests
5
SQL Injection Example

• Happy-go-lucky SQL statement:

String query = "SELECT Username, UserID, Password FROM Users WHERE username = '" + user + "' AND password = '" + password + "'";

• Leads to SQL injection
  – One of the most common Web application vulnerabilities, caused by lack of input validation
• But how?
  – Typical way to construct a SQL query using concatenation
  – Looks benign on the surface
  – But let's play with it a bit more…
6
Injecting Malicious Data (1)

query = "SELECT Username, UserID, Password
         FROM Users WHERE
         Username = 'bob'
         AND Password = '********'"
7
Injecting Malicious Data (2)

query = "SELECT Username, UserID, Password
         FROM Users WHERE
         Username = 'bob'--
         ' AND Password = ''"
8
Injecting Malicious Data (3)

query = "SELECT Username, UserID, Password
         FROM Users WHERE
         Username = 'bob'; DROP Users--
         ' AND Password = ''"
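The standard defense against the attacks above is a parameterized query. A minimal sketch, using the table and column names from the slides (the helper names and class are illustrative, not from the deck):

```java
import java.sql.*;

class SqlInjectionDemo {
    // Unsafe: builds the query by concatenation, as on the slide.
    static String unsafeQuery(String user, String password) {
        return "SELECT Username, UserID, Password FROM Users"
             + " WHERE Username = '" + user + "' AND Password = '" + password + "'";
    }

    // Safe sketch: placeholders keep user input as data, never as SQL text.
    static PreparedStatement safeQuery(Connection con, String user, String password)
            throws SQLException {
        PreparedStatement ps = con.prepareStatement(
            "SELECT Username, UserID, Password FROM Users"
            + " WHERE Username = ? AND Password = ?");
        ps.setString(1, user);
        ps.setString(2, password);
        return ps;
    }
}
```

With the unsafe version, the injected input becomes part of the SQL text; with placeholders it cannot.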
9
Summary of Attack Techniques
1. Inject (taint sources)
• Parameter manipulation
• Hidden field manipulation
• Header manipulation
• Cookie poisoning
• Second-level injection
2. Exploit (taint sinks)
• SQL injections
• Cross-site scripting
• HTTP request splitting
• HTTP request smuggling
• Path traversal
• Command injection
1. Header manipulation + 2. HTTP splitting = vulnerability
Input and output validation are at the core of the issue
10
Which Vulnerabilities are Most Prevalent?
• http://www.SecurityFocus.com
  – 500 vulnerability reports
  – The week of Nov. 26 – Dec. 3, 2005

Chart: breakdown of the reports by category – input/output validation dominates at 58%; the remainder includes denial of service, privilege escalation, unauthorized access, memory corruption, temporary file manipulation, file include, heap overflow, remote access, password management, race conditions, file upload issues, authentication bypass, encryption issues, and other
11
Focusing on Input/Output Validation
Chart: breakdown of input/output validation vulnerabilities – SQL injection, HTML injection, cross-site scripting, HTTP response splitting, path traversal, format string, integer overflow, buffer overruns, information disclosure, code execution, and other; the three largest categories account for 30%, 19%, and 18%

• SQL injection and cross-site scripting are most prevalent
• Buffer overruns are losing their market share
12
Overview of Our Approach
Overview
Static
Dynamic
Experiments
Extensions
Conclusions
Future
13
Overview of Our Approach
Static analysis
• Targets developers
• Finds vulnerabilities early in the development cycle
• Sound, so finds all vulnerabilities of a particular type
• Can be run after every build, ensuring continuous security

Dynamic analysis
• Targets system administrators
• Prevents vulnerabilities from doing harm
• Safe mode for Web application execution
• Can quarantine suspicious actions; application continues to run
• No false positives

• Dynamic analysis incurs an overhead
• Static results optimize dynamic overhead
• Overhead often drops from 50% to under 1%
14
Describing Vulnerabilities
• Described using PQL [OOPSLA'05]
  – General language for describing events on objects
  – Used for other applications:
    • Finding memory leaks
    • Mismatched method pairs
    • Serialization errors
    • Unsafe password manipulation
    • Finding optimization opportunities
    • etc.
• Simple example
  – SQL injections caused by parameter manipulation
  – Looks like a code snippet:

query simpleSQLInjection
returns
    object String param, derived;
uses
    object HttpServletRequest req;
    object Connection con;
    object StringBuffer temp;
matches {
    param = req.getParameter(_);
    temp.append(param);
    derived = temp.toString();
    con.executeQuery(derived);
}

• Real queries are longer and more involved
  – Describing all vulnerabilities we are looking for: 159 lines of PQL
  – But it is suitable for pretty much all J2EE applications
15
Analysis Framework Architecture
Diagram: a specification in PQL (sources, sinks, derivations) feeds both the static analysis, which produces warnings, and the dynamic analysis, which produces instrumented applications
16
Importance of a Sound Solution
• A sound solution can detect all bugs
  – Sound means the tool cannot miss a bug
  – No warnings reported => no bugs
  – Especially attractive for security
  – Provides guarantees about the security posture of an application
• A sound solution is hard to get
  – Tension between soundness and precision
    • Soundness means "lots of false positives" to some
    • With care, a sound solution can remain precise
  – Need to analyze all of the program
    • Hard because Java allows dynamic class loading
  – Need to have a complete spec of what to look for
• Soundness statement:

Our analysis finds all vulnerabilities in statically analyzed code that are captured by the specification
17
Static Analysis
19
Why Pointer Analysis?
// get Web form parameter
String param = request.getParameter(...);
...
// execute query
con.executeQuery(query);

• Imagine manually auditing an application
  – Two statements somewhere in the program
  – Can these variables refer to the same object?
• Question answered by pointer analysis...
20
Pointers in Java?

• Java references are pointers in disguise

Diagram: variables on the stack referencing objects on the heap
21
Taint Propagation

String session.ParameterParser.getRawParameter(String name) – ParameterParser.java:586

public String getRawParameter(String name) throws ParameterNotFoundException {
    String[] values = request.getParameterValues(name);
    if (values == null) {
        throw new ParameterNotFoundException(name + " not found");
    } else if (values[0].length() == 0) {
        throw new ParameterNotFoundException(name + " was empty");
    }
    return (values[0]);
}

String session.ParameterParser.getRawParameter(String name, String def) – ParameterParser.java:570

public String getRawParameter(String name, String def) {
    try {
        return getRawParameter(name);
    } catch (Exception e) {
        return def;
    }
}

Element lessons.ChallengeScreen.doStage2(WebSession s) – ChallengeScreen.java:194

String user = s.getParser().getRawParameter(USER, "");
StringBuffer tmp = new StringBuffer();
tmp.append("SELECT cc_type, cc_number from user_data WHERE userid = '");
tmp.append(user);
tmp.append("'");
query = tmp.toString();
Vector v = new Vector();
try {
    ResultSet results = statement3.executeQuery(query);
    ...
22
What Does Pointer Analysis Do for Us?
• Statically, the same object can be passed around in the program:
  – Passed in as parameters
  – Returned from functions
  – Deposited to and retrieved from data structures
  – All along it is referred to by different variables
• Pointer analysis "summarizes" these operations:
  – Doesn't matter what variables refer to it
  – We can follow the object throughout the program
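The situation the analysis must handle can be written out as a small sketch (the class and variable names are illustrative, not from the deck):

```java
import java.util.ArrayList;
import java.util.List;

class AliasDemo {
    // The object is returned from a function unchanged.
    static String pass(String s) { return s; }

    // One object travels through a parameter, a return value, and a
    // container; param, a, and b are different names for the same object.
    static boolean sameObject() {
        String param = new String("tainted");  // single allocation site
        String a = pass(param);                // returned from a function
        List<String> store = new ArrayList<>();
        store.add(a);                          // deposited into a data structure
        String b = store.get(0);               // retrieved from it
        return param == b;                     // reference equality holds
    }
}
```

A pointer analysis concludes that param, a, and b may all refer to the one object allocated at the `new String(...)` site, no matter which variable names it along the way.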
23
Pointer Analysis Background

• Question:
  – Determine what objects a given variable may refer to
  – A classic compiler problem for 20+ years
• Until recently, sound analysis implied lack of precision
  – We want to have both soundness and precision
• Context-sensitive inclusion-based pointer analysis
  – Whaley and Lam [PLDI'04]
  – Recent breakthrough in pointer analysis technology
  – An analysis that is both scalable and precise
  – Context sensitivity greatly contributes to the precision
24
Importance of Context Sensitivity (1)
String id(String str) { return str; }

Diagram: two call sites, c1 passing a tainted argument and c2 passing an untainted one; a context-sensitive analysis matches each return value to its own call site

• Distinguishing between different calling contexts
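The two calling contexts from the diagram look like this in plain Java (the variable names are illustrative):

```java
class ContextDemo {
    // The identity method from the slide.
    static String id(String str) { return str; }

    public static void main(String[] args) {
        String tainted = "user-controlled input";   // call site c1
        String untainted = "compile-time constant"; // call site c2
        String r1 = id(tainted);    // context-sensitive analysis taints only r1
        String r2 = id(untainted);  // r2 stays untainted: c2 is tracked separately
        System.out.println(r1 + " / " + r2);
    }
}
```

A context-insensitive analysis would merge c1 and c2 into one summary of id() and report both r1 and r2 as tainted.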
25
Importance of Context Sensitivity (2)
String id(String str) { return str; }

Diagram: with contexts merged, the tainted and untainted call sites share one summary of id(), so the untainted result is also marked tainted

Excessive tainting!
26
Pointer Analysis Object Naming
• Static analysis approximates dynamic behavior
• Some approximation is necessary
  – Unbounded number of dynamic objects
  – Finite number of static entities for analysis
• Allocation-site object naming – de facto standard
  – Dynamic objects are represented by the line of code (allocation site) that allocates them
  – Can be imprecise
    • Two dynamic objects allocated at the same site have the same static representation
    • Works well most of the time, but not always
27
Imprecision with Default Object Naming
Diagram: calls from foo.java:45 and bar.java:30 both reach

700: String toLowerCase(String str) {
       …
725:   return new String(…);
726: }

so the two dynamic objects (String.java:725 #1 and String.java:725 #2) share the single static name String.java:725

• All objects returned by String.toLowerCase() are allocated in the same place
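The conflation is easy to reproduce: every call to a factory method yields a fresh dynamic object, yet all of them share the one allocation site inside the factory. A sketch (the names below are illustrative):

```java
class NamingDemo {
    // Factory method: the single `new String(...)` below is the only
    // allocation site the analysis sees for every returned object.
    static String makeCopy(String s) {
        return new String(s);
    }

    static boolean distinctObjects() {
        String x = makeCopy("called from one site");
        String y = makeCopy("called from another");
        // Two distinct runtime objects, one static name for both.
        return x != y;
    }
}
```

Under default allocation-site naming, x and y get the same static name, so taint on one spuriously flows to the other.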
28
Improved Object Naming
• We introduced an enhanced object naming for:
  1. Containers: HashMap, Vector, LinkedList, …
  2. Factory functions: String.toLowerCase(), …
• Very effective at increasing precision
  – Avoids false positives in all apps but one
  – All FPs caused by a single factory method
  – Improving naming further gets rid of all FPs
29
Simple SQL Injection Query Translated
query simpleSQLInjection
returns
    object String param, derived;
uses
    object HttpServletRequest req;
    object Connection con;
    object StringBuffer temp;
matches {
    param = req.getParameter(_);
    temp.append(param);
    derived = temp.toString();
    con.executeQuery(derived);
}

• PQL is automatically translated into Datalog
  – Syntax-driven translation
  – Obviates the need for hand-written Datalog
30
Analysis in Datalog
simpleSQLInjection(hparam, hderived) :–
    ret(i1, v1),
    call(c1, i1, "HttpServletRequest.getParameter"),
    pointsto(c1, v1, hparam),

    actual(i2, v2, 0), actual(i2, v3, 1),
    call(c2, i2, "StringBuffer.append"),
    pointsto(c2, v2, htemp),
    pointsto(c2, v3, hparam),

    actual(i3, v4, 0), ret(i3, v5),
    call(c3, i3, "StringBuffer.toString"),
    pointsto(c3, v4, htemp),
    pointsto(c3, v5, hderived),

    actual(i4, v6, 0), actual(i4, v7, 1),
    call(c4, i4, "Connection.executeQuery"),
    pointsto(c4, v6, hcon),
    pointsto(c4, v7, hderived).
31
Dynamic Analysis
33
Preventing Vulnerabilities
query main()
returns
    object Object source, sink;
uses
    object java.sql.Connection con;
    object java.sql.Statement stmt;
matches {
    source := UserSource();
    sink := StringPropStar(source);
}
replaces con.prepareStatement(sink)
    with SQL.SafePrepare(con, source, sink);
replaces stmt.executeQuery(sink)
    with SQL.SafeExecute(stmt, source, sink);
34
PQL Instrumentation Engine
• PQL is translated into bytecode instrumentation
  – State machines interpret PQL queries and run alongside the program
  – Keep track of partial matches
  – Execute recovery code on a query match when necessary
• Provides a "safe execution mode" for Web applications
  – Smart and customizable (compare to perl –T)
  – User can insert recovery code
• "Finding Application Errors and Security Flaws Using PQL: a Program Query Language", Michael Martin, Benjamin Livshits, and Monica S. Lam, [OOPSLA'05]
Diagram: the matcher state machine advances partial matches such as {x=o3} and {x=y=o3} on events like t.append(x), y := x, t = x.toString(), and derived-value (ε) transitions
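A recovery routine in the spirit of the SQL.SafeExecute call above might, for instance, refuse to run a query whose tainted source value carries SQL metacharacters. This is purely a hypothetical sketch; the deck does not show the real implementation:

```java
class SafeSql {
    // Hypothetical recovery code, invoked in place of the original sink.
    // If the tainted source value contains quote or separator characters,
    // quarantine the action; otherwise let the query through.
    static String safeExecute(String source, String query) {
        if (source != null && (source.indexOf('\'') >= 0
                || source.indexOf('"') >= 0
                || source.indexOf(';') >= 0)) {
            // The application keeps running; only this action is blocked.
            throw new SecurityException("possible SQL injection blocked");
        }
        return query;  // in the real system the query would execute here
    }
}
```

The point of the replaces clause is exactly this: the user supplies the recovery policy, and the instrumented application continues to run after a blocked action.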
35
Experimental Results
36
Benchmarks for Our Experiments
• Benchmark suite: Stanford SecuriBench
  – Publicly available (Google: SecuriBench)
  – "Defining a Set of Common Benchmarks for Web Application Security", Benjamin Livshits, Workshop on Defining the State of the Art in Software Security Tools, 2005
  – SecuriBench Micro is coming out soon
• Widely used programs
  – Suite of 9 large open-source Java benchmark applications
  – Most are blogging/bulletin board applications
  – Installed at a variety of Web sites
  – Thousands of users combined
• Applied our static & dynamic analysis to these applications
  – Reused the same J2EE PQL query for all
  – Statically: found bugs, measured false positives
  – Dynamically: prevented bugs, measured the overhead
37
Benchmark Statistics
Benchmark        Version      File count   Line count   Analyzed classes
jboard           0.3          90           17,542       264
blueblog         1            32           4,191        306
webgoat          0.9          77           19,440       349
blojsom          1.9.6        61           14,448       428
personalblog     1.2.6        39           5,591        611
snipsnap         1.0-BETA-1   445          36,745       653
road2hibernate   2.1.4        2            140          867
pebble           1.6-beta1    333          36,544       889
roller           0.9.9        276          52,089       989
Total                         1,355        186,730      5,356

• Real-life large open source Web applications
• Released as Stanford SecuriBench
38
Classification of Errors

Sources \ Sinks          SQL injection   HTTP splitting   Cross-site scripting   Path traversal   Total
Header manipulation      0               6                4                      0                10
Parameter manipulation   6               5                0                      2                13
Cookie poisoning         1               0                0                      0                1
Non-Web inputs           2               0                0                      3                5
Total                    9               11               4                      5                29

• Total of 29 vulnerabilities found
• We are sound: all analysis versions report them
42
Validating the Vulnerabilities
• Reported issues back to program maintainers
  – Most of them responded
  – Most reported vulnerabilities confirmed as exploitable
• More than a dozen code fixes
• Often difficult to convince people that a statically detected vulnerability is exploitable
  – Had to convince some by writing exploits
  – Library maintainers blamed application writers for the vulnerabilities
44
False Positives
Chart: false positive counts across analysis versions – default vs. improved object naming, context-insensitive (least precise) vs. context-sensitive (most precise)

Remaining 12 false positives for the most precise analysis version
45
Instrumented Executables
• Experimental confirmation
  – Found and prevented errors in our experiments
  – Blocked exploits at runtime (2 SQL injections)
• Naïve implementation:
  – Instrument every string operation
  – Overhead is relatively high
  – Use static information to narrow down the scope of instrumentation
• Overhead:
  – Unoptimized version: 9–125%, 57% average
  – Optimized version: 1–37%, 14% average
  – Static optimization removes 82–99% of instrumentation points
46
Extensions
47
Beyond The Basics
Diagram: extensions built around the basic analysis framework (PQL + bddbddb, [Usenix Security '05]), along two axes:
• Completeness:
  – Reflection [APLAS '05]
  – Derivation methods [to be published]
  – Specification discovery [FSE '05, to be published]
• Precision:
  – Containers & factories
  – Object sensitivity, etc. [to be published]

Our analysis finds all vulnerabilities in statically analyzed code that are captured by the specification
48
Reflection
49
The Issue of Reflection
• Most analyses for Java ignore reflection
  – Fine approach for a while: SpecJVM hardly uses reflection at all
• Can no longer get away with this
  – Reflection extremely common:
    • JBoss, Tomcat, Eclipse, etc. are reflection-based
    • Same is true about Web apps; EJBs are entirely reflection-based
• Call graph is incomplete
  – Code not analyzed => bugs are missed
  – Ignoring reflection misses half of the application and more

Chart: missing portions of the call graph (# methods) for jgap, freetts, gruntspud, jedit, columba, jfreechart
50
Reflection Resolution Algorithm
• Developed the first call graph construction algorithm to explicitly deal with the issue of reflection
  – Uses points-to analysis for call graph discovery
  – Finds specification points
  – Type casts in the program are used to reduce specification effort
• Applied to 6 large apps, 190,000 LOC combined
  – About 95% of calls to Class.forName are resolved at least partially without any specs
  – There are some "stubborn" calls that require user-provided specification or cast-based approximation
  – Cast-based approach reduces the specification burden
• Reflection resolution significantly increases call graph size:
  – As much as 7X more methods
  – Adds 7,000+ new methods for some benchmarks
• "Reflection Analysis for Java", Benjamin Livshits, John Whaley, and Monica S. Lam, [APLAS'05]
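The cast-based approximation can be illustrated with a small sketch (hypothetical code, not from the paper): a cast that follows Class.forName(...).newInstance() bounds the set of classes the analysis must consider to the subtypes of the cast target.

```java
class ReflectionDemo {
    // The class name is not known statically, but the cast below tells a
    // static analysis that the result must be a subtype of Runnable,
    // narrowing the possible call graph targets without a user spec.
    static Runnable load(String className) {
        try {
            Object instance = Class.forName(className)
                    .getDeclaredConstructor().newInstance();
            return (Runnable) instance;  // specification point via type cast
        } catch (Exception e) {
            return null;
        }
    }
}
```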
51
Derivation Routines
52
Flow of Taint
Diagram: taint flows from sources through derivation methods such as StringBuffer.append and String.substring toward sinks, where it causes a security violation; a sanitizer on the path stops the flow

How do we know what these are?
53
Finding Derivation Routines
• Many standard derivation routines, such as:
  – String.toLowerCase(), String.replace(…), String.insert(…), String.substring(…), String.concat(…)
  – StringBuffer.append(…), …
  – StringTokenizer.nextElement(…), StringTokenizer.nextToken(…), …
• Many application-specific derivation routines as well
  – Many methods that manipulate string values
  – Involve low-level character operations
• Derivation routines can play different roles
  – Depends on the analysis (sources & sinks)
  – Some of them work as derivation routines
  – Others are sanitizers
54
Derivation Routines (1)
public static String filterNewlines(String s) {
    if (s == null) {
        return null;
    }
    StringBuffer buf = new StringBuffer(s.length());
    // loop through characters and replace if necessary
    int length = s.length();
    for (int i = 0; i < length; i++) {
        switch (s.charAt(i)) {
        case '\n':
            break;
        default:
            buf.append(s.charAt(i));
        }
    }
    return buf.toString();
}
55
Derivation Routines (2)
public static String removeNonAlphanumeric(String str) {
    StringBuffer ret = new StringBuffer(str.length());
    char[] testChars = str.toCharArray();
    for (int i = 0; i < testChars.length; i++) {
        // MR: Allow periods in page links
        if (Character.isLetterOrDigit(testChars[i]) || testChars[i] == '.') {
            ret.append(testChars[i]);
        }
    }
    return ret.toString();
}
56
Derivation Routines Statistics
                 String methods         char[] methods
Benchmark        JDK   App   Total     JDK   App   Total
jboard           7     0     7         7     1     8
blueblog         9     0     9         7     0     7
webgoat          8     5     13        11    0     11
blojsom          14    0     14        11    0     11
personalblog     9     3     12        7     0     7
snipsnap         15    5     20        19    1     20
road2hibernate   12    6     18        12    0     12
pebble           21    1     22        12    0     12
roller           0     0     0         0     0     0
Totals           95    20    115       86    2     88

• Developed a specialized analysis that computes method summaries
  – "return value depends on parameter i"
  – "parameter i depends on parameter j"
• Deals with character assignment, etc.
57
Conclusions
58
Conclusions
• Web application security is a huge problem
  – SQL injections, cross-site scripting, etc. are dominating vulnerability reports
• Hybrid static & dynamic solution
  – Static: detection early in the development cycle
  – Dynamic: exploit prevention and recovery
• Found several dozen bugs:
  – Most fixed by developers right away
  – Prevented exploits at runtime
  – Significant reduction in overhead with static optimization
  – Working on analyzing more code
• Extensions (common to most bug finding tools)
  – Reflection
  – User-defined derivation descriptors
  – Specification completeness
59
Project Status
• Griffin Security Project– http://suif.stanford.edu/~livshits/work/griffin/
• Stanford SecuriBench & Stanford Securibench Micro– http://suif.stanford.edu/~livshits/securibench
• PQL language and dynamic instrumentation framework– http://pql.sourceforge.net/
• bddbddb program analysis system
• joeq Java compiler infrastructure– http://joeq.sourceforge.net/
60
References (1)
Publications:

1. Reflection Analysis for Java. Benjamin Livshits, John Whaley, and Monica S. Lam. Third Asian Symposium on Programming Languages and Systems (APLAS), Tsukuba, Japan, November 2005.
2. Finding Application Errors and Security Flaws Using PQL: a Program Query Language. Michael Martin, Benjamin Livshits, and Monica S. Lam. 20th Annual ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), San Diego, California, October 2005.
3. DynaMine: Finding Common Error Patterns by Mining Software Revision Histories. Benjamin Livshits and Thomas Zimmermann. ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE 2005), Lisbon, Portugal, September 2005.
4. Defining a Set of Common Benchmarks for Web Application Security. Benjamin Livshits. Position paper on Stanford SecuriBench, Workshop on Defining the State of the Art in Software Security Tools, Baltimore, August 2005.
5. Finding Security Vulnerabilities in Java Applications with Static Analysis. Benjamin Livshits and Monica S. Lam. Proceedings of the Usenix Security Symposium, Baltimore, Maryland, August 2005.
6. Locating Matching Method Calls by Mining Revision History Data. Benjamin Livshits and Thomas Zimmermann. Proceedings of the Workshop on the Evaluation of Software Defect Detection Tools, Chicago, Illinois, June 2005.
61
References (2)

7. Context-Sensitive Program Analysis as Database Queries. Monica S. Lam, John Whaley, Benjamin Livshits, Michael Martin, Dzintars Avots, Michael Carbin, and Christopher Unkel. Proceedings of Principles of Database Systems (PODS), Baltimore, Maryland, June 2005.
8. Improving Software Security with a C Pointer Analysis. Dzintars Avots, Michael Dalton, Benjamin Livshits, and Monica S. Lam. Proceedings of the 27th International Conference on Software Engineering (ICSE), May 2005.
9. Turning Eclipse Against Itself: Finding Bugs in Eclipse Code Using Lightweight Static Analysis. Benjamin Livshits. Proceedings of the EclipseCon Research Exchange Workshop, March 2005.
10. Finding Security Errors in Java Applications Using Lightweight Static Analysis. Benjamin Livshits. Annual Computer Security Applications Conference, Work-in-Progress Report, November 2004.
11. Tracking Pointers with Path and Context Sensitivity for Bug Detection in C Programs. Benjamin Livshits and Monica S. Lam. Proceedings of the 11th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE-11), September 2003.

Technical Reports:

1. Reflection Analysis for Java. Benjamin Livshits, John Whaley, and Monica S. Lam.
2. Turning Eclipse Against Itself: Improving the Quality of Eclipse Plugins. Benjamin Livshits.
3. Finding Security Vulnerabilities in Java Applications with Static Analysis. Benjamin Livshits and Monica S. Lam.
62
Future Paper Plans
1. Analyzing Sanitization Routines
2. Learning Specification from Runtime Histories
3. Analyze Sources of Imprecision in Datalog
4. Serialization or Cloning Analysis
5. Analysis of Parsers Written in Java
6. Partitioned BDDs to Enhance Scalability of bddbddb
7. Attack Vectors in Library Code
8. Using Model Checking to Break Sanitizers
9. Applying Model-checking to Servlet Interaction
63
The End.
Extra slides follow
64
Frequency of Vulnerabilities
Chart: % of applications vulnerable (0–100) by category – XSS, extension manipulation, insufficient authorization, content spoofing, SQL injection, verbose error messages, abuse of functionality
65
Web Application Security Space

• automatic code scanning tools
• manual code reviews
• black-box testing solutions
• application firewalls
• manual penetration testing
• client-side protection
70
Cost of Web App Security Breaches
• Average $303K financial loss from unauthorized access*
• Average $355K financial loss from theft of proprietary info*
• Estimated $400B USD/year total cost of online fraud and abuse**
* Source: Computer Security Institute survey** Source: US Department of Justice report
71
Other Extensions
• Using machine learning techniques to complete the PQL specification
  – It's difficult to get a specification that's 100% complete
  – If it's not, some bugs are missed
  – "DynaMine: Finding Common Error Patterns by Mining Software Revision Histories", Ben Livshits and Thomas Zimmermann, [FSE'05]
  – Especially true with custom sanitization routines
• Partitioned BDDs for better scalability
  – Higher precision requirements push scalability limits of the bddbddb tool
  – One hope is to use Partitioned BDDs (POBDDs) to scale the problem better
• Applying model-checking to servlet interaction
  – Our analysis relies on a harness that we automatically generate
  – Only finds bugs that appear within a single interaction