These are the slides from a talk "sqlmap - Under the Hood" held at PHDays 2013 conference (Russia / Moscow 23rd–24th May 2013) by Miroslav Stampar.
sqlmap – Under the Hood
Miroslav Štampar([email protected])
PHDays 2013, Moscow (Russia) May 23, 2013 2
BigArray
Support for huge table dumps (e.g. millions of rows)
Raw data needs to be held somewhere before being processed (and eventually stored)
In-memory storage was a good enough choice until recent years (user appetites grew)
Avoidance of MemoryError
Memory mapping into smaller chunks/pages (e.g. 4096 entries)
Temporary files are used for storing chunks
O(1) read/write access (page table principle)
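The paging idea above can be sketched as follows. This is a minimal illustration, not sqlmap's actual BigArray class: the name `ChunkedList`, the single-chunk cache and the use of pickle for chunk files are all assumptions made for the example.

```python
import os
import pickle
import tempfile

CHUNK_LENGTH = 4096  # entries per chunk, matching the slide's example


class ChunkedList:
    """Append-only list that spills full chunks to temporary files (hypothetical sketch)."""

    def __init__(self, chunk_length=CHUNK_LENGTH):
        self.chunk_length = chunk_length
        self.filenames = []         # "page table": chunk number -> temporary file
        self.tail = []              # chunk currently being filled, kept in memory
        self.cache = (None, None)   # (chunk number, chunk content) last read from disk

    def append(self, value):
        self.tail.append(value)
        if len(self.tail) >= self.chunk_length:
            # Chunk is full: dump it to a temporary file and start a new one
            handle, filename = tempfile.mkstemp(prefix="bigarray-")
            with os.fdopen(handle, "wb") as f:
                pickle.dump(self.tail, f, pickle.HIGHEST_PROTOCOL)
            self.filenames.append(filename)
            self.tail = []

    def __len__(self):
        return len(self.filenames) * self.chunk_length + len(self.tail)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Chunk number and offset follow directly from the index (no searching)
        chunk, offset = divmod(index, self.chunk_length)
        if chunk == len(self.filenames):
            return self.tail[offset]
        if self.cache[0] != chunk:
            with open(self.filenames[chunk], "rb") as f:
                self.cache = (chunk, pickle.load(f))
        return self.cache[1][offset]

    def close(self):
        # Remove all temporary chunk files
        for filename in self.filenames:
            os.remove(filename)
        self.filenames = []
```

Only one on-disk chunk plus the tail is held in memory at any time, so memory use stays bounded regardless of the number of stored rows.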
HashDB
Storage of resumable session data at a centralized place (local SQLite3 database)
Non-ASCII values are automatically serialized/deserialized (pickle)
INSERT INTO storage VALUES (LONG(MD5(target_url || key || MILESTONE_SALT)[:8]), stored_value)
MILESTONE_SALT is changed whenever a change in the HashDB mechanism breaks compatibility with previous versions
key uniquely describes stored_value for a given target_url (e.g. KB_INJECTIONS, SELECT banner FROM v$version WHERE ROWNUM=1, etc.)
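A minimal sketch of the storage scheme above, using only the standard library. Several details are assumptions made for illustration: the class name `SessionStore`, the salt value, reading `[:8]` as the first 8 hexadecimal digits of the MD5 digest, and pickling every value (the slide says only non-ASCII values are serialized).

```python
import hashlib
import pickle
import sqlite3

MILESTONE_SALT = "example-salt"  # hypothetical value; the real salt is not given in the slides


def hash_key(target_url, key):
    """Numeric row id derived from target URL, key and salt (assumed: first 8 hex digits)."""
    digest = hashlib.md5((target_url + key + MILESTONE_SALT).encode("utf-8")).hexdigest()
    return int(digest[:8], 16)


class SessionStore:
    """Tiny key/value store on top of SQLite3 (hypothetical sketch)."""

    def __init__(self, path=":memory:"):
        self.connection = sqlite3.connect(path)
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS storage (id INTEGER PRIMARY KEY, value BLOB)")

    def write(self, target_url, key, value):
        # For simplicity every value is pickled here
        self.connection.execute(
            "INSERT OR REPLACE INTO storage VALUES (?, ?)",
            (hash_key(target_url, key), pickle.dumps(value)))
        self.connection.commit()

    def read(self, target_url, key):
        row = self.connection.execute(
            "SELECT value FROM storage WHERE id=?",
            (hash_key(target_url, key),)).fetchone()
        return pickle.loads(row[0]) if row else None
```

Because the salt is part of the hashed key, changing it makes every previously stored row unreachable, which is how an incompatible format change invalidates old sessions.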
Payloads
XML format (xml/payloads.xml)
Tag type <boundary> used for storage of all possible prefix and suffix formations (<prefix>, <suffix>) together with context sensitive information (subtags <level>, <clause>, <where> and <ptype>)
Tag type <test> used for storage of data required for successful testing and usage of each SQL injection payload type (subtags <title>, <stype>, <level>, <risk>, <clause>, <where>, <vector>, <request> and <response>)
Payloads (2)
<boundary>
<level>1</level>
<clause>1</clause>
<where>1,2</where>
<ptype>1</ptype>
<prefix>)</prefix>
<suffix>AND ([RANDNUM]=[RANDNUM]</suffix>
</boundary>
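To illustrate how such a boundary might be applied, the sketch below parses the entry above and wraps a test payload with its prefix and suffix. The function `wrap_payload`, the way the pieces are joined with spaces, and the range of the random number are assumptions for this example, not a description of sqlmap's internals.

```python
import random
import xml.etree.ElementTree as ET

BOUNDARY_XML = """<boundary>
    <level>1</level>
    <clause>1</clause>
    <where>1,2</where>
    <ptype>1</ptype>
    <prefix>)</prefix>
    <suffix>AND ([RANDNUM]=[RANDNUM]</suffix>
</boundary>"""


def wrap_payload(boundary_xml, original_value, payload, randnum=None):
    """Surround a payload with the boundary's prefix and suffix (hypothetical helper)."""
    boundary = ET.fromstring(boundary_xml)
    prefix = boundary.findtext("prefix")
    suffix = boundary.findtext("suffix")
    if randnum is None:
        randnum = random.randint(1000, 9999)
    # The prefix closes the surrounding SQL context, the suffix reopens it
    combined = "%s%s %s %s" % (original_value, prefix, payload, suffix)
    # Every [RANDNUM] marker is replaced with the same random number
    return combined.replace("[RANDNUM]", str(randnum))


print(wrap_payload(BOUNDARY_XML, "1", "AND 5=5", randnum=4412))
# prints: 1) AND 5=5 AND (4412=4412
```

The unbalanced parenthesis in the suffix is intentional: it is completed by the closing parenthesis already present in the vulnerable statement.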
Payloads (3)
<test>
    <title>Microsoft SQL Server/Sybase AND error-based - WHERE or HAVING clause (IN)</title>
    <stype>2</stype>
    <level>2</level>
    <risk>0</risk>
    <clause>1</clause>
    <where>1</where>
    <vector>AND [RANDNUM] IN (('[DELIMITER_START]'+([QUERY])+'[DELIMITER_STOP]'))</vector>
    <request>
        <payload>AND [RANDNUM] IN (('[DELIMITER_START]'+(SELECT (CASE WHEN ([RANDNUM]=[RANDNUM]) THEN '1' ELSE '0' END))+'[DELIMITER_STOP]'))</payload>
    </request>
    <response>
        <grep>[DELIMITER_START](?P<result>.*?)[DELIMITER_STOP]</grep>
    </response>
    <details>
        <dbms>Microsoft SQL Server</dbms>
        <dbms>Sybase</dbms>
        <os>Windows</os>
    </details>
</test>
Queries
XML format (xml/queries.xml)
Tag type <dbms> used for storage of all DBMS specific SQL formations required for successful enumeration (subtags <users>, <passwords>, <dbs>, <tables>, <columns>, <dump_table>, etc.) and resulting data (pre)processing (subtags <cast>, <length>, <isnull>, <count>, <substring>, <concatenate>, etc.)
Each enumeration subtag has an <inband> and <blind> form used in respective techniques
Queries (2)
<dbms value="MySQL">
    <cast query="CAST(%s AS CHAR)"/>
    <length query="CHAR_LENGTH(%s)"/>
    <isnull query="IFNULL(%s,' ')"/>
    <delimiter query=","/>
    <limit query="LIMIT %d,%d"/>
    …
    <passwords>
        <inband query="SELECT user,password FROM mysql.user" condition="user"/>
        <blind query="SELECT DISTINCT(password) FROM mysql.user WHERE user='%s' LIMIT %d,1" count="SELECT COUNT(DISTINCT(password)) FROM mysql.user WHERE user='%s'"/>
    </passwords>
…
Multithreading
Multithreading implemented wherever applicable (option --threads)
Techniques covered: boolean-based blind, error-based and partial UNION query
Deliberately turned off for techniques: time-based and stacked (lots of reasons)
Each thread covers a part of value in case of boolean-based blind
In other techniques, each thread covers one enumerated entry
Also, implemented for brute force column/table name search and crawling
Direct connection
Direct connection to DBMS (option -d)
python sqlmap.py -d "mysql://root:[email protected]:3306/testdb"
Support for: Microsoft SQL Server, MySQL, Oracle, PostgreSQL, SQLite, Microsoft Access, Firebird, SAP MaxDB, Sybase, IBM DB2
Uses 3rd party connectors (e.g. python-pymssql, pymysql, cx_Oracle, python-psycopg2, etc.)
SQLAlchemy used as an alternative
Load request(s) from file
Load HTTP request(s) from a textual file (option -r)
Supporting RAW request format (any MITM proxy can be used to catch one)
Particularly usable in requests with large content body (e.g. POST)
Load and parse log files (option -l)
Supporting Burp and WebScarab log formats
Unlimited number of parsed HTTP requests (using only unique ones)
Content type detection
Automatic detection of (specialized) request content types
Supporting SOAP, JSON and (generic) XML
For example:
--data="{ \"pid\": 4412, \"id\": 1, \"action\": \"do\"}"
--data="<request><pid>4412</pid> <id>1</id><action>do</action></request>"
Appropriate exploitation of parameter values
In case of non-supported format(s), custom injection mark (*) can be used
Site crawling/form searching
Collect usable (on site) target links (option --crawl)
User defines crawling depth (e.g. 3) limiting search based on distance from starting page
Optional form searching at visited pages (switch --forms)
Arbitrary filling of missing form data
Repair of non-HTML-compliant pages for easier processing
Mnemonics
Usage of mnemonics for faster setting up of sqlmap options and switches (option -z)
Longer (original): python sqlmap.py --flush-session --threads=4 --ignore-proxy --batch --banner -u …
Shorter (using mnemonics): python sqlmap.py -z "flu,thre=4,ign,bat,ban" -u …
Highly generic prefix based recognition (e.g. -z "flu,bat,ban" is interpreted the same as -z "flush,batc,bann")
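The prefix-based recognition can be sketched roughly as below. The option list is only the subset used in the example above, and the matching rule (ignore dashes, require exactly one candidate) is a simplifying assumption; sqlmap's own resolution of ambiguous prefixes may differ.

```python
# Subset of long option names, taken from the example above
OPTIONS = ["flush-session", "threads", "ignore-proxy", "batch", "banner"]


def expand_mnemonics(mnemonics, options=OPTIONS):
    """Expand a string like 'flu,thre=4' into long-form arguments (hypothetical helper)."""
    arguments = []
    for item in mnemonics.split(","):
        name, _, value = item.strip().partition("=")
        # An option matches when its name (dashes ignored) starts with the mnemonic
        candidates = [option for option in options
                      if option.replace("-", "").startswith(name.replace("-", ""))]
        if len(candidates) != 1:
            raise ValueError("mnemonic %r matches %d options" % (name, len(candidates)))
        argument = "--" + candidates[0]
        if value:
            argument += "=" + value
        arguments.append(argument)
    return arguments


print(expand_mnemonics("flu,thre=4,ign,bat,ban"))
# prints: ['--flush-session', '--threads=4', '--ignore-proxy', '--batch', '--banner']
```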
Keep-alive
HTTP persistent connection (switch --keep-alive)
As opposed to a new connection for every single request/response pair
Slightly adapted 3rd party module keepalive, adjusted for multithreading
Connection pool – reuse of existing target connection(s) where applicable
Reduced network congestion (fewer TCP connections), reduced latency (no handshaking), faster enumeration, etc.
Tor
Support for The Onion Router (Tor) online anonymity network (switch --tor)
Concealing identity and network activity
Used against surveillance and (targeted) traffic sniffing
Configurable Tor proxy type (option --tor-type) and port number (option --tor-port)
DNS leakage is prevented (no DNS requests outside of Tor)
Available safety check for proper usage of Tor (switch --check-tor)
Domain name resolution caching
DNS resolution request is done by default for each HTTP request (from Python HTTP dedicated modules – e.g. httplib)
Noticeable slowdown in some cases (e.g. excessive network latency)
Problem noticed and reported by (nagging) users (looking into Wireshark traffic captures)
Problem patched at the lowest level (method socket.getaddrinfo(*args, **kwargs) is encapsulated for caching)
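Wrapping `socket.getaddrinfo` for caching might look roughly like this. The helper names `cached` and `install_dns_cache` are made up for the example, and the cache here never expires, which is a simplification.

```python
import socket


def cached(resolver):
    """Return a wrapper that remembers results per argument combination."""
    cache = {}

    def wrapper(*args, **kwargs):
        key = (args, frozenset(kwargs.items()))
        if key not in cache:
            # Only the first lookup for a given host/port reaches the real resolver
            cache[key] = resolver(*args, **kwargs)
        return cache[key]

    return wrapper


def install_dns_cache():
    """Replace socket.getaddrinfo process-wide with its caching wrapper."""
    socket.getaddrinfo = cached(socket.getaddrinfo)
```

Since higher-level HTTP modules resolve names through `socket.getaddrinfo`, calling `install_dns_cache()` once at startup is enough for all subsequent requests to reuse earlier resolutions.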
Authentication methods
Implemented support for authentication methods: basic, digest, NTLM and certificate (options --auth-type, --auth-cred and --auth-cert)
python sqlmap.py -u "http://192.168.21.129/vuln.php?id=1" --auth-type=basic --auth-cred="testuser:testpass"
Handling HTTP status code 401 (Unauthorized)
Authorization headers are being cached (where applicable)
Reflection detection and removal
Noisy response resulting from request reflection
Query results for: 1%20AND%201%3D1
Can cause problems in detection phase
Particularly problematic for boolean-based blind technique (fuzzy page comparison)
Automatic detection of reflected payload value and marking with predefined constant value
Query results for: __REFLECTED_VALUE__
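A simplified sketch of the marking step, assuming the payload is reflected either verbatim or URL-encoded. The function name is hypothetical, and real pages can reflect payloads in other encodings that this sketch does not handle.

```python
from urllib.parse import quote

REFLECTED_VALUE_MARKER = "__REFLECTED_VALUE__"


def remove_reflection(page, payload):
    """Replace reflected copies of the payload with a constant marker."""
    variants = {payload, quote(payload, safe="")}
    # Longer variants first, so an encoded form is not partially replaced
    for variant in sorted(variants, key=len, reverse=True):
        page = page.replace(variant, REFLECTED_VALUE_MARKER)
    return page


print(remove_reflection("Query results for: 1%20AND%201%3D1", "1 AND 1=1"))
# prints: Query results for: __REFLECTED_VALUE__
```

After marking, pages returned for different payloads differ only where the application's behavior differs, which is what the page comparison needs.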
Dynamicity detection and removal
Noisy response resulting from sporadically changing content (e.g. ads, banners, etc.)
Can cause problems in both detection and enumeration phase
Particularly problematic for boolean-based blind technique
Automatic detection and marking of dynamic parts (info held in internal knowledge base)
In best case, automatic recognition and usage of string value appearing only in True responses (option --string)
Content filtering
Occasionally pages are bloated with non-textual content (CSS styles, comments, JavaScript, HTML tags, embedded objects, etc.)
Changes regarding boolean-based blind technique usually affect only one small textual part (e.g. table entry)
Optional filtering of non-textual content (switch --text-only)
For example: <html>...<td>Tooth fairy</td>...</html> is filtered to ...Tooth fairy...
Better detection and less trash(y) results
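The filtering can be approximated with a few regular expressions. This is a rough sketch with a hypothetical function name; regex-based tag stripping is not a full HTML parser and is used here only to show the idea.

```python
import re


def text_only(page):
    """Strip scripts, styles, comments and tags, keeping only textual content."""
    page = re.sub(r"(?si)<(script|style)\b.*?</\1>", "", page)  # script/style blocks
    page = re.sub(r"(?s)<!--.*?-->", "", page)                  # HTML comments
    return re.sub(r"(?s)<[^>]+>", "", page)                     # remaining tags


print(text_only("<html>...<td>Tooth fairy</td>...</html>"))
# prints: ...Tooth fairy...
```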
Wizard mode
For beginner users and script kiddies (switch --wizard)
Questions asked:
Target URL
POST data (if any)
Injection difficulty (Normal/Medium/Hard)
Enumeration (Basic/Intermediate/All)
Infamous for Comodo Brazil breach (March 2011) – attackers posted wizard mode console output to Pastebin
Level/risk of detection
Number of requests per each parameter in testing phase can grow from 10 up to 10K
To prevent unnecessary noise and speed up the testing time, tests are classified by level and risk
Level (option --level) represents (passing) possibility/usability of the test case (higher level means lower possibility)
Risk (option --risk) represents potential damage that the test case can cause (higher risk means higher potential damage)
Heuristic SQL injection checks
Recognition of the backend DBMS if error message can be provoked with arbitrary invalid SQL sequence (e.g. ())'”(''”')
In case that the parameter value is integer and response for (e.g.) 1 is the same as for (2-1), there is a good chance that the target is vulnerable
In case of detected boolean-based blind technique, DBMS specific queries are used (e.g. (SELECT 0x616263)=0x616263) to potentially move focus to a particular DBMS in further tests
Type casting detection
Type casting is an efficient way for dealing with SQL injection on numeric values
$query = "SELECT * FROM log WHERE id=" . intval($_GET['id']);
Implemented automatic detection of such cases
In case that the parameter value is integer and response for (e.g.) 1 is the same as for 1foobar, there is a good chance that the target is using integer casting
User is warned of a potentially “futile” run
Fingerprinting
Web server is being fingerprinted by known HTTP headers, cookie values, etc.
DBMS is being fingerprinted through error message parsing, banner parsing and tests with version specific payloads (obtained from release notes and reference manuals)
For example, cookie value ASP.NET_SessionId is specific for ASP.NET/IIS/Windows platform, while TO_SECONDS(950501)>0 check should work only on MySQL >= 5.5.0
Detailed DBMS version check is done only if switch -f/--fingerprint is used
Suhosin-patch detection
Open source patch for PHP, protecting web server from “insecure PHP practices”
suhosin.get.max_value_length (default: 512), suhosin.post.max_value_length, etc.
Causing problems in enumeration phase when payloads are big (e.g. enumerating column names)
After the detection phase, a single payload (depending on detected techniques) with size greater than 512 is sent (e.g. 1 AND 6525 = … 6525)
User is warned in case of False response
WAF/IDS/IPS detection
Sending one “suspicious” request (in form of dummy parameter value) and checking for response change(s) when compared to original (switch --check-waf)
WAF scripts (switch --identify-waf) do a thorough check, each focusing on peculiarities of a particular product
For example, WebKnight responds with HTTP status code 999 on detected suspicious activity
Currently there are 29 WAF scripts (airlock.py, barracuda.py, bigip.py, etc.)
WAF/IDS/IPS bypass
Tamper scripts (option --tamper) modify the injected payload before it is sent
User has to choose appropriate one(s) based on collected knowledge of target's behavior and/or detected WAF/IDS/IPS product
If required, a chain of tamper scripts can be used (e.g. --tamper="between,ifnull2ifisnull")
Currently there are 36 tamper scripts (apostrophemask.py, apostrophenullencode.py, appendnullbyte.py, etc.)
String value escaping
Each string value inside payload is automatically escaped (quoteless format) depending on targeted DBMS
For example: 1 ... AND username="root"-- is in case of MySQL escaped to 1 ... AND username=0x726f6f74--
Avoidance of filter-based escaping functions (e.g. addslashes)
Adding implicit dependence to targeted DBMS
Payload obfuscation (harder noticeability in target log files)
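For the MySQL case, the quoteless form can be produced along these lines. The function names and the regular expression used to find quoted strings are assumptions for this sketch; other DBMSes need different encodings, and escaped quotes inside strings are not handled.

```python
import binascii
import re


def mysql_escape(value):
    """Encode a string as a MySQL hexadecimal literal (no quotes needed)."""
    return "0x" + binascii.hexlify(value.encode("utf-8")).decode("ascii")


def escape_payload(payload):
    """Replace every non-empty quoted string in the payload with its hex literal."""
    return re.sub(r"""(['"])(.+?)\1""",
                  lambda match: mysql_escape(match.group(2)),
                  payload)


print(escape_payload('1 AND username="root"--'))
# prints: 1 AND username=0x726f6f74--
```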
Evaluation of custom code
Custom Python code can be evaluated before each request (option --eval)
In such code, each request parameter is accessible as a local variable
All resulting variable values are included into the request as new parameter values
--eval="import hashlib;hash=hashlib.md5(id).hexdigest()"
www.target.com/vuln.php?id=1 AND 1=1&hash=7f134e52836a00e26493e690ed8aa735
Fuzzy page comparison
Used (mostly) in boolean-based blind technique
Gestalt pattern matching (Ratcliff-Obershelp algorithm)
Supported by standard Python module difflib
Class SequenceMatcher
Method ratio() (or faster quick_ratio()) giving a measure of the sequences’ similarity as a float in range [0, 1]
True result if ratio() > 0.98 when compared with original page
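The comparison described above maps directly onto `difflib`. In this sketch the function name and the sample pages are made up, and `autojunk=False` is chosen here only to keep the result independent of difflib's junk heuristic.

```python
from difflib import SequenceMatcher

TRUE_THRESHOLD = 0.98  # threshold from the slide


def is_true_response(original_page, page, threshold=TRUE_THRESHOLD):
    """Treat a response as True when it is nearly identical to the original page."""
    matcher = SequenceMatcher(None, original_page, page, autojunk=False)
    return matcher.ratio() > threshold


original = "<html><body>" + "<p>article row</p>" * 50 + "</body></html>"
print(is_true_response(original, original))                           # True
print(is_true_response(original, "<html><body>No results</body></html>"))  # False
```

`quick_ratio()` returns an upper bound on `ratio()` and is cheaper to compute, so it can serve as a fast pre-check when pages are large.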
Definite page comparison
Used mostly in boolean-based blind technique
When fuzzy page comparison fails (e.g. too much page dynamicity) and user is able to distinguish True from False responses by himself (non-n**b)
String to match when result should be recognized as True (option --string)
Regular expression to match … (option --regex)
Compare HTTP codes (switch --code)
Compare HTML titles (switch --title)
Null connection
Sometimes there is no need for retrieval of whole page content (size can be enough)
Boolean-based blind technique
3 methods: Range, HEAD and “skip-read”
Range: bytes=-1
Content-Range: bytes 4789-4790/4790
HEAD /search.aspx HTTP/1.1
Content-Length: 4790
Both are resulting (if applicable) with either empty or 1 char long response
Method “skip-read” retrieves only HTTP headers looking for Content-Length
False positive detection
False positives are highly undesirable
Specific for boolean-based blind and time-based blind techniques
False positive tests are done in cases when only one of those techniques is detected
Set of trivial mathematical checks performed to see if target can “respond” correctly
For example:
(123+447)=570
319>(519+110)
(654+267)>854
Delay detection
Detection of “artificial” delay
Statistical comparison with normal response times
Response time must fit under the Gaussian bell curve to be marked as “normal”
Is <current_response_time> > avg(<normal_response_times>) + 7*stdev(<normal_response_times>)?
If answer is yes, probability that we are dealing with “artificial” delay is 99.9999999997440%
Especially useful when heavy queries are used (not knowing expected delay value)
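The threshold test above can be written with the standard `statistics` module. The function name and the sample timings are illustrative; at least two normal response times are needed for the standard deviation to be defined.

```python
from statistics import mean, stdev


def is_delayed(current_response_time, normal_response_times, deviations=7):
    """Return True when the response time lies far outside the normal distribution."""
    threshold = mean(normal_response_times) + deviations * stdev(normal_response_times)
    return current_response_time > threshold


# Hypothetical response times (seconds) collected before injecting any delay
normal = [0.20, 0.22, 0.19, 0.21, 0.20, 0.23, 0.18]
print(is_delayed(5.2, normal))   # True
print(is_delayed(0.25, normal))  # False
```

Because the threshold is derived from observed timings, it adapts to slow or jittery targets without the user having to guess a fixed delay value.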
UNION query column #
UNION query requires knowledge of number of columns (N) for vulnerable SQL statement
Two methods used: ORDER BY and statistical (same principle as in delay detection)
ORDER BY N+1 should respond noticeably different (preferably with error message) than for ORDER BY N (binary searched)
In statistical method responses for candidates (UNION SELECT NULL, NULL,...) are compared to original (not injected) response
Right one is the one that seems “not normal” (having ratio outside the Gaussian bell curve)
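The ORDER BY method amounts to a binary search for the largest N that still yields a normal response. In the sketch below, `order_by_works` is a hypothetical callback standing in for "send ORDER BY n and compare the response", and the upper bound of 50 is an arbitrary assumption.

```python
def find_column_count(order_by_works, upper=50):
    """Binary search for the largest n where 'ORDER BY n' responds normally.

    Returns None when ORDER BY 1 already fails or when the count is not below `upper`.
    """
    if not order_by_works(1) or order_by_works(upper):
        return None
    low, high = 1, upper  # invariant: ORDER BY low works, ORDER BY high fails
    while high - low > 1:
        middle = (low + high) // 2
        if order_by_works(middle):
            low = middle
        else:
            high = middle
    return low


# Simulated target whose vulnerable statement selects 7 columns
print(find_column_count(lambda n: n <= 7))
# prints: 7
```

With an upper bound of 50 this needs only about eight requests, compared to one request per candidate in a linear scan.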
Output prediction
Inference techniques (boolean-based blind and time-based blind) require optimization wherever and whenever possible
In certain cases prediction(s) can be made
Checking if current retrieved entry shares same prefix with previous retrieved entr(ies)
For example DROP ANY ROLE has same prefix as DROP ANY RULE (one request per checked character compared to bit-by-bit retrieval)
Using common output values too (e.g. information_schema, phpmyadmin, etc.)
Brute forcing identifier names
In case of missing schema (e.g. deleted information_schema) brute force search is required (e.g. 1=(SELECT 1 FROM users))
Searching for common table names (switch --common-tables)
Searching for common column names (switch --common-columns)
Conducted automated search and parsing of resulting SQL files for chosen Google dorks (e.g. ext:sql “CREATE TABLE”)
Collected most frequent 3.3K table names and 2.5K column names
Pivot dump table
Some DBMSes (e.g. Microsoft SQL Server) don't have OFFSET/LIMIT query mechanism making enumeration problematic in non-UNION query techniques
Column with most DISTINCT values is automatically chosen as the pivot column
Pivot's first value bigger than previous (e.g. SELECT MIN(id) WHERE id > ' ') is retrieved
Entries for other columns (e.g. SELECT name WHERE id=1) are being retrieved using current pivot value
Iterative process
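The iteration can be demonstrated against a local SQLite database. SQLite does support LIMIT, so it is used here only as a convenient stand-in; the function `pivot_dump` and the sample table are made up for the example, and the first step simply takes the overall minimum instead of comparing against ' '.

```python
import sqlite3


def pivot_dump(connection, table, pivot, columns):
    """Dump rows one by one by walking the pivot column in ascending order."""
    rows = []
    previous = None
    while True:
        if previous is None:
            query, parameters = "SELECT MIN(%s) FROM %s" % (pivot, table), ()
        else:
            # Next pivot value: the smallest one bigger than the previous
            query = "SELECT MIN(%s) FROM %s WHERE %s > ?" % (pivot, table, pivot)
            parameters = (previous,)
        current = connection.execute(query, parameters).fetchone()[0]
        if current is None:
            break  # no pivot values left
        row = {pivot: current}
        for column in columns:
            # Other columns are looked up using the current pivot value
            row[column] = connection.execute(
                "SELECT %s FROM %s WHERE %s = ?" % (column, table, pivot),
                (current,)).fetchone()[0]
        rows.append(row)
        previous = current
    return rows


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (id INTEGER, name TEXT)")
connection.executemany("INSERT INTO users VALUES (?, ?)",
                       [(3, "carol"), (1, "alice"), (2, "bob")])
print(pivot_dump(connection, "users", "id", ["name"]))
```

This is why the column with the most DISTINCT values makes the best pivot: rows sharing a pivot value cannot be told apart by the second lookup.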
International letters
Добрый день Россия
Page encoding is parsed from Content-Type HTTP header, Content-Type meta HTML header or heuristically detected (3rd party module chardet)
RAW target response is automatically decoded to Unicode (using detected page encoding)
In case of inband techniques (UNION query and error-based) results with international letters are already supported if decoding went properly
International letters (2)
In case of inference techniques (boolean-based blind and time-based blind) characters are being inferred already in their Unicode form
Potential problems occur when stored data and/or database connector use different (non-compatible) charset than target's response
In case of unsuccessful decoding of international letters (e.g. gibberish output) charset can be enforced (option --charset)
Hex encoding retrieved data
All supported DBMSes have capabilities to encode resulting data to hexadecimal format (switch --hex)
Most useful in cases when (parts of) results are potentially lost (e.g. binary data in inband techniques)
Retrieved data is automatically decoded to its original (non-hexadecimal) format
Such binary content is checked for known formats (using 3rd party module magic) and (if recognized) stored to output files
Dump format
Dumped table content can be stored in 3 different formats: CSV (default), HTML and SQLite (option --dump-format)
In CSV format each row is represented by one line and each column entry is being separated by a predefined separator character (e.g. ,)
In HTML format dump is stored into a visually recognizable (browser) table
In SQLite format dump is “replicated” to a locally stored SQLite3 database giving a possibility of (among others) running queries against it
Password cracking
Implemented support for detection and wordlist-based cracking of 14 different commonly used hash algorithms
MySQL (newer and older), MsSQL (newer and older), Oracle (newer and older), PostgreSQL, MD5, SHA1, etc.
Automatic analysis of retrieved passwords (--passwords) and table dumps (--dump)
(Optional) common suffix forms (1, 123, etc.)
Multiprocessed attack (# of CPUs)
1M MySQL hash guesses in under 10 seconds on 4 core Intel Xeon W3550 @ 3.07GHz
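A single-process sketch of wordlist cracking with common suffixes, for the newer MySQL hash format (an asterisk followed by the uppercase hex SHA1 of the binary SHA1 of the password). The function names, word list and suffix list are illustrative; the multiprocessing part is left out.

```python
import hashlib


def mysql_hash(password):
    """Newer MySQL (4.1+) password hash: '*' + SHA1(SHA1(password)) in uppercase hex."""
    stage1 = hashlib.sha1(password.encode("utf-8")).digest()
    return "*" + hashlib.sha1(stage1).hexdigest().upper()


def crack(target_hash, words, suffixes=("", "1", "123")):
    """Try every word with every suffix; return the matching password or None."""
    target_hash = target_hash.upper()
    for word in words:
        for suffix in suffixes:
            candidate = word + suffix
            if mysql_hash(candidate) == target_hash:
                return candidate
    return None


# Hypothetical tiny dictionary; the hash stands in for a value retrieved from mysql.user
print(crack(mysql_hash("secret123"), ["admin", "secret"]))
# prints: secret123
```

Each candidate is independent of the others, so splitting the dictionary into chunks across worker processes scales the guess rate with the number of CPU cores.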
Large dictionary support
Distributed access in multiprocessing environment
Support for huge dictionaries (chunk read)
Support for dictionary lists
Support for ZIP compressed dictionaries
Included custom built and compressed dictionary (1.2M entries) based on highly popular and publicly available dumps, like RockYou, Gawker, Yahoo, etc.
Stagers and backdoors
Stagers are used for uploading arbitrary (binary) files (e.g. UDF files, backdoors, etc.)
Backdoors are used for OS command execution (switches --os-cmd and --os-shell)
Prerequisite is that one of known SQL file write methods can be used (e.g. INTO DUMPFILE, EXEC xp_cmdshell 'debug.exe < dump.src', etc.)
4 different platforms supported: ASP, ASP.NET, JSP and PHP
Stored in “cloaked” format (preventing local AV triggering) inside shell directory
Metasploit integration
Automated creation, upload and run of Metasploit shellcode payload (switch --os-pwn)
User can choose payload (Meterpreter, shell or VNC), connection (reverse TCP, reverse HTTP, etc.) and encoder type (no encoder, Call+4 Dword XOR Encoder, etc.)
shellcodeexec(.exe) is being uploaded along with (non-compiled) Metasploit shellcode payload using stager or other means
Metasploit CLI is being run at the host machine
Payload is being executed at the target machine connecting back to the host machine
Second order SQL injection
Occurs when provided user data stored at one place is being used in vulnerable SQL statement at the other place
Similar to permanent XSS
User can explicitly set the location where to look for the response (option --second-order)
Effectively doubling number of required requests
DNS exfiltration
Out-of-band SQL injection technique using DNS resolution mechanism (option --dns-domain)
Fake DNS server instance is automatically being made at the host machine
SQL injection payloads being sent are deliberately provoking DNS resolution mechanism at the target machine
Provoked DNS requests carry results of a query
Fake DNS server instance intercepts requests and responds with dummy resolution answers
Requires registration of a nameserver for the used domain pointing to the host machine
Output purging
Output directory can be (optionally) “safely” removed (switch --purge-output)
Content of all contained files (sessions, logs, dumps, etc.) is being overwritten with random data
Files truncated and renamed to random values
(Sub)directories renamed to random values
At the end, whole output directory tree is being removed
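The purge steps above could be sketched as follows. The function names are hypothetical, and whether overwritten data is truly unrecoverable depends on the file system and storage device, which is presumably why “safely” is in quotes.

```python
import os
import random
import shutil
import string


def _random_name(length=8):
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))


def purge(directory):
    """Overwrite, truncate and rename all files, rename directories, then remove the tree."""
    # Walk bottom-up so that directories are renamed after their content was processed
    for root, dirs, files in os.walk(directory, topdown=False):
        for name in files:
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            with open(path, "r+b") as f:
                f.write(os.urandom(size))   # overwrite content with random data
                f.flush()
                os.fsync(f.fileno())
            with open(path, "wb"):
                pass                        # truncate to zero length
            os.rename(path, os.path.join(root, _random_name()))
        for name in dirs:
            os.rename(os.path.join(root, name), os.path.join(root, _random_name()))
    shutil.rmtree(directory)
```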
Questions?