IDA and digital security Hex-Rays Ilfak Guilfanov

CODE BLUE 2014 : [Keynote] IDA and digital security by Ilfak Guilfanov

2 (c) 2014 Ilfak Guilfanov

Presentation Outline

• About me
• Digital security is more important than ever
• Evolution of development tools
• Story of IDA
• What we could do to improve the situation
• Your feedback

About myself

• Started my programming career more than 27 years ago
• Author of IDA Pro and the Decompiler
• Founder and CEO of Hex-Rays
• I am passionate about programming and beautiful code
  – I'm not a reverse engineer but a software developer
• I believe that efficient and robust software makes our lives better

Digital security

• We store more and more personal data digitally:
  – Medical records
  – Bank account info
  – Communications (emails, sms, etc)
  – Photos and videos
  – etc...
• More and more (all?) decisions are taken based on digital data
• Virtually all our devices are connected to each other, which makes them powerful and vulnerable at the same time

Security: current state

• Far from perfect
• Most of our software has vulnerabilities due to:
  – Design flaws (security was not built in from the start)
  – Buggy implementations
  – Poor or no testing
  – Changed environments
• We will hear more stories about thousands of stolen credit cards, disclosed sensitive information, etc.

Everything is lost? Nope

• I think that the situation is improving over time
  – The software improves
  – Most new systems get designed with security in mind
  – Society better handles digital security related issues

Development methods are constantly improving

• Compilers become better (the very first compilers would not issue any warnings!)
• We have more compilers (clang, for example)
• Programming languages evolve (initially C++ had no templates)
• Better programming paradigms and design patterns
• Better revision control systems
• Agile and other development approaches

Software analysis and testing

• There are many source code analyzers on the net (25 years ago there was just “lint”)
• Many testing frameworks appeared too
• Debuggers, fuzzers, valgrind, etc

How difficult was it to develop software in the past?

• Overall, much harder than today. One proof of this difficulty is the software size (measured as the number of source lines of code, not as the size of the binary image)
• Example: MS Windows (data from Wikipedia)

Year  Operating System      SLOC (mil)
1993  Windows NT 3.1        4-5
1994  Windows NT 3.5        7-8
1996  Windows NT 4.0        11-12
2000  Windows 2000          More than 29
2001  Windows XP            45
2003  Windows Server 2003   50

Principal reasons of difficulty

• Missing or inefficient development tools: editors, compilers, debuggers, code analyzers, testers, etc.
• Slower computers
• No memory protection or program isolation provided by the operating system
• Immature programming paradigms and methodologies
• Lack of program verification and testing tools
• Let us take a tour

Editor 'ed'

• A text editor with virtually no visual feedback
• Yet considered a powerful tool:
  – It has regular expressions
  – Programmable
• Even modern Linux distributions still include it

Compilers

• Few or no warnings
• Could produce buggy code
• Poor code optimization
• However, as a software developer I still consider compiler writers to be demigods :)
• Turbo C compilers from Borland were a breath of fresh air

Debuggers

• My favorite was Turbo Debugger. Compared to other debuggers it was a feast for the eyes
• Powerful, robust, could display the source code
• Supported hardware and conditional breakpoints

SoftICE: system level debugger

• Can debug the entire operating system
• Very powerful
• Its popularity was a disadvantage in some cases: anti-SoftICE tricks were employed by some software

(from wikipedia)

Disassemblers

• Debuggers were not enough in some cases
  – Programs got bigger
  – Algorithms became more complex
  – Anti-debugger tricks were used more often
  – More detailed analysis was required, especially for viruses
  – Compiler bugs required checking non-executable (object) files
• Disassemblers would:
  – Analyze the program in depth
  – Show cross references
  – Assign meaningful names to functions and data
  – etc.

The most popular disassembler: Sourcer

• Sourcer from V Communications was a great tool
• In fact, to a newcomer it was like magic:
  – It would tell code apart from data
  – It would assign meaningful names and comments

Sourcer: batch mode disassembler

• The biggest shortcoming: it was a batch mode program
• Could handle programs of limited size
• Would occasionally misidentify code or data
• Slow for big programs

Interactivity is the answer

• There was a disassembler called Dis*Doc from RJ Swantek
• I haven't used it myself so I cannot tell you much
• But I liked the idea very much:
  – No need to wait for the results
  – The user can browse the listing and annotate it
  – The user can guide the disassembler by marking locations as code or data
  – WYSIWYG (what you see is what you get) was à la mode at that time (remember 'ed'?)
• This was the reason why I decided to create IDA

Initial design and implementation

• I tried a few approaches and rewrote the code at least 4 times before I hit the right thing
• The result was either too heavy and slow or too lightweight and limited
• Remember: 640KB of memory and slow processors!
• I needed a robust and fast database

Database choice

• Requirements:
  – Fast
  – Capable of storing variable sized objects
  – Robust
• I tried the available databases like Paradox from Borland but quickly abandoned the idea: they were way too slow
• Fortunately my friend Pavel Rousnak implemented a B-tree engine
• We are still using his database in IDA, upgraded and improved many times over but still the same code
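
To make the B-tree idea concrete, here is a minimal in-memory sketch. It is an illustration only, not Pavel Rousnak's engine and not IDA's database format. The names `BTree`, `BTreeNode`, `put`, `get` and `items` are hypothetical. It shows the two properties the requirements list asks for: lookups that touch only a few nodes, and values of any size stored under ordered keys.

```python
import bisect


class BTreeNode:
    def __init__(self, leaf=True):
        self.leaf = leaf
        self.keys = []      # sorted keys
        self.values = []    # values[i] belongs to keys[i]; any size
        self.children = []  # internal nodes have len(keys) + 1 children


class BTree:
    """Textbook B-tree with minimum degree t (hypothetical sketch)."""

    def __init__(self, t=2):
        self.t = t  # every node holds at most 2*t - 1 keys
        self.root = BTreeNode()

    def get(self, key):
        node = self.root
        while True:
            i = bisect.bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                return node.values[i]
            if node.leaf:
                return None
            node = node.children[i]

    def put(self, key, value):
        if len(self.root.keys) == 2 * self.t - 1:
            # A full root is split first, so the tree grows upward.
            new_root = BTreeNode(leaf=False)
            new_root.children.append(self.root)
            self._split_child(new_root, 0)
            self.root = new_root
        node = self.root
        while True:
            i = bisect.bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                node.values[i] = value  # overwrite an existing key
                return
            if node.leaf:
                node.keys.insert(i, key)
                node.values.insert(i, value)
                return
            if len(node.children[i].keys) == 2 * self.t - 1:
                self._split_child(node, i)
                if key == node.keys[i]:  # the promoted median is our key
                    node.values[i] = value
                    return
                if key > node.keys[i]:
                    i += 1
            node = node.children[i]

    def _split_child(self, parent, i):
        t = self.t
        full = parent.children[i]
        right = BTreeNode(leaf=full.leaf)
        # The median key moves up into the parent.
        parent.keys.insert(i, full.keys[t - 1])
        parent.values.insert(i, full.values[t - 1])
        parent.children.insert(i + 1, right)
        right.keys, right.values = full.keys[t:], full.values[t:]
        full.keys, full.values = full.keys[:t - 1], full.values[:t - 1]
        if not full.leaf:
            right.children = full.children[t:]
            full.children = full.children[:t]

    def items(self, node=None):
        """Yield (key, value) pairs in ascending key order."""
        node = node or self.root
        for i, key in enumerate(node.keys):
            if not node.leaf:
                yield from self.items(node.children[i])
            yield key, node.values[i]
        if not node.leaf:
            yield from self.items(node.children[-1])


# Usage: addresses as keys, byte strings of different sizes as values.
db = BTree()
db.put(0x401000, b"\x55")
db.put(0x401001, b"\x8b\xec")
db.put(0x400000, b"a longer annotation attached to this address")
```

A real on-disk engine would also need paging, deletion and crash safety, which this sketch leaves out.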

IDA 0.1, the first public version

• It took me 6 months to implement version 0.1
• The basic functionality was present but the user interface was ugly
• It supported only x86 instructions
• Yet it was interactive and working!

IDA v2.09: nice text interface

• IDA v2.09 was released in 1994
• The Turbo Vision library from Borland fixed the user interface
• It was robust, supported 3 processor families (x86, i51, and z80) and 8 input file types, and had a built-in C like language
• Since it was already over 500KB, it was a heavily overlaid program. I was saving every byte

IDA v3.5b: with symbol files

• It was released in 1996
• It had symbol files (IDS), could run on OS/2 and under an MS-DOS extender, had loadable modules, etc.
• There were many other releases I won't mention in order not to bore you

IDA Roadmap

• My initial plans for IDA were really ambitious. They included:
  – AI (artificial intelligence) with a LISP like language
  – Building a binary program optimizer on top of IDA
  – Using IDA for binary translation
  – Building some kind of knowledge database about common program snippets and their meanings
  – IDA would point out suspicious or problematic parts of the code (vulnerability scanner?)
  – Etc., etc., etc.
• However, with the ever growing number of IDA users I was simply overwhelmed by the user requests and bug fixes
• Even today it is like this

Datarescue and IDA

• I was lucky that Mr. Pierre Vandevenne got interested in IDA. His contribution cannot be overestimated
• Datarescue converted my hobby project into a commercial program in 1996
• The first GUI version of IDA was built there, in 1999
• We traveled a long and very interesting road together
• BTW, Pierre found the lady we use for the IDA logo

PC Magazine: Technical Excellence Award

• In 2001 IDA Pro was nominated as a finalist of the Annual Awards for Technical Excellence
• We went to Las Vegas to participate in the award ceremony
• We lost the competition... to Microsoft's Visual Studio .NET
• It was still fun :)

IDA and pirates

• Unfortunately we were plagued by piracy
• There were more pirates than legitimate users
• Pirates were eating our time and resources
• A typical conversation would start with a compliment from a stranger; he would ask for a “little help” in the second message
• It was so predictable that it was even boring
• I do not understand it when clever people pirate software and then shamelessly ask for help. Probably they aren't that clever after all

IDA piracy map 2006

• Just a map of places where a pirated version of IDA was used (circa 2006)

(from www.datarescue.com)

Decompiler: a plugin on top of IDA

• Was greatly inspired by Cristina Cifuentes' PhD thesis on decompilation
• After reading the thesis it was clear how to build a decompiler
• But the devil is in the details... many subproblems were still not solved. For each of them:
  – Come up with an idea how to solve it
  – Implement it
  – Test it
  – Throw it away and start over if it did not work
• “Wash, rinse, repeat”, for years... (I liked it!)
• The first attempts were made in 1998 or even earlier
• The first public version appeared in 2007

Decompiler details

• Decompilation is a complex problem, insolvable in general
• Very time consuming to develop
• Seemingly minor design mistakes haunt and hinder development
• One has to cut corners in order to come up with a working decompiler
• Question: which corners to cut?

Hex-Rays

• Unfortunately Pierre decided to quit in 2007
• I had to continue with the decompiler alone
• Hex-Rays quickly became a strong and passionate team
  – We do care about our code
  – We want to publish software that is as bug free as we can make it
  – We care about our users

Why IDA, after all

• I created IDA because there was no interactive and robust disassembler at that time; on the other hand there was a strong need for such a tool
• I kept maintaining IDA all these years because
  – IDA helps to solve some problems we face, like viruses
  – IDA improves our digital security
  – IDA users are very nice people in general (legit ones)
• Like any tool IDA can be used for lowly deeds. Examples:
  – Cracking software
  – Stealing code and algorithms

IDA as a seeing aid

• I usually compare it to a microscope
• Basically useless to the general public but indispensable to professionals
• Requires skills to use it efficiently

Who uses IDA?

• This is a frequent question
• I can only mention some user categories:
  – Anti-virus companies
  – Security oriented organizations
  – Governments and military
  – Hobbyists
  – Shady persons of all kinds
  – Pirates (the dogs bark but the caravan goes on)
• Overall it is a motley crew

How IDA improves digital security

• Our legit users are white hats (or at least they pretend to be so :)
• IDA itself is not in the spotlight and stays in the shadows, but many of our users are famous security researchers
• We are glad that we can help them with their tasks
• We want IDA to be safe for them (and for all our users)

How we improve security of IDA

• We run tests
• We compile our code with various compilers on different platforms
• We use code reviews
• We use lint, valgrind, and other verification tools
• User reports are handled by the developers (there is no first- or second-line support). This ensures that developers really suffer from their bugs :)

Testing IDA

• We continuously work on improving our coding style
• We keep adding more test cases. Every new reported bug ideally creates a new test case
• We keep adding more testing methods
  – We have an extensive test suite for our analysis engine
  – We recently added tests for the user interface
  – We have a constantly growing set of decompiler tests
  – Our decompilation test suite is about 500GB (only output files)
• Virtually every day we add a couple of new test cases
• There are dedicated computers for running tests
• We have a bug bounty program for critical bugs
• We know that we are still not testing IDA enough

Bug bounty

• The idea is simple: if you find a critical bug in IDA, we will pay you a bounty
• Many other companies do that; we think that it is really a good idea
• It is difficult to come up with a reliable exploit for IDA:
  – IDA kernel is personalized for each user
  – We use ASLR
  – IDA randomizes the heap at the start
  – We use stack canaries
  – Our stack is not executable
• Nevertheless we offer bounties for memory corruption bugs even if there is no POC code

Things to improve in the nearest future

• We have to add a fuzzer to our test methods
• A good static code analyzer is in the plans (in the past we used a quite famous one but were disappointed)
• More tests for the debuggers (since we support remote debugging for all platforms, there are 24 possible combinations of local and remote computers)
• More tests for IDAPython
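
For readers unfamiliar with fuzzing, the following is a minimal sketch of a mutation fuzzer. It says nothing about how Hex-Rays would build theirs. The target `parse_record` is a hypothetical length-prefixed parser invented for this example, and `mutate` and `fuzz` are likewise illustrative names. The idea: take a valid input, damage it at random, feed it to the code under test, and record every failure that is not a documented rejection.

```python
import random


def parse_record(data: bytes) -> tuple:
    """Hypothetical target: one length byte followed by the payload."""
    if len(data) < 1:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return length, payload


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply 1 to 4 random byte-level mutations to a copy of the seed."""
    data = bytearray(seed)
    for _ in range(rng.randint(1, 4)):
        choice = rng.randrange(3)
        if choice == 0 and data:
            data[rng.randrange(len(data))] = rng.randrange(256)  # overwrite
        elif choice == 1:
            data.insert(rng.randrange(len(data) + 1), rng.randrange(256))  # insert
        elif choice == 2 and data:
            del data[rng.randrange(len(data))]  # delete
    return bytes(data)


def fuzz(target, seed: bytes, iterations: int = 1000, rng_seed: int = 0):
    """Return (input, exception) pairs for every unexpected failure."""
    rng = random.Random(rng_seed)  # fixed seed keeps runs reproducible
    crashes = []
    for _ in range(iterations):
        sample = mutate(seed, rng)
        try:
            target(sample)
        except ValueError:
            pass  # documented rejection of malformed input, not a bug
        except Exception as exc:  # anything else is worth a bug report
            crashes.append((sample, exc))
    return crashes


crashes = fuzz(parse_record, b"\x03abc")
```

Each input collected in `crashes` is a ready-made candidate for a new regression test.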

Why is security hard? Because we are blind

• Making a watertight container is hard for a blind person. He has to feel every part of it to ensure there are no holes
• If he misses even the tiniest hole, the liquid will leak out
• The same for us with security: we are essentially blind to security holes and can see only the most obvious ones
• This means that humans are hopelessly bad when it comes to security, at least today
• We need help, we need tools so we can see the light

Computers to rescue us

• Since we humans cannot cope with the task, our only hope is computers
• Computers can
  – Test our software
  – Monitor its use
  – Prove its correctness
  – Serve as a seeing aid
  – Eventually computers would develop software

Testing

• Unfortunately it is impossible to test all cases
• Not an excuse to abandon testing altogether
• Testing must be continuous
• Test as many different aspects as possible
• Think like an attacker, try to foresee the possible scenarios
• Keep adding tests for all newly discovered bugs
• Write a test case before fixing the bug
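
The last two points can be sketched with Python's standard `unittest` module. Everything here is hypothetical: `read_u16_le` is an invented helper, and the imagined bug report is "a truncated input crashes with IndexError". The test is written first and fails against the buggy code. The fix then makes it pass, and the test stays in the suite so the bug cannot quietly return.

```python
import unittest


def read_u16_le(data: bytes, offset: int) -> int:
    """Hypothetical helper: read a little-endian 16-bit value.

    The bounds check below is the fix. Before it existed, a short
    input raised IndexError instead of a clean ValueError.
    """
    if offset < 0 or offset + 2 > len(data):
        raise ValueError("not enough data")
    return data[offset] | (data[offset + 1] << 8)


class ReadU16Regression(unittest.TestCase):
    def test_truncated_input_is_rejected(self):
        # Reproduces the reported crash; written before the fix.
        with self.assertRaises(ValueError):
            read_u16_le(b"\x34", 0)

    def test_valid_input_still_works(self):
        # Guards against the fix breaking the normal path.
        self.assertEqual(read_u16_le(b"\x34\x12", 0), 0x1234)


def run_regression_suite():
    """Run the regression tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReadU16Regression)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_regression_suite().wasSuccessful()` reports whether the suite passed, which makes it easy to hook into a nightly test run.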

Monitoring systems

• The digital world changes over time
• New threats and attack vectors are discovered
• We must monitor our systems
• Many solutions exist: Tripwire, Nessus, OSSEC, …
• Simple custom scripts have the advantage of being unknown to the attackers
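
As an example of what such a simple custom script might look like, here is a small file integrity check. It is a sketch under stated assumptions, not a replacement for the tools listed above. The function names `snapshot`, `diff_snapshots` and `demo` are made up for this illustration. It records a SHA-256 digest for every file under a directory and later reports what was added, removed or changed.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root):
    """Map each file under root (relative path) to its SHA-256 hex digest."""
    root = Path(root)
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[str(path.relative_to(root))] = digest
    return result


def diff_snapshots(baseline, current):
    """Compare two snapshots and list added, removed and changed files."""
    added = sorted(set(current) - set(baseline))
    removed = sorted(set(baseline) - set(current))
    changed = sorted(
        p for p in baseline.keys() & current.keys() if baseline[p] != current[p]
    )
    return {"added": added, "removed": removed, "changed": changed}


def demo():
    """Simulate tampering in a throwaway directory and return the report."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "config.txt").write_text("mode=safe\n")
        (root / "old.log").write_text("x\n")
        baseline = snapshot(root)
        # Tamper: modify one file, delete one, add one.
        (root / "config.txt").write_text("mode=unsafe\n")
        (root / "old.log").unlink()
        (root / "new.bin").write_bytes(b"\x00")
        return diff_snapshots(baseline, snapshot(root))
```

In real use the baseline would have to be stored somewhere the monitored machine cannot modify. Otherwise an attacker who finds the script can simply rewrite the baseline as well.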

Proving the software correctness

• Software verification tools
• Need support from the programming languages
  – C++ is not the best language for verification
  – Where language support is lacking, at least good coding practices must be adopted
  – Unfortunately this comes with a price to pay (MISRA and other guidelines)
• Code generation tools
• Eventually these tools will become mainstream

Thank you!

Thank you for your attention! Questions?