LLVM Michel Guillet [email protected] @guilletmichel

LLVM Internal Architecture by Michel Guillet


What is LLVM?

• LLVM originally stood for: Low Level Virtual Machine

• “The LLVM Project is a collection of modular and reusable compiler and toolchain technologies” — llvm.org

• LLVM Core: backend, assembly code generator for 10+ CPUs (60+ for GCC)

• LLDB: debugger

• Clang: C/C++/Objective-C frontend + static analyser


History

• March 22, 1987: GCC 0.9, first beta version

• Circa 2000: LLVM Project started

• Circa 2005: C. Lattner hired by Apple

• July 18, 2007: GCC 4.2.1, last GPLv2 version

• Circa 2010: C. Lattner began working on Swift

• October 12, 2011: Xcode 4.2, LLVM only


Why did GCC become an issue?

• GPLv3: Apple would have been forced to open Xcode and other build tools related to GCC

• Not modular: Poor integration into the IDE

• 7+ MLOC: Too big, too old

• Objective-C: Not a priority for GCC team, slow improvements


But what is “compilation”?

• Compilation: transformation of source code into another form of code (at a lower level of abstraction)

• Example: C into x86 assembly or Ada into PowerPC assembly

• Transpilation: compilation into a language of similar abstraction

• Example: CoffeeScript into JavaScript or LESS into CSS


But what is “compilation”?

1) Algorithm

2) Implementation (C, Java)

3) IR (GIMPLE, LLVM IR)

4) Assembly (x86-64, PPC, ARM)

5) Microcode

• You: from 1) to 2)

• Front end (Clang): from 2) to 3)

• Back end (LLVM): from 3) to 4)

• CPU (x86, ARM): from 4) to 5)


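Front End

1. Lexical analysis

2. Syntax analysis

3. Semantic analysis

int foo = 4 + 2 * ( bar - 3 )

[Figure: syntax tree of the declaration. Root “=” with children “int foo” and “+”; “+” has children “4” and “x” (multiplication); “x” has children “2” and “–”; “–” has children “bar” and “3”.]

Questions asked during semantic analysis:

• Do we know “int”? Does “foo” already exist?

• Do we know “x”?

• Is this an int?

• Do we know “bar” at this stage? Is this an int?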
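The three front-end steps can be sketched for the declaration `int foo = 4 + 2 * ( bar - 3 )`. This is a minimal toy in Python, not how Clang is implemented: the names `tokenize`, `Parser` and `check`, the tuple-based tree, and the tiny grammar (one declaration, `+`, `-`, `*`, parentheses) are all assumptions made for illustration.

```python
import re

def tokenize(source):
    """Lexical analysis: split the text into (kind, text) tokens."""
    tokens = []
    for text in re.findall(r"\d+|[A-Za-z_]\w*|[=+\-*()]", source):
        if text.isdigit():
            kind = "num"
        elif text[0].isalpha() or text[0] == "_":
            kind = "name"
        else:
            kind = "sym"
        tokens.append((kind, text))
    return tokens

class Parser:
    """Syntax analysis: recursive descent over the token list."""
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("eof", "")

    def take(self, kind=None, text=None):
        tok = self.peek()
        if (kind and tok[0] != kind) or (text and tok[1] != text):
            raise SyntaxError(f"unexpected token {tok}")
        self.pos += 1
        return tok

    def declaration(self):
        # declaration := type name '=' expr
        type_name = self.take("name")[1]
        var_name = self.take("name")[1]
        self.take("sym", "=")
        return ("decl", type_name, var_name, self.expr())

    def expr(self):
        # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in (("sym", "+"), ("sym", "-")):
            op = self.take()[1]
            node = (op, node, self.term())
        return node

    def term(self):
        # term := factor ('*' factor)*
        node = self.factor()
        while self.peek() == ("sym", "*"):
            self.take()
            node = ("*", node, self.factor())
        return node

    def factor(self):
        # factor := number | name | '(' expr ')'
        kind, text = self.peek()
        if kind == "num":
            self.take()
            return ("num", int(text))
        if kind == "name":
            self.take()
            return ("var", text)
        self.take("sym", "(")
        node = self.expr()
        self.take("sym", ")")
        return node

def check(decl, known_types, symbols):
    """Semantic analysis: answer the questions from the slide."""
    _, type_name, var_name, value = decl
    errors = []
    if type_name not in known_types:          # do we know "int"?
        errors.append(f"unknown type {type_name}")
    if var_name in symbols:                   # does "foo" already exist?
        errors.append(f"{var_name} already declared")

    def visit(node):
        if node[0] == "var":
            if node[1] not in symbols:        # do we know "bar" at this stage?
                errors.append(f"{node[1]} is not declared")
            elif symbols[node[1]] != type_name:   # is this an int?
                errors.append(f"{node[1]} is not of type {type_name}")
        elif node[0] != "num":
            visit(node[1])
            visit(node[2])

    visit(value)
    return errors

tree = Parser(tokenize("int foo = 4 + 2 * ( bar - 3 )")).declaration()
print(tree)
print(check(tree, {"int"}, {"bar": "int"}))   # no errors: []
print(check(tree, {"int"}, {}))               # "bar" is unknown
```

The precedence of `*` over `+` falls out of the `expr`/`term`/`factor` split, which is why the multiplication ends up below the addition in the tree.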

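Front End

4. Optimization

5. IR generation

int foo = 4 + 2 * ( bar - 3 )

[Figure: the syntax tree of the declaration before optimization, and the simplified tree after it. After optimization: root “=” with children “int foo” and “-”; “-” has children “x” (multiplication) and “2”; “x” has children “bar” and “2”.]

%tmp = mul i32 %bar, 2
%foo1 = sub i32 %tmp, 2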
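The simplification from `4 + 2 * ( bar - 3 )` to `bar * 2 - 2` and the two IR lines can be reproduced with a small sketch. It assumes the expression is linear in a single variable and rewrites it as `a * bar + b`; the helpers `linear_form` and `emit_ir` and the tuple-based tree are hypothetical, and a real compiler uses far more general passes.

```python
# Syntax tree of 4 + 2 * ( bar - 3 ), written as nested tuples.
TREE = ("+", ("num", 4), ("*", ("num", 2), ("-", ("var", "bar"), ("num", 3))))

def linear_form(node, var):
    """Return (a, b) such that the expression equals a * var + b."""
    kind = node[0]
    if kind == "num":
        return (0, node[1])
    if kind == "var":
        if node[1] != var:
            raise ValueError(f"unexpected variable {node[1]}")
        return (1, 0)
    la, lb = linear_form(node[1], var)
    ra, rb = linear_form(node[2], var)
    if kind == "+":
        return (la + ra, lb + rb)
    if kind == "-":
        return (la - ra, lb - rb)
    if kind == "*":
        if la and ra:
            raise ValueError("expression is not linear")
        # (la*x + lb) * (ra*x + rb) with la*ra == 0
        return (la * rb + ra * lb, lb * rb)
    raise ValueError(f"unknown operator {kind}")

def emit_ir(dest, var, a, b):
    """Emit LLVM-IR-like text for dest = a * var + b (i32 only)."""
    lines = [f"%tmp = mul i32 %{var}, {a}"]
    if b >= 0:
        lines.append(f"%{dest} = add i32 %tmp, {b}")
    else:
        lines.append(f"%{dest} = sub i32 %tmp, {-b}")
    return lines

a, b = linear_form(TREE, "bar")      # (2, -2)
for line in emit_ir("foo1", "bar", a, b):
    print(line)
# %tmp = mul i32 %bar, 2
# %foo1 = sub i32 %tmp, 2
```

Working on the tree before any IR exists is what lets the constants `4` and `2 * -3` be merged into a single `-2`.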


Back End

• Optimization

• Assembly generation

%tmp = mul i32 %bar, 2
%foo1 = sub i32 %tmp, 2

%0 = add i32 %bar, %bar
%foo1 = sub i32 %0, 2

[Figure: the generated output, shown as binary]

11110010 11100101 11101011 11010000 11000100 11000110 00110000 00100000 01101111 00111100 00111100 00100000 01101110 01100111 01110100
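The back-end rewrite of `mul i32 %bar, 2` into `add i32 %bar, %bar` is a strength reduction: a multiplication by two is replaced by an addition. A minimal sketch of such a peephole rule over IR text follows. The function name `strength_reduce` and the regular expression are assumptions for illustration, and unlike the slide this sketch keeps the name `%tmp` instead of renumbering it to `%0`. LLVM itself works on in-memory IR, not on text.

```python
import re

# Matches "<dest> = mul i32 <operand>, 2" where both names are %registers.
MUL_BY_TWO = re.compile(r"^(%[\w.]+) = mul i32 (%[\w.]+), 2$")

def strength_reduce(lines):
    """Replace x * 2 with x + x, leaving every other line untouched."""
    result = []
    for line in lines:
        match = MUL_BY_TWO.match(line.strip())
        if match:
            dest, operand = match.groups()
            result.append(f"{dest} = add i32 {operand}, {operand}")
        else:
            result.append(line)
    return result

before = [
    "%tmp = mul i32 %bar, 2",
    "%foo1 = sub i32 %tmp, 2",
]
for line in strength_reduce(before):
    print(line)
# %tmp = add i32 %bar, %bar
# %foo1 = sub i32 %tmp, 2
```

Whether the addition is actually cheaper depends on the target CPU, which is why this kind of decision sits in the back end.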


This is not new!

• GCC has a similar design…

• … but it’s not modular

• You cannot hack your way in

• LLVM was built modularly from the beginning

Why modularity is a big deal

• No need to deal with CPU specifics

• Creating a new language is easier (smaller skill set required)

• Adding new CPU support is easier (done once, for every language)

• Creating tools around languages is easier (Clang Static Analyzer, Refactoring)
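The "do it once" argument can be made concrete with a toy: if every front end produces the same IR and every back end consumes it, M languages and N CPUs need M + N components instead of M x N dedicated compilers. Everything in this sketch is hypothetical (the two mini-languages, the two fake targets, the tuple IR); it only illustrates the shape of the design.

```python
# Shared IR for this toy: a tuple (op, left, right) with integer operands.
OPCODES = {"+": "add", "-": "sub", "*": "mul"}

def front_infix(text):
    """Hypothetical language A: '2 + 3'."""
    left, op, right = text.split()
    return (op, int(left), int(right))

def front_prefix(text):
    """Hypothetical language B: '+ 2 3'."""
    op, left, right = text.split()
    return (op, int(left), int(right))

def back_stack(ir):
    """Hypothetical stack machine target."""
    op, left, right = ir
    return [f"push {left}", f"push {right}", OPCODES[op]]

def back_register(ir):
    """Hypothetical register machine target."""
    op, left, right = ir
    return [f"mov r0, {left}", f"{OPCODES[op]} r0, {right}"]

FRONT_ENDS = {"infix": front_infix, "prefix": front_prefix}
BACK_ENDS = {"stack": back_stack, "register": back_register}

def compile_source(language, target, text):
    """Any front end can be paired with any back end through the IR."""
    return BACK_ENDS[target](FRONT_ENDS[language](text))

print(compile_source("infix", "stack", "2 + 3"))
# ['push 2', 'push 3', 'add']
print(compile_source("prefix", "register", "* 4 5"))
# ['mov r0, 4', 'mul r0, 5']
```

Adding a third target here means writing one new function and registering it in `BACK_ENDS`; both languages get it with no further work.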

WebKit FTL

https://www.webkit.org/blog/3362/introducing-the-webkit-ftl-jit/

Q&A


Thanks!
