
Page 1: Dynamic Compilation using LLVM - Heidelberg University

Dynamic Compilation using LLVM

Alexander Matz

Institute of Computer Engineering

University of Heidelberg

[email protected]

Page 2: Dynamic Compilation using LLVM - Heidelberg University

Outline

Motivation

Architecture and Comparison

Just-In-Time Compilation with LLVM

Runtime/profile-guided optimization

LLVM in other projects

Conclusion and Outlook

2

Page 3: Dynamic Compilation using LLVM - Heidelberg University

Motivation

Traits of an ideal compiler

• Fast compilation

• (Compile-link-execute model)

• Platform and source independence

• Low startup latency of resulting executable

• Low runtime overhead

• Aggressive optimization

• Zero-effort adaptation to patterns of use at runtime

No current system has all these traits

LLVM aims to fill the gap

3

Page 4: Dynamic Compilation using LLVM - Heidelberg University

What is LLVM

LLVM (“Low Level Virtual Machine”) consists of:

• A virtual instruction set architecture, not meant to actually run on a real CPU or in a virtual machine

• A modular compiler framework and runtime environment to build, run, and, most importantly, optimize programs written in any language that has an LLVM frontend

Primarily designed as a library, not as a “tool”

4

Page 5: Dynamic Compilation using LLVM - Heidelberg University

Existing Technologies

5

Page 6: Dynamic Compilation using LLVM - Heidelberg University

Existing technologies

Statically compiled and linked (C/C++ etc.)

Virtual Machine based (Java, C# etc.)

Interpreted (Javascript, Perl)

6

Page 7: Dynamic Compilation using LLVM - Heidelberg University

Existing technologies

Statically compiled and linked (C/C++ etc.)

• Static machine code generation early on

• Platform dependent

• Optimization across different translation units (.c files) is difficult

• Optimization at link time is difficult (no high-level information available)

• Profile-guided optimization requires a change of the build model

• Optimization at runtime is not possible at all

Virtual Machine based (Java, C# etc.)

Interpreted (Javascript, Perl)

7

Page 8: Dynamic Compilation using LLVM - Heidelberg University

Existing technologies

Statically compiled and linked (C/C++ etc.)

Virtual Machine based (Java, C# etc.)

• Keep a high-level intermediate representation (IR) for as long as possible

• “Lazy” machine code generation

• Platform independent

• Allows aggressive runtime optimization

• Only a few (fast) low-level optimizations are possible on that IR

• The just-in-time compiler has to do all the hard and cumbersome work

Interpreted (Javascript, Perl)

8

Page 9: Dynamic Compilation using LLVM - Heidelberg University

Existing technologies

Statically compiled and linked (C/C++ etc.)

Virtual Machine based (Java, C# etc.)

Interpreted (Javascript, Perl)

• No native machine code representation generated at all

• Platform independent

• Fast build process

• Optimizations difficult in general

9

Page 10: Dynamic Compilation using LLVM - Heidelberg University

Architecture and Comparison

10

Page 11: Dynamic Compilation using LLVM - Heidelberg University

LLVM System Architecture

LLVM aims to combine the advantages without keeping the disadvantages by

• Keeping a low-level representation (LLVM IR) of the program at all times

• Adding high-level information to the IR

• Making the IR target and source independent

Source: Lattner, 2002

11

Page 12: Dynamic Compilation using LLVM - Heidelberg University

Distinction

Difference to statically compiled and linked languages

Type information is preserved through the whole lifecycle

Machine code generation is the last step and can also happen just in time

12

Page 13: Dynamic Compilation using LLVM - Heidelberg University

Distinction

Difference to VM based languages

LLVM IR is not supposed to run on a VM

The IR is much lower level (no runtime or object model)

No guaranteed safety (programs written to misbehave still misbehave)

13

Page 14: Dynamic Compilation using LLVM - Heidelberg University

Benefits

Low Level IR

High Level Type Information

Modular/library approach revolving around LLVM IR

14

Page 15: Dynamic Compilation using LLVM - Heidelberg University

Benefits

Low Level IR

• Potentially ALL programming languages can be translated into LLVM IR

• Low level optimizations can be done early on

• Machine code generation is cheap

• Mapping generated machine code to the corresponding IR is simple

High Level Type Information

Modular/library approach revolving around LLVM IR

15

Page 16: Dynamic Compilation using LLVM - Heidelberg University

Benefits

Low Level IR

High Level Type Information

• Allows data structure analysis on the whole program

• Examples of optimizations this makes possible

• Pool allocators for complex types (see the sketch below)

• Restructuring data types

• Used in another project to prove programs safe (Control-C, Kowshik et al., 2003)

Modular/library approach revolving around LLVM IR
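To picture what pool allocation buys: LLVM performs this transformation automatically on the IR, but its effect can be sketched at the source level. A minimal, illustrative C++ sketch follows; the NodePool and Node types are hypothetical and not part of LLVM.

#include <deque>

struct Node { int value; Node *next; };

// Without pool allocation every Node comes from its own heap allocation,
// scattered across memory. With a pool, all Nodes of one list live in a few
// contiguous chunks, improving cache locality, and are freed together.
class NodePool {
  std::deque<Node> storage;  // chunked storage; references stay valid on push_back
public:
  Node *allocate(int value, Node *next) {
    storage.push_back(Node{value, next});
    return &storage.back();
  }
};

Node *buildList(NodePool &pool, int n) {
  Node *head = nullptr;
  for (int i = 0; i < n; ++i)
    head = pool.allocate(i, head);   // pooled instead of one new per node
  return head;
}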

16

Page 17: Dynamic Compilation using LLVM - Heidelberg University

Benefits

Low Level IR

High Level Type Information

Modular/library approach revolving around LLVM IR

• All optimization modules can be reused in every project using the LLVM IR (see the sketch below)

• Not limited to specific targets (like x86); see the other projects using LLVM

• Huge synergy effects
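As a rough illustration of the library approach, the sketch below shows how a client project might run LLVM's reusable optimization passes over a module. It assumes the legacy PassManager API of the LLVM 3.x era this talk refers to; header locations and class names differ in later releases.

#include "llvm/PassManager.h"
#include "llvm/IR/Module.h"
#include "llvm/Transforms/Scalar.h"   // createPromoteMemoryToRegisterPass, ...
#include "llvm/Transforms/IPO.h"      // createFunctionInliningPass

// Any project that produces an llvm::Module (a static compiler, a JIT, a
// PTX translator, ...) can reuse the very same optimization passes.
void optimizeModule(llvm::Module &M) {
  llvm::PassManager PM;
  PM.add(llvm::createPromoteMemoryToRegisterPass());  // mem2reg
  PM.add(llvm::createFunctionInliningPass());         // inliner
  PM.add(llvm::createInstructionCombiningPass());     // instcombine
  PM.add(llvm::createCFGSimplificationPass());        // simplifycfg
  PM.run(M);                                           // transforms M in place
}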

17

Page 18: Dynamic Compilation using LLVM - Heidelberg University

Just-In-Time Compilation with LLVM

18

Page 19: Dynamic Compilation using LLVM - Heidelberg University

Just-In-Time Compilation with LLVM

Lazy machine code generation at runtime

All target-independent optimizations are already done at this point

Target-specific optimizations are applied here

Supposed to keep both native code and LLVM IR, with additional information on the mapping between them

Currently two options on x86 architectures with GNU/Linux
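A minimal sketch of what lazy code generation looks like from the client side, assuming the LLVM 3.x-era C++ JIT API (not code from the slides): a trivial function is built as LLVM IR at runtime, and native code is emitted only when a pointer to it is requested.

#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/ExecutionEngine/ExecutionEngine.h"
#include "llvm/ExecutionEngine/JIT.h"        // pulls in the legacy JIT engine
#include "llvm/Support/TargetSelect.h"
#include <vector>

int main() {
  llvm::InitializeNativeTarget();            // enable the native codegen backend
  llvm::LLVMContext ctx;
  llvm::Module *mod = new llvm::Module("jit_demo", ctx);

  // Emit the IR for: int add(int a, int b) { return a + b; }
  llvm::Type *i32 = llvm::Type::getInt32Ty(ctx);
  std::vector<llvm::Type *> params(2, i32);
  llvm::Function *add = llvm::Function::Create(
      llvm::FunctionType::get(i32, params, false),
      llvm::Function::ExternalLinkage, "add", mod);
  llvm::IRBuilder<> b(llvm::BasicBlock::Create(ctx, "entry", add));
  llvm::Function::arg_iterator args = add->arg_begin();
  llvm::Value *lhs = &*args++;
  llvm::Value *rhs = &*args;
  b.CreateRet(b.CreateAdd(lhs, rhs, "sum"));

  // Machine code is generated lazily, the first time the function is needed.
  llvm::ExecutionEngine *ee = llvm::EngineBuilder(mod).create();
  typedef int (*AddFn)(int, int);
  AddFn fn = reinterpret_cast<AddFn>(ee->getPointerToFunction(add));
  return fn(2, 3) == 5 ? 0 : 1;              // calls the JIT-compiled code
}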

19

Page 20: Dynamic Compilation using LLVM - Heidelberg University

Just-In-Time Compilation with LLVM

Clang (LLVM frontend) as a drop-in replacement for gcc

Results in a statically linked native executable (much like with gcc)

No LLVM IR kept, no more optimizations after linking

Executable performance comparable to gcc

20

Page 21: Dynamic Compilation using LLVM - Heidelberg University

Just-In-Time Compilation with LLVM

Clang as frontend only

Results in runnable LLVM bitcode

No native code kept, but bitcode still optimizable

Target specific optimizations are applied automatically

Higher startup latency

21

Page 22: Dynamic Compilation using LLVM - Heidelberg University

Runtime/profile-guided optimization

22

Page 23: Dynamic Compilation using LLVM - Heidelberg University

Runtime/profile-guided optimization

All optimizations that cannot be predicted at compile/link time (patterns of use / profile)

Needs instrumentation (= performance penalty; see the sketch below)

Examples of profile-guided optimizations

• Identifying frequently called functions and optimizing them more aggressively

• Rearranging basic blocks to leverage locality and avoid jumps

• Recompiling code under risky assumptions (sophisticated, but the highest performance gain)
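As a rough idea of where the instrumentation penalty comes from, the sketch below shows a hypothetical LLVM pass (legacy pass API, 3.x-era header layout) that inserts a per-function entry counter; it is illustrative only and not an existing LLVM pass.

#include "llvm/Pass.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/IRBuilder.h"

using namespace llvm;

namespace {
struct EntryCounter : public FunctionPass {
  static char ID;
  EntryCounter() : FunctionPass(ID) {}

  virtual bool runOnFunction(Function &F) {
    if (F.isDeclaration())
      return false;
    Module *M = F.getParent();
    LLVMContext &ctx = F.getContext();
    Type *i64 = Type::getInt64Ty(ctx);

    // One private i64 counter per instrumented function.
    GlobalVariable *ctr = new GlobalVariable(
        *M, i64, /*isConstant=*/false, GlobalValue::InternalLinkage,
        ConstantInt::get(i64, 0), "__prof_" + F.getName().str());

    // counter++ at the very start of the function.
    IRBuilder<> b(&*F.getEntryBlock().getFirstInsertionPt());
    Value *old = b.CreateLoad(ctr, "cnt");
    b.CreateStore(b.CreateAdd(old, ConstantInt::get(i64, 1)), ctr);
    return true;                     // the IR was modified
  }
};
}

char EntryCounter::ID = 0;
static RegisterPass<EntryCounter> X("entry-counter",
                                    "Insert per-function entry counters");

Every instrumented function entry now pays for an extra load, add, and store, which is exactly the performance penalty the slide refers to.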

23

Page 24: Dynamic Compilation using LLVM - Heidelberg University

Runtime/profile-guided optimization

Statically compiled and linked approach:

• Compile-link-execute becomes compile-link-profile-compile-link-execute

• In most cases the developers, not the users, profile the application

• Still no runtime optimization

Result: Profile-guided optimization is skipped most of the time

24

Page 25: Dynamic Compilation using LLVM - Heidelberg University

Runtime/profile-guided optimization

VM based languages approach:

• High level representation kept at all times

• Runtime environment profiles the application in the field without manual effort

• Hot paths analyzed and optimized (Java HotSpot)

• Expensive optimizations and code generation compete for CPU cycles with the running application

25

Page 26: Dynamic Compilation using LLVM - Heidelberg University

Runtime/profile-guided optimization

LLVM approach (goal):

• Low level representation is kept

• Runtime environment profiles the application in the field

• Cheap optimizations are done at runtime

• Expensive optimizations are done during idle time

26

Page 27: Dynamic Compilation using LLVM - Heidelberg University

Runtime/profile-guided optimization

Result (ideal)

• Many optimizations are already done on the LLVM IR before execution

• Runtime and offline optimizers adapt to the patterns of use and become more dormant over time

• No additional development effort necessary

Current limitations (on x86 + GNU/Linux)

• No actual optimization at runtime; the JIT compiler is invoked once at startup and does not adapt to patterns of use during execution

• Profile-guided optimization possible, but only between runs

• Instrumentation needs to be manually enabled/disabled

27

Page 28: Dynamic Compilation using LLVM - Heidelberg University

LLVM in other projects

28

Page 29: Dynamic Compilation using LLVM - Heidelberg University

LLVM in other projects

Ocelot: Allows PTX (CUDA) kernels to run on heterogeneous systems containing various GPUs and CPUs

PLANG: Similar project, but limited to execution of PTX kernels on x86 CPUs

OpenCL to FPGA compiler for the Viroquant project by Markus Gipp

29

Page 30: Dynamic Compilation using LLVM - Heidelberg University

Ocelot/PLANG

Idea: the Bulk Synchronous Parallel programming model fits the many-core trend perfectly

• GPU applications are partitioned without technical limitations in mind (thousands or millions of threads, think of PCAM)

• Threads are reduced and mapped to run on as many CPU cores as are available (<100); see the sketch below

• Automatic mapping to the available cores brings back automatic speedup with newer CPUs/GPUs
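The sketch below illustrates the core of this mapping in plain C++ (not Ocelot/PLANG code): each CPU worker thread serializes a share of the many logical GPU thread IDs. Barrier handling, which makes the real transformation much harder, is omitted, and the kernel and variable names are hypothetical.

#include <thread>
#include <vector>
#include <cstddef>

// A GPU-style kernel body: one logical thread per array element.
void saxpyKernel(std::size_t tid, float a, const float *x, float *y) {
  y[tid] = a * x[tid] + y[tid];
}

// "Reduce" logicalThreads down to numWorkers CPU threads.
void runKernelOnCpu(std::size_t logicalThreads, unsigned numWorkers,
                    float a, const float *x, float *y) {
  std::vector<std::thread> workers;
  for (unsigned w = 0; w < numWorkers; ++w) {
    workers.emplace_back([=] {
      // Each worker serializes a strided share of the logical thread IDs.
      for (std::size_t tid = w; tid < logicalThreads; tid += numWorkers)
        saxpyKernel(tid, a, x, y);
    });
  }
  for (auto &t : workers) t.join();
}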

30

Page 31: Dynamic Compilation using LLVM - Heidelberg University

Ocelot/PLANG

Both projects can be seen as LLVM frontends when used for x86 exclusively

Benefits

• “Easy” implementation, since PTX is similar to LLVM IR

• Most optimizations can be taken “as is”

• x86 code generation already available

Drawbacks

• Information is lost when transforming PTX to LLVM IR

• Large software overhead due to GPU features not being present on CPUs

[Pipeline diagram: CUDA → (nvcc) → PTX → (Ocelot/PLANG) → LLVM IR → (LLVM) → x86 native]

31

Page 32: Dynamic Compilation using LLVM - Heidelberg University

OpenCL to FPGA compiler for Viroquant

Can be seen as an LLVM backend

Benefits

• Compiler already available (OpenCL treated as plain C)

• Again, optimizations can be taken as is

Drawbacks

• Translation from LLVM IR to VHDL is very complex

[Pipeline diagram: OpenCL → (Clang/LLVM) → LLVM IR → (VHDL code generator) → VHDL code]

32

Page 33: Dynamic Compilation using LLVM - Heidelberg University

Conclusion and Outlook

33

Page 34: Dynamic Compilation using LLVM - Heidelberg University

Conclusion

Mature compiler framework used in many projects and as an alternative to gcc (comparable performance)

Interesting new language-independent optimizations

Some important features still missing (for x86)

• Actual runtime optimization

• Profile-guided optimization without manual intervention

• Keeping native code with LLVM IR to reduce startup latency

Not yet the “ideal” compiler

34

Page 35: Dynamic Compilation using LLVM - Heidelberg University

Outlook

Missing features can be expected to be implemented

• Very active user/developer base (including Apple and NVIDIA)

• Clean codebase, no legacy issues (as opposed to gcc)

• Modular/library approach leverages synergy effects

Performance is already (mostly) on a level with gcc-generated code and may outperform gcc in the future

Projects like Ocelot, PLANG, and the VHDL compiler may help to overcome the burdens of increasingly complex systems

Profile-guided optimization reduces the need for good branch prediction in CPUs -> simpler, more energy-efficient CPUs

35

Page 36: Dynamic Compilation using LLVM - Heidelberg University

Thank you