26
An Overview of a Compiler Y.N. Srikant Department of Computer Science and Automation Indian Institute of Science Bangalore 560 012 NPTEL Course on Principles of Compiler Design Y.N. Srikant Compiler Overview

An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

An Overview of a Compiler

Y.N. Srikant

Department of Computer Science and AutomationIndian Institute of Science

Bangalore 560 012

NPTEL Course on Principles of Compiler Design

Y.N. Srikant Compiler Overview

Page 2: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Outline of the Lecture

About the courseWhy should we study compiler design?Compiler overview with block diagrams

Y.N. Srikant Compiler Overview

Page 3: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

About the Course

A detailed look at the internals of a compilerDoes not assume any background but is intensiveDoing programming assignments and solving theoreticalproblems are both essentialA compiler is an excellent example of theory translated intopractice in a remarkable way

Y.N. Srikant Compiler Overview

Page 4: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Why Should We Study Compiler Design?

Compilers are everywhere!Many applications for compiler technology

Parsers for HTML in web browserInterpreters for javascript/flashMachine code generation for high level languagesSoftware testingProgram optimizationMalicious code detectionDesign of new computer architectures

Compiler-in-the-loop hardware development

Hardware synthesis: VHDL to RTL translationCompiled simulation

Used to simulate designs written in VHDLNo interpretation of design, hence faster

Y.N. Srikant Compiler Overview

Page 5: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

About the Complexity of Compiler Technology

A compiler is possibly the most complex system softwareand writing it is a substantial exercise in softwareengineeringThe complexity arises from the fact that it is required tomap a programmer’s requirements (in a HLL program) toarchitectural detailsIt uses algorithms and techniques from a very largenumber of areas in computer scienceTranslates intricate theory into practice - enables toolbuilding

Y.N. Srikant Compiler Overview

Page 6: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

About the Nature of Compiler Algorithms

Draws results from mathematical logic, lattice theory, linearalgebra, probability, etc.

type checking, static analysis, dependence analysis andloop parallelization, cache analysis, etc.

Makes practical application ofGreedy algorithms - register allocationHeuristic search - list schedulingGraph algorithms - dead code elimination, registerallocationDynamic programming - instruction selectionOptimization techniques - instruction schedulingFinite automata - lexical analysisPushdown automata - parsingFixed point algorithms - data-flow analysisComplex data structures - symbol tables, parse trees, datadependence graphsComputer architecture - machine code generation

Y.N. Srikant Compiler Overview

Page 7: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Other Uses of Scanning and Parsing Techniques

Assembler implementationOnline text searching (GREP, AWK) and word processingWebsite filteringCommand language interpretersScripting language interpretation (Unix shell, Perl, Python)XML parsing and document tree constructionDatabase query interpreters

Y.N. Srikant Compiler Overview

Page 8: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Other Uses of Program Analysis Techniques

Converting a sequential loop to a parallel loopProgram analysis to determine if programs are data-racefreeProfiling programs to determine busy regionsProgram slicingData-flow analysis approach to software testing

Uncovering errors along all pathsDereferencing null pointersBuffer overflows and memory leaks

Worst Case Execution Time (WCET) estimation andenergy analysis

Y.N. Srikant Compiler Overview

Page 9: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Language Processing System

Y.N. Srikant Compiler Overview

Page 10: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Compiler Overview

Y.N. Srikant Compiler Overview

Page 11: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Compilers and Interpreters

Compilers generate machine code, whereas interpretersinterpret intermediate codeInterpreters are easier to write and can provide better errormessages (symbol table is still available)Interpreters are at least 5 times slower than machine codegenerated by compilersInterpreters also require much more memory than machinecode generated by compilersExamples: Perl, Python, Unix Shell, Java, BASIC, LISP

Y.N. Srikant Compiler Overview

Page 12: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Translation Overview - Lexical Analysis

Y.N. Srikant Compiler Overview

Page 13: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Lexical Analysis

LA can be generated automatically from regular expressionspecifications

LEX and Flex are two such tools

LA is a deterministic finite state automatonWhy is LA separate from parsing?

Simplification of design - software engineering reasonI/O issues are limited LA aloneLA based on finite automata are more efficient to implementthan pushdown automata used for parsing (due to stack)

Y.N. Srikant Compiler Overview

Page 14: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Translation Overview - Syntax Analysis

Y.N. Srikant Compiler Overview

Page 15: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Parsing or Syntax Analysis

Syntax analyzers (parsers) can be generated automaticallyfrom several variants of context-free grammarspecifications

LL(1), and LALR(1) are the most popular onesANTLR (for LL(1)), YACC and Bison (for LALR(1)) are suchtools

Parsers are deterministic push-down automataParsers cannot handle context-sensitive features ofprogramming languages; e.g.,

Variables are declared before useTypes match on both sides of assignmentsParameter types and number match in declaration and use

Y.N. Srikant Compiler Overview

Page 16: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Translation Overview - Semantic Analysis

Y.N. Srikant Compiler Overview

Page 17: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Semantic Analysis

Semantic consistency that cannot be handled at theparsing stage is handled hereType checking of various programming languageconstructs is one of the most important tasksStores type information in the symbol table or the syntaxtree

Types of variables, function parameters, array dimensions,etc.Used not only for semantic validation but also forsubsequent phases of compilation

Static semantics of programming languages can bespecified using attribute grammars

Y.N. Srikant Compiler Overview

Page 18: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Translation Overview - Intermediate Code Generation

Y.N. Srikant Compiler Overview

Page 19: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Intermediate Code Generation

While generating machine code directly from source codeis possible, it entails two problems

With m languages and n target machines, we need to writem × n compilersThe code optimizer which is one of the largest andvery-difficult-to-write components of any compiler cannot bereused

By converting source code to an intermediate code, amachine-independent code optimizer may be writtenIntermediate code must be easy to produce and easy totranslate to machine code

A sort of universal assembly languageShould not contain any machine-specific parameters(registers, addresses, etc.)

Y.N. Srikant Compiler Overview

Page 20: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Different Types of Intermediate Code

The type of intermediate code deployed is based on theapplicationQuadruples, triples, indirect triples, abstract syntax treesare the classical forms used for machine-independentoptimizations and machine code generationStatic Single Assignment form (SSA) is a recent form andenables more effective optimizations

Conditional constant propagation and global valuenumbering are more effective on SSA

Program Dependence Graph (PDG) is useful in automaticparallelization, instruction scheduling, and softwarepipelining

Y.N. Srikant Compiler Overview

Page 21: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Translation Overview - Code Optimization

Y.N. Srikant Compiler Overview

Page 22: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Machine-independent Code Optimization

Intermediate code generation process introduces manyinefficiencies

Extra copies of variables, using variables instead ofconstants, repeated evaluation of expressions, etc.

Code optimization removes such inefficiencies andimproves codeImprovement may be time, space, or power consumptionIt changes the structure of programs, sometimes of beyondrecognition

Inlines functions, unrolls loops, eliminates someprogrammer-defined variables, etc.

Code optimization consists of a bunch of heuristics andpercentage of improvement depends on programs (may bezero also)

Y.N. Srikant Compiler Overview

Page 23: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Examples of Machine-Independant Optimizations

Common sub-expression eliminationCopy propagationLoop invariant code motionPartial redundancy eliminationInduction variable elimination and strength reductionCode opimization needs information about the program

which expressions are being recomputed in a function?which definitions reach a point?

All such information is gathered through data-flow analysis

Y.N. Srikant Compiler Overview

Page 24: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Translation Overview - Code Generation

Y.N. Srikant Compiler Overview

Page 25: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Code Generation

Converts intermediate code to machine codeEach intermediate code instruction may result in manymachine instructions or vice-cersaMust handle all aspects of machine architecture

Registers, pipelining, cache, multiple function units, etc.Generating efficient code is an NP-complete problem

Tree pattern matching-based strategies are among the bestNeeds tree intermediate code

Storage allocation decisions are made hereRegister allocation and assignment are the most importantproblems

Y.N. Srikant Compiler Overview

Page 26: An Overview of a Compiler - IIT Hyderabadramakrishna/Compilers-Jan... · Optimization techniques - instruction scheduling Finite automata - lexical analysis Pushdown automata - parsing

Machine-Dependent Optimizations

Peephole optimizationsAnalyze sequence of instructions in a small window(peephole) and using preset patterns, replace them with amore efficient sequenceRedundant instruction eliminatione.g., replace the sequence [LD A,R1][ST R1,A] by [LDA,R1]Eliminate “jump to jump” instructionsUse machine idioms (use INC instead of LD and ADD)

Instruction scheduling (reordering) to eliminate pipelineinterlocks and to increase parallelismTrace scheduling to increase the size of basic blocks andincrease parallelismSoftware pipelining to increase parallelism in loops

Y.N. Srikant Compiler Overview