Crystal internals Part 1


Is a compiler a hard thing?

At Manas we usually do webapps

Let’s talk about webapps...

● HTML/CSS/JS
● React/Angular/Knockout
● Ruby/Erlang/Elixir
● Database (mysql/postgres)
● Elasticsearch
● Redis/Sidekiq/Background-jobs
● Docker, capistrano, deploy, servers

Easy…?

Let’s talk about compilers...

● HTML/CSS/JS
● React/Angular/Knockout
● Ruby/Erlang/Elixir
● Database (mysql/postgres)
● Elasticsearch
● Redis/Sidekiq/Background-jobs
● Docker, capistrano, deploy, servers

Easy!

Let’s talk about compilers...


No, let’s talk about usual programs

INPUT -> [PROCESSING…] -> OUTPUT

No, let’s talk about compilers

SOURCE CODE -> [PROCESSING…] -> EXECUTABLE

How do we go from source code to an executable?

Traditional stages of a compiler

    class Foo
      def bar
        1 + 2
      end
    end

● Lexer: [“class”, “Foo”, “;”, “def”, “bar”, “;”, “1”, “+”, “2”, “;”, “end”, “;”, “end”]
● Parser: ClassDef(“Foo”, body: [Def.new(“bar”)])
● Semantic (a.k.a. “type check”): make sure there are no type errors
● Codegen: generate machine code

Let’s start with the codegen phase

Goal: generate efficient assembly code for many architectures (32 bits, 64 bits, Intel, ARM, etc.)

● Generating assembly code is hard
● Generating efficient assembly code is harder
● Generating assembly code for many architectures is hard/tedious/boring


Thus: writing a compiler is HARD! :-(


Well, not anymore...

Codegen

With LLVM, we generate LLVM IR (intermediate representation) instead of assembly, and LLVM takes care of generating efficient assembly code for us!

The hardest part is solved :-)

Codegen: LLVM (example)

    define i32 @add(i32 %x, i32 %y) {
    entry:
      %0 = add i32 %x, %y
      ret i32 %0
    }

LLVM provides a nice API to generate IR

    require "llvm"

    mod = LLVM::Module.new("main")
    mod.functions.add("add", [LLVM::Int32, LLVM::Int32], LLVM::Int32) do |func|
      func.basic_blocks.append do |builder|
        res = builder.add(func.params[0], func.params[1])
        builder.ret(res)
      end
    end

    puts mod

Remaining phases

● Lexer
● Parser
● Semantic

Lexer

● Kind of easy: go char by char until we get a keyword, identifier, number, etc.
● We won’t go into implementation details...
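To make "go char by char" concrete, here is a toy lexer. It is not Crystal's real lexer: `TokenKind`, `Token` and `lex` are names invented for this sketch, and it only understands integers, identifiers and single-character operators.

```crystal
# Toy lexer (hypothetical, NOT the real Crystal lexer).
enum TokenKind
  Number
  Ident
  Op
end

record Token, kind : TokenKind, value : String

def lex(source : String) : Array(Token)
  tokens = [] of Token
  i = 0
  while i < source.size
    char = source[i]
    if char.ascii_whitespace?
      i += 1 # skip whitespace
    elsif char.ascii_number?
      start = i
      while i < source.size && source[i].ascii_number?
        i += 1
      end
      tokens << Token.new(TokenKind::Number, source[start...i])
    elsif char.ascii_letter? || char == '_'
      start = i
      while i < source.size && (source[i].ascii_alphanumeric? || source[i] == '_')
        i += 1
      end
      tokens << Token.new(TokenKind::Ident, source[start...i])
    else
      # Anything else is treated as a one-character operator
      tokens << Token.new(TokenKind::Op, char.to_s)
      i += 1
    end
  end
  tokens
end

p lex("foo + 12").map(&.value)
```

A real lexer also tracks line and column numbers, keywords, strings and comments, but the loop has the same shape.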

Parser

● Kind of easy: go token by token and create a tree of expressions
● This tree is called AST: Abstract Syntax Tree
● An AST is like a directed, acyclic graph
● We won’t go into implementation details...
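As a rough illustration of "go token by token and create a tree", the sketch below parses a chain of additions into a tiny AST. The `Node`, `Num` and `Add` classes and the `parse` and `dump` helpers are hypothetical and unrelated to the compiler's own `Crystal::ASTNode` hierarchy; tokens are assumed to be already lexed into strings.

```crystal
# Toy AST and parser (hypothetical, NOT the real Crystal parser).
abstract class Node
end

class Num < Node
  getter value : Int32

  def initialize(@value : Int32)
  end
end

class Add < Node
  getter left : Node
  getter right : Node

  def initialize(@left : Node, @right : Node)
  end
end

# Parses `number ('+' number)*`, building a left-associative tree.
def parse(tokens : Array(String)) : Node
  pos = 0
  node : Node = Num.new(tokens[pos].to_i)
  pos += 1
  while pos < tokens.size && tokens[pos] == "+"
    right = Num.new(tokens[pos + 1].to_i)
    node = Add.new(node, right)
    pos += 2
  end
  node
end

# Renders the tree so its shape is visible.
def dump(node : Node) : String
  case node
  when Num then node.value.to_s
  when Add then "Add(#{dump(node.left)}, #{dump(node.right)})"
  else          raise "unknown node"
  end
end

puts dump(parse(["1", "+", "2", "+", "3"]))
```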

Semantic

● This is the fundamental piece of the compiler
● It takes an AST as input and analyzes it
● Analysis can result in:
    ○ Declaring types: for example “class Foo; end” will declare a type Foo
    ○ Checking methods: for example “Foo.bar” will check that “Foo” is a declared type and that the method “bar” exists in it, and has the correct arity and types
    ○ Giving each non-dead expression in the program a type
    ○ Gathering some info for the codegen phase: for example, knowing the local variables of a method and their types

Semantic

● The interesting part of the compiler is the semantic phase
● It’s just about processing an AST
● In Crystal’s compiler you just need to know one language: Crystal!
● No HTML/CSS/JS/JSX/etc.
● No untyped, dynamic languages: no Ruby/Erlang/Elixir. Type safe!
● Stuff is processed in memory
● No databases, no Elasticsearch, no Redis


Writing a compiler is easier than writing a web app! ^_^


(Or at least it’s more fun :-P)


Directory layout

● src/compiler/crystal
    ○ command/ : the command line interface
    ○ syntax/ : lexer, parser, ast, visitor, transformer
    ○ semantic/ : type declaration, method lookup, etc.
    ○ macros/ : macro expansion logic
    ○ codegen/ : codegen
    ○ tools/ : doc generator, formatter, init
    ○ compiler.cr : combines syntax + semantic + codegen
    ○ types.cr : all possible types in Crystal (Int32, String, unions, custom types, etc.)
    ○ program.cr : holds definitions of a program (holds Int32, String, etc.)

Directory layout

● src/compiler/crystal : ~43K LOC
    ○ command/ : ~300 LOC
    ○ syntax/ : ~10K LOC
    ○ semantic/ : ~12K LOC
    ○ macros/ : ~2K LOC
    ○ codegen/ : ~6K LOC
    ○ tools/ : ~7K LOC
    ○ compiler.cr : ~300 LOC
    ○ types.cr : ~2K LOC
    ○ program.cr : ~300 LOC


About 14K LOC to analyze source code.

One big Rails app at Manas has 14K LOC in “./app”

A compiler can’t be that hard! ;-)

Show me the code

    # src/compiler/crystal/compiler.cr
    def compile(source : Source | Array(Source), output_filename : String) : Result
      source = [source] unless source.is_a?(Array)
      program = new_program(source)
      node = parse program, source
      node = program.semantic node, @stats
      codegen program, node, source, output_filename unless @no_codegen
      Result.new program, node
    end


What is a program?

Program

● Holds all types and top-level methods for a given compilation
● For example, if I compile “class Foo; end” and you compile “class Bar; end”, the first program will have a type named “Foo”, and the second one won’t (but it will have a type named “Bar”)
● It lets us test the compiler more easily, because we can use different Program instances for each snippet of code that we want to test
● This is in contrast to having global variables hold all of a program’s data
● A Program is passed around in all phases of a compilation (except lexing and parsing, which don’t need semantic info)
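The benefit of one Program instance per compilation can be shown with a toy stand-in. `ToyProgram` and its `declare` method are hypothetical and far simpler than the real `Crystal::Program`; the point is only that each instance owns its own set of declared types, so nothing leaks between compilations.

```crystal
# Hypothetical stand-in for the idea behind Crystal::Program.
class ToyProgram
  getter types = [] of String

  # Registers a type name once, like declaring "class Foo; end"
  def declare(name : String)
    @types << name unless @types.includes?(name)
  end
end

first = ToyProgram.new
first.declare("Foo")

second = ToyProgram.new
second.declare("Bar")

p first.types
p second.types
```

With global state instead, the second snippet would also see “Foo”, and compiler tests would have to reset that state between cases.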

Show me the code

    # src/compiler/crystal/compiler.cr
    node = parse program, source # from source to Crystal::ASTNode

Show me the code

    # src/compiler/crystal/compiler.cr
    node = program.semantic node, @stats # Semantic! :-)

Semantic

● The entry point for semantic analysis is in src/compiler/crystal/semantic.cr
● Other files are in src/compiler/crystal/semantic/
● The file semantic.cr has comments that explain the overall algorithm :-)

Semantic: overall algorithm

● top level: declare classes, modules, macros, defs and other top-level stuff
● new methods: create `new` methods for every `initialize` method
● type declarations: process type declarations like `@x : Int32`
● check abstract defs: check that abstract defs are implemented
● class_vars_initializers: process initializers like `@@x = 1`
● instance_vars_initializers: process initializers like `@x = 1`
● main: process "main" code, calls and method bodies (the whole program)
● cleanup: remove dead code and other simplifications
● check recursive structs: check that structs are not recursive (impossible to codegen)

Semantic: overall algorithm

Note!

● This algorithm didn’t come from the Skies (nor from a textbook, nor from a paper)
● It’s not written in stone!
● It can definitely be improved: readability, performance, etc.

Semantic: overall algorithm

Note!

● It’s actually more like this…

Semantic

But before looking at each phase, we need to learn about the most useful pattern for analyzing an AST...

The Visitor pattern

    require "compiler/crystal/syntax"

    class SumVisitor < Crystal::Visitor
      getter sum = 0

      def visit(node : Crystal::NumberLiteral)
        @sum += node.value.to_i
      end

      def visit(node : Crystal::ASTNode)
        true # true: continue visiting children nodes
      end
    end

    ast = Crystal::Parser.parse("foo(1 + 2, 3, [4])")
    visitor = SumVisitor.new
    ast.accept(visitor)
    puts visitor.sum

The Visitor pattern

● We define a visit method for each node of interest
● We process the nodes
● We return true if we want to process children, false otherwise
● Example: if we only want to process class declarations, we could just define visit(node : Crystal::ClassDef) and define some logic there (and return true, because of nested class definitions)
● A visitor abstracts over the way nodes are composed
● ...though in many cases, for semantic purposes, we need and use the way a node is composed (for example, to analyze a call we need to know the argument types, so we check the arguments, not all children in a generic way)
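The class-declaration example from the list can be sketched with the same `Crystal::Visitor` and `Crystal::Parser.parse` API used by SumVisitor. `ClassNameVisitor` is a hypothetical name, and the sketch assumes that calling `to_s` on a ClassDef's `name` yields the class name as written in the source.

```crystal
require "compiler/crystal/syntax"

# Hypothetical visitor that only cares about class declarations.
class ClassNameVisitor < Crystal::Visitor
  getter names = [] of String

  def visit(node : Crystal::ClassDef)
    @names << node.name.to_s
    true # keep going: class definitions can be nested
  end

  def visit(node : Crystal::ASTNode)
    true # visit the children of every other node
  end
end

ast = Crystal::Parser.parse("class Foo; class Bar; end; end")
visitor = ClassNameVisitor.new
ast.accept(visitor)
p visitor.names
```

Returning false from the ClassDef overload instead would stop at “Foo” and never reach the nested “Bar”.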

Semantic: overall algorithm

● top level: declare classes, modules, macros, defs and other top-level stuff
● new methods
● type declarations
● check abstract defs
● class_vars_initializers
● instance_vars_initializers
● main
● cleanup
● check recursive structs

Top level: declare classes, modules, macros, defs...

    # src/compiler/crystal/semantic/top_level_visitor.cr
    class Crystal::TopLevelVisitor < Crystal::SemanticVisitor
      # ...
    end

Crystal::SemanticVisitor

● Located at semantic_visitor.cr
● This is a base visitor used in most of the phases of the semantic analysis
● It keeps track of the “current type”
● For example in “class Foo; class Bar; baz; end; end”, “current type” starts at the top-level (the Program). When “class Foo” is found, the current type becomes “Foo” (we search “Foo” in the current type). When “class Bar” is found, the current type becomes “Foo::Bar” (we search “Bar” in the current type). When “baz” is found, it will be looked up inside the current type.
● But initially there’s no “Foo” inside the current type (the Program). Who defines it? … The top-level visitor!

Crystal::TopLevelVisitor

● Located at top_level_visitor.cr
● Defines classes, methods, etc.
● Given “class Foo; class Bar; baz; end; end”...
● current_type starts at Program
● When “class Foo” is found (ClassDef), we check if “Foo” exists in the current type. If not, we create it. If it exists but as a different kind of type (for example, a module), we give an error.
● We attach this type “Foo” to the AST node ClassDef. SemanticVisitor will use this in every subsequent phase.
● … the “baz” call is not analyzed here (unless it’s a macro)

Crystal::TopLevelVisitor

● Many other things are done in this visitor: methods and macros are added to types, aliases and enums are defined, etc.
● Question: why are methods and macros defined at this phase?

When should macros be defined and expanded

    class Foo
      macro inherited
        do_something
      end

      macro do_something
        def foo; 1; end
      end
    end

    class Bar < Foo; end

    class Baz < Foo
      def foo; 3; end
    end

    puts Bar.new.foo # => 1
    puts Baz.new.foo # => 3

● The “inherited” macro hook must be processed as soon as “Bar < Foo” and “Baz < Foo” are found
● The macro expands to “do_something”, which must expand to “def foo; 1; end”
● This must happen before we continue processing Baz’s body: “def foo; 3; end” must win and be the method found when doing “Baz.new.foo”
● Conclusion: methods, macros and hooks must be defined in the first pass, when defining types. Additionally, macros might be looked up in types in this same pass (like “do_something”)
● SemanticVisitor takes care to look up and expand calls that resolve to macro calls

Method overloads

● Crystal methods are very powerful! For example: optional type restrictions, different number of arguments, default arguments, splat, etc.
● When methods are added to types we need to:
    ○ Know if a method replaces (redefines) an old method
    ○ Track whether a method is “stricter” than another method, to quickly know, given a call’s argument types, in which order the overloads are going to be tested

Method restrictions

    def foo(x : Int32)
      puts 1
    end

    def foo(x)
      puts 2
    end

    foo(1)
    foo('a')

● Given foo(1), both methods match it. However, the first overload should be invoked because it has a stronger restriction than the second overload.
● If we define the methods in a different order, it still works the same
● This is because an argument with a type restriction is stronger than one without one. We say that the first one is a restriction of the second one (we should probably rename this to use stronger)
● This applies to types too: Int32 is stronger than Int32 | String. And Bar is stronger than Foo, if Bar < Foo.
● Given two methods with the same name, if all arguments of a method are stronger than the other’s, the whole method is stronger and should come first. Each type stores an ordered list of methods indexed by method name, with this notion.
● If the methods are both stronger than each other, they have the same restriction.
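The two "stronger" rules for types (a type versus a union containing it, and a subclass versus its superclass) can be checked with a small program. `Animal`, `Dog` and `pick` are names made up for this sketch; the weaker overloads are defined first on purpose, since definition order should not matter.

```crystal
class Animal
end

class Dog < Animal
end

# Weaker restriction first, stronger second.
def pick(x : Int32 | String)
  "Int32 | String"
end

def pick(x : Int32)
  "Int32"
end

def pick(x : Animal)
  "Animal"
end

def pick(x : Dog)
  "Dog"
end

puts pick(1)          # Int32 is stronger than Int32 | String
puts pick("a")        # only the union overload matches a String
puts pick(Dog.new)    # Dog is stronger than Animal
puts pick(Animal.new) # only the Animal overload matches
```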

Method restrictions

● This logic is located at restrictions.cr
● A lot of cases to consider: generics, tuples, splats, etc.
● The code and algorithms could probably use a simpler, unified logic and a cleanup, but first all of these concepts and definitions must be defined much more formally

Semantic: overall algorithm

● top level
● new methods: create `new` methods for every `initialize` method
● type declarations
● check abstract defs
● class_vars_initializers
● instance_vars_initializers
● main
● cleanup
● check recursive structs

Semantic: new methods

● Located at new.cr
● TopLevelVisitor creates a `new` class method for every `initialize` method it finds (the logic for this is also in new.cr)
● Classes that end up without an `initialize` need a default, argless `self.new` method
● This phase is a bit messy right now because of some missing things related to generics…

    class Foo
      def initialize(x : Int32)
        @x = x
      end

      # Generated from the above
      def self.new(x : Int32)
        instance = allocate
        instance.initialize(x)
        if instance.responds_to?(:finalize)
          ::GC.add_finalizer(instance)
        end
        instance # return the newly created object
      end
    end

Semantic: overall algorithm

● top level
● new methods
● type declarations: process type declarations like `@x : Int32`
● check abstract defs
● class_vars_initializers
● instance_vars_initializers
● main
● cleanup
● check recursive structs

Semantic: type declarations

● Located at type_declaration_processor.cr (and type_declaration_visitor.cr and type_guess_visitor.cr)
● Combines info gathered by these two visitors to declare the type of instance and class variables
● TypeDeclarationVisitor deals with explicit type declarations
● TypeGuessVisitor tries to “guess” the type of instance and class variables without explicit type annotations (for example @x = 1 and @x = Foo.new)
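From the user's side, the split between explicit declarations and guesses looks roughly like this. `Tag` and `Point` are made-up classes; the comments describe which of the two visitors named above would be expected to supply each type.

```crystal
class Tag
end

class Point
  @x : Int32     # explicit declaration
  @y = 0         # no annotation: guessed as Int32 from the literal
  @tag = Tag.new # no annotation: guessed as Tag from `Tag.new`

  def initialize(@x)
  end

  # Exposes the types the compiler settled on for each variable
  def types
    {typeof(@x), typeof(@y), typeof(@tag)}
  end
end

p Point.new(1).types
```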

Semantic: overall algorithm

● top level
● new methods
● type declarations
● check abstract defs: check that abstract defs are implemented
● class_vars_initializers
● instance_vars_initializers
● main
● cleanup
● check recursive structs

Semantic: check abstract defs

● Located at abstract_def_checker.cr
● Not a visitor, but traverses all types, and for those that have abstract defs checks that subclasses (or including types) define those methods
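A small program that passes this check, covering both the subclass case and the included-module case. `Shape`, `Square`, `Named` and `Circle` are hypothetical names; deleting either implementation should make compilation fail in this phase.

```crystal
abstract class Shape
  abstract def area : Float64
end

class Square < Shape
  def initialize(@side : Float64)
  end

  # Required: Shape declares `area` as abstract
  def area : Float64
    @side * @side
  end
end

module Named
  abstract def name : String
end

class Circle
  include Named

  # Required: the included module declares `name` as abstract
  def name : String
    "circle"
  end
end

puts Square.new(2.0).area
puts Circle.new.name
```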