Gianluca Costa Introduction to Erlang http://gianlucacosta.info/

Part 1

The journey begins

Preface

Elegance always matters, especially when creating software: why should distributed, sophisticated real-time systems be an exception?

This introduction describes Erlang - its syntax and ideas - with no claim of completeness: for further information and details, please refer to the online documentation and books.

Special thanks

Special thanks to:

● Professor Paolo Bellavista, for his valuable advice and suggestions

● Francesco Cesarini and Simon Thompson, authors of the book «Erlang Programming» which, together with Erlang’s documentation, inspired this work

● Ericsson, for inventing such an interesting language and making it open source

Is something not C++ fast enough?

● «[...]if your target system is a high-level, concurrent, robust, soft real-time system that will scale in line with demand, make full use of multicore processors, and integrate with components written in other languages, Erlang should be your choice» (cit. “Erlang Programming”)

● Case studies revealed that Erlang systems proved to be more performant and stable than the equivalent C++ versions, especially under heavy loads

● In a range of contexts, the gist of the problem is finding smart algorithms, not focusing on bit-layer performances

● Erlang is a language designed to tackle real-world problems with minimalism and elegance

Brief history

● 1980s: researchers at Ericsson’s Computer Science Laboratory, after analyzing several languages, start creating a new one

– The first Erlang VM was Prolog-based, which explains a lot of similarities between Erlang and Prolog

● 1991: first release of the C-based Erlang VM: BEAM = Björn’s Erlang Abstract Machine

● 1992-1994: first commercial project written in Erlang

● 1996: introduction of the OTP (= Open Telecom Platform) framework

● 1998: Erlang becomes open source

Why Erlang?

Cross-platform Virtual Machine

Functional

Declarative

Pattern matching

Immutable structures

Higher-order functions

List manipulation

Optimized tail recursion

OTP middleware libraries

Per-process garbage collection

Lightweight processes

Advanced IPC

Code patching

Bit sequence patterns

Visual debugger

Visual process manager

Interactive shell

Minimalist

Open source

Interoperability with other ecosystems

Transparent SMP

Easy inter-node communication

Dynamic, but compiled

EDoc documentation

Installing Erlang

● On Ubuntu, the fastest way is running:

sudo apt-get install erlang

● On Windows, the easiest way is the setup wizard

To start the interactive shell, run: erl

To execute the compiler, run: erlc

In the shell, commands end with “.”, so multiline commands are supported

Part 2

Basic syntax

Language overview

● Erlang is a dynamically-typed language – in particular, you don’t need to declare variable types

● However, Erlang is a compiled language, therefore source files must be explicitly compiled – which prevents runtime syntax errors

● Erlang’s syntax and basic ideas are fairly similar to Prolog’s; one of the first and most important differences is that computation output is expressed by the return value of functions, not by parameters.

Integers

● Integers have no bound. Internally:

– By default, to enhance performance, they are stored in a word

– Longer integers are slower, but their size is only constrained by the available memory

● An integer constant can be defined as follows:

[+|-][<Base>#]Value

– Base can range from 2 to 36; if omitted, 10 is assumed

– Value is expressed via digits and, if Base > 10, via letters (A to F for base 16, up to Z for base 36)

● Examples: 8, -90, 16#EA3, -8#32

● Integer division: div (infix operator) → example: 13 div 5

● Integer remainder: rem (infix operator) → example: 13 rem 5

Floats

● Float = floating point number, as described by IEEE754-1985 standard

● Examples: 9.07, -1.0e12

● Usually slower than integers

● The division operator / always returns a float

● As usual, operations mixing integers and floats upcast their operands to float

● Integers and floats are classified as numbers
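The numeric notation and operators above can be tried out with a small module. This is only a sketch: the module name numbers_demo and the labels in the result list are made up for illustration.

```erlang
-module(numbers_demo).
-export([examples/0]).

%% Returns {Label, Value} pairs illustrating the numeric notation.
examples() ->
    [{hex, 16#EA3},            % base-16 literal
     {negative_octal, -8#32},  % base-8 literal with a sign
     {int_div, 13 div 5},      % integer division
     {int_rem, 13 rem 5},      % integer remainder
     {float_div, 13 / 5},      % / always yields a float
     {mixed, 2 + 1.5}].        % the integer operand is converted to float
```

After compiling with c(numbers_demo). in the shell, numbers_demo:examples() shows, for instance, that 13 div 5 is 2 while 13 / 5 is 2.6.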

ASCII characters

● There is not a char type: instead, you can express an integer using its ASCII character via a dedicated notation:

$<ASCII char|escape character>

Examples:

– $A is the integer value 65

– $\n is the integer value 10

● Consequently, on the shell, single characters are printed out as numbers

Atoms

● Atom = constant literal standing for itself

● To define an atom, just use it in code:

– Without quotes, it must match this regex: [a-z][A-Za-z0-9_@]*
Examples: true, ok, helloWorld, error@point

– Otherwise, it must be within single quotes
Examples: 'EXIT', 'spaces CAN be included'

● The memory footprint of an atom is constant, whatever its length

● Optimized for equality check and pattern matching

● Atoms are not garbage-collected: beware of memory leaks if you generate a lot of atoms over time via the dedicated built-in functions!

Boolean values

● true and false are just atoms – and they happen to be returned, for example, by comparison and logic operators

● Comparison operators:

– <, >, >=, =< (à la Prolog)

– ==, /= → equal/not equal, disregarding the type

– =:=, =/= → type-checking equal/not equal (efficient!)

● Logical operators: and, andalso (short-circuits), or, orelse (short-circuits), xor, not
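The difference between the two equality operators, and the effect of short-circuiting, can be sketched as follows. The module and function names are hypothetical, chosen only for this example.

```erlang
-module(compare_demo).
-export([loose/0, strict/0, ratio_above_one/2]).

%% == compares numbers by value, so an integer equals the matching float.
loose() -> 1 == 1.0.

%% =:= also requires the same type, so integer 1 differs from float 1.0.
strict() -> 1 =:= 1.0.

%% andalso short-circuits: the division is skipped when B is 0.
ratio_above_one(A, B) -> B =/= 0 andalso A / B > 1.
```

With the plain and operator, both operands would be evaluated, so ratio_above_one(5, 0) would fail with a badarith error instead of returning false.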

Guards

● Guard = expression returning true or false that can only contain:

– Bound variables and constants

– Arithmetic, comparison, logic, bitwise operators

– A few built-in functions

– Compatible preprocessor macros

● User-defined functions are not allowed, to prevent side-effects – because Erlang is functional, but not purely functional

● Guard expressions can be combined using:

– Comma (,), to join expressions using logic and

– Semicolon (;), to join expressions using logic or

● Should a runtime error occur while evaluating a guard, it is silently caught and the guard returns false
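A minimal sketch of these rules, using a hypothetical classify/1 function: the first clause combines tests with a comma, the second with a semicolon, and the third relies on a failing guard being treated as false.

```erlang
-module(guard_demo).
-export([classify/1]).

%% Comma means "and": both tests must hold.
classify(N) when is_integer(N), N > 0 -> positive_integer;
%% Semicolon means "or": either test is enough.
classify(N) when is_float(N); is_integer(N) -> other_number;
%% length/1 fails on a non-list; the failure just makes the guard false.
classify(L) when length(L) > 2 -> long_list;
classify(_) -> something_else.
```

Calling classify(foo) does not crash on length(foo): the third guard simply fails and the last clause is selected.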

Tuples

● Tuple = collection of items designed to be:

– Heterogeneous → you can mix elements of all types

– Immutable

– Arbitrarily nestable

– Fixed-size

● To define a tuple, just employ:

– {item1, item2, …, itemN}

– {} → empty tuple

● When the first item in a tuple is an atom, the atom is called tag and the tuple is tagged → tagged tuples are especially useful to:

– Exchange messages between processes

– Return multiple values from a function

● However, in lieu of tagged tuples, consider using records (see later)

Lists

● Like tuples, lists are heterogeneous, immutable, nestable collections of items

● Lists are also fixed-size, but they are meant to be processed – it is fairly common to obtain shorter or longer lists via manipulation

● To define a proper list:

– [item1, item2, …, itemN]

– [] → empty list

– [headItem1, headItem2, … | tail] → where tail is another proper list

List operations

● length(AList) returns the number of items in the given list

● ++ (infix) concatenates 2 lists. As usual, it is O(N) with respect to the length of the first list, so use it carefully; often, it is more efficient to prepend items to the result (which is O(1)) and reverse it in the end

● -- (infix) removes every item in the right-side list (at most once per occurrence) from the left-side list

● ++ and -- are right-associative – of course, you can force precedence via parentheses

● Lists support usual pattern matching, especially [Head | Tail] or the other constructors mentioned previously
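The prepend-then-reverse idiom mentioned above can be sketched like this. double_all and subtract are hypothetical names; lists:reverse/1 is the standard library function.

```erlang
-module(list_ops_demo).
-export([double_all/1, subtract/0]).

%% Builds the result by prepending (O(1) per item), then reverses once.
double_all(List) -> double_all(List, []).

double_all([], Acc) -> lists:reverse(Acc);
double_all([Head | Tail], Acc) -> double_all(Tail, [2 * Head | Acc]).

%% -- removes one occurrence per item on the right: one of the two 1s survives.
subtract() -> [1, 2, 1, 3] -- [1, 3].
```

Appending with Acc ++ [2 * Head] at each step would instead copy the accumulator every time, making the whole traversal quadratic.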

Strings

● Strings in Erlang are simply lists of integers

● There are two important bonus effects:

– Double-quoted strings are supported: "Hi" is a shortcut for [$H, $i], which in turn is a shortcut for [72, 105]

– Lists of integers that can be mapped to printable ASCII characters will be printed out as double-quoted strings by the Erlang shell

● Unlike atoms, the string length affects its memory footprint: when handling huge strings, you might consider using binaries instead

● The compiler automatically concatenates adjacent string literals as if they were one: "Pandas are" " cute animals" == "Pandas are cute animals". Never use ++ to concatenate string constants.

List comprehensions

● Concise notations to express a pipeline of functional map’s and filter’s on lists:

[itemTerm || generator|filter, generator|filter, …]

– itemTerm: expression denoting the generic item of the result list. Can combine variables provided by any generator

– generator: construct of the form pattern <- sourceList – most often, it will be Variable <- sourceList

– filter: a boolean expression that can employ any variable bound by generators to its left

● Only items satisfying both generator patterns and filters participate in the creation of the new list

● List comprehensions can be nested and can reference (or even shadow) variables in outer comprehensions

List comprehensions - Examples

Let L1 = [5, 8, 10, 29] and L2 = [9, 10, 4, 0, 73, 2]

Then:

● [X || X <- L1, X > 6] → [8, 10, 29]

● [{X, Y} || X <- L1, Y <- L2] is the cartesian product L1xL2 → [{5, 9}, {5, 10}, {5, 4}, …]

● [X + 3 - Y || X <- L1, Y <- L2, Y < (X + 5)] → in this case, the filter on Y references X
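The three comprehensions can be wrapped in a module to experiment with them. The module name lc_demo and the helper functions l1/0 and l2/0 are assumptions made for this sketch.

```erlang
-module(lc_demo).
-export([filtered/0, pairs/0, combined/0]).

%% The two source lists used in the examples.
l1() -> [5, 8, 10, 29].
l2() -> [9, 10, 4, 0, 73, 2].

%% One generator and one filter.
filtered() -> [X || X <- l1(), X > 6].

%% Two generators: the rightmost one varies fastest.
pairs() -> [{X, Y} || X <- l1(), Y <- l2()].

%% The filter on Y references X, bound by the generator to its left.
combined() -> [X + 3 - Y || X <- l1(), Y <- l2(), Y < (X + 5)].
```

pairs() yields 4 × 6 = 24 tuples, starting with {5, 9}; combined() starts with -1, because 5 + 3 - 9 is the first item passing the filter.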

References

● Reference = a term that is unique in an Erlang runtime system

● To create a reference, use make_ref(), usually assigning its return value to a variable

● References are especially employed in message passing, in particular to unambiguously identify the reply to a given request sent

Comments

● Erlang only supports single-line comments, starting with %

● Comments can also be parsed by dedicated tools providing, for example, autocompletion and smart suggestions

Variables

● Variables always start with an uppercase letter or an underscore

● Single-assignment principle: any variable can be assigned a value once only per scope

● Variables have no type declaration – Erlang is dynamically typed

● Assignment is just a special case of pattern matching – actually, = is called match operator

● Variables can be assigned once in the shell, too; however, one can use f() to unbind them all, or f(Var) to unbind just the given Var.

Pattern matching

● Similar to other functional languages:

Pattern = Term

where:

– Pattern can include variables (bound and unbound) and constant terms

– Term cannot include unbound variables

● Pattern and Term can have nested structures – which, of course, should match

● If matching succeeds, the unbound variables on the left get bound to the corresponding terms on the right

● There are also catch-all patterns:

– _ (the underscore, called anonymous variable) always matches but is never bound

– _Var → “don’t care” variables: they work like other variables, but the compiler won’t issue a warning if they are used in the pattern only. They are declared for the sake of clarity
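As an illustration of nested patterns and of the two catch-all forms, here is a small sketch; the point tuple layout and the function names are invented for the example.

```erlang
-module(match_demo).
-export([origin_distance/1, first_two/1]).

%% The head matches a tagged tuple containing a nested tuple; _Label is a
%% "don't care" variable kept only for readability.
origin_distance({point, _Label, {X, Y}}) ->
    math:sqrt(X * X + Y * Y).

%% = is the match operator: it binds A and B, while _ ignores the tail.
first_two(List) ->
    [A, B | _] = List,
    {A, B}.
```

If the term does not fit the pattern, for example first_two([1]), the match fails with a badmatch runtime error.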

Case

● case is a conditional construct based on pattern matching:

case expression of
    pattern1 [when guard1] -> expr1.1, …, expr1.N1;
    (…)
    patternM [when guardM] -> exprM.1, …, exprM.NM
end

● Branches are checked from top to bottom, and only the first branch matching expression (and having guard missing or evaluated to true) is selected

● If no pattern matches, a runtime error occurs; to prevent this, you can use a catch-all (an unbound variable such as Default -> or the wildcard _ ->)

● case returns the value of the last expression of the selected branch (so, you could have MyVar = case ...)

If

● if is a conditional construct based on guards:

if
    guard1 -> expr1.1, …, expr1.N1;
    (…)
    guardM -> exprM.1, …, exprM.NM
end

● The first guard, from top to bottom, evaluated to true determines the chosen branch, whose last expression becomes the overall value of the if construct

● If no branch is selected, a runtime error occurs; the catch-all guard is, of course, true ->
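The two conditional constructs can be compared side by side. In this sketch, describe/1 and sign/1 are hypothetical functions: the first one uses case with patterns and a guard, the second one uses if with guards only.

```erlang
-module(branch_demo).
-export([describe/1, sign/1]).

%% case matches the value against patterns; the last branch is a catch-all.
describe(Term) ->
    case Term of
        {ok, Value} when is_integer(Value) -> {integer, Value};
        {ok, Value} -> {other, Value};
        {error, _Reason} -> failed;
        _ -> unknown
    end.

%% if only evaluates guards; true -> is the catch-all branch.
sign(N) ->
    if
        N > 0 -> positive;
        N < 0 -> negative;
        true -> zero
    end.
```

Without the final _ -> branch, describe(42) would raise a case_clause error; without true ->, sign(0) would raise an if_clause error.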

Functions

● Erlang supports overloading based on arity (=number of parameters): functions with equal name but different arity are unrelated

● A function consists of clauses: a local function f having N parameters is referred to as f/N and is declared as follows:

f(arg1.1, …, arg1.N) [when guard1] ->
    expr1.1, …, expr1.M1;
f(arg2.1, …, arg2.N) [when guard2] ->
    expr2.1, …, expr2.M2;
f(argZ.1, …, argZ.N) [when guardZ] ->
    exprZ.1, …, exprZ.MZ.

● Clauses end with “;”, except the last one, ending with “.”

● Every argument can be an arbitrary pattern: so, unlike the case construct, function heads can match multiple values at once

Functions - Examples

● f(A) -> A + 1.

● f(A, B) -> A + B.

● factorial(0) -> 1;
factorial(N) when N > 0 -> N * factorial(N - 1).

→ this function has 2 clauses – and one of them employs a guard

● f({A, B}, C, D) ->
    Delta = C * D,
    {A + Delta, B + Delta}.

→ in this case, the first argument is a tuple, so binding is performed on its 2 variables. The result is a tuple as well

Clause order

● Clauses are checked from top to bottom: the first clause whose head (=pattern to the left of ->) matches the arguments - and whose guard is missing or true - is selected for execution

● If no head matches, a runtime error occurs

● Consequently, arguments are actually passed via pattern matching.

● Erlang performs call-by-value: all arguments are evaluated before calling a function

● The result of a function is the value of the last expression of the selected clause body (= sequence of expressions to the right of ->)

Recursion

● Functional languages like Erlang or Haskell do not provide destructive iteration constructs such as for, while, repeat

● Erlang is based on:

– Recursive functions

– Library functions hiding the destructive aspect of the accumulation process – such as foldl

Tail recursion

● In almost every functional language, tail recursion is given a dedicated optimization:

– Tail-recursive functions run like a C for loop, requiring constant memory, as no recursive calls are internally performed by the VM

● In Erlang, tail recursion is not necessarily better than non-tail recursion, as the latter is very optimized, too
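The three styles of iteration discussed so far can be sketched on the same task, summing a list. The function names are hypothetical; lists:foldl/3 is the standard library function.

```erlang
-module(sum_demo).
-export([sum/1, sum_tail/1, sum_fold/1]).

%% Body recursion: the addition happens after the recursive call returns.
sum([]) -> 0;
sum([H | T]) -> H + sum(T).

%% Tail recursion: the recursive call is the last operation, and the
%% partial result travels in an accumulator.
sum_tail(List) -> sum_tail(List, 0).

sum_tail([], Acc) -> Acc;
sum_tail([H | T], Acc) -> sum_tail(T, Acc + H).

%% The same accumulation hidden behind lists:foldl/3.
sum_fold(List) -> lists:foldl(fun(X, Acc) -> X + Acc end, 0, List).
```

All three return the same value; they differ only in how the intermediate state is carried.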

Modules

● Modules are .erl text files, whose structure is determined by attribute declarations:

➢ -module(moduleName): moduleName must be the file name without extension

➢ -vsn(version): the module’s version. If missing, a checksum is used

➢ -export([name/arity, name/arity, …]): list of functions that can be called from outside the module

➢ -import(modName, [name/arity, …]): functions exported by another module that can be called as if they were local functions

● Custom module attributes can be added – such as -author("AuthorName")

● Module:module_info/0 and /1 return metainfo for the given Module

Employing modules

● Unlike languages such as Python, modules must be explicitly compiled – for example, via the shell’s c(ModuleName) function or via the erlc compiler.

● Bytecode modules have .beam extension

● Erlang finds modules in a code path - similar to Java’s classpath – returned by code:get_path/0

● To fix problems exposed by Java’s classpath (especially scan time), Erlang can be started in embedded mode

● Directories can be prepended/appended when starting Erlang or anytime during the execution

Lambda functions

● Lambda function = anonymous function – therefore, it is generally assigned to a variable or passed as argument:

fun
    (arg1.1, …, arg1.N) [when guard1] -> expr1.1, …, expr1.M1;
    (…)
    (argK.1, …, argK.N) [when guardK] -> exprK.1, …, exprK.MK
end

● Lambda functions usually have just one clause; should they have more, all the heads must have the same arity

● A lambda function can reference bound variables of outer functions, defining a closure:

h(A) ->
    fun (B) -> A * B end.
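The closure h/1 above can be exercised as follows. The surrounding module closure_demo and the functions triple/1 and abs_fun/0 are additions made for this sketch; abs_fun/0 shows a lambda with two clauses and a guard.

```erlang
-module(closure_demo).
-export([h/1, triple/1, abs_fun/0]).

%% Returns a lambda that remembers the value of A.
h(A) ->
    fun (B) -> A * B end.

%% Binds the lambda to a variable, then calls it through the variable.
triple(X) ->
    TimesThree = h(3),
    TimesThree(X).

%% A lambda with two clauses of the same arity; the first one has a guard.
abs_fun() ->
    fun (N) when N < 0 -> -N;
        (N) -> N
    end.
```

The value captured by the closure is fixed when the lambda is created: h(3) always multiplies by 3, whatever happens later in the caller.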

Functional programming

● Erlang is functional: functions are not just static code blocks – they are values

● Therefore, functions can be assigned to variables, passed to and returned from other (higher-order) functions

● A lambda function can be assigned to a variable, but a standard function can be assigned as well:

– To assign function f/N from module mymod, just use MyFunction = fun mymod:f/N

– To assign a local function: MyFunction = fun f/N

● The lists module includes common functional utilities – map, filter, flatmap, foldl, all, any, ...
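A short sketch of functions used as values: a lambda passed to lists:filter/2, a local function reference, and a reference to a function of another module (here the BIF erlang:abs/1). The module and function names are hypothetical.

```erlang
-module(hof_demo).
-export([square/1, squares_of_evens/1, absolutes/1, all_positive/1]).

square(X) -> X * X.

%% A lambda as filter predicate, then a local function reference for map.
squares_of_evens(List) ->
    Evens = lists:filter(fun(X) -> X rem 2 =:= 0 end, List),
    lists:map(fun square/1, Evens).

%% A reference to a function exported by another module.
absolutes(List) ->
    lists:map(fun erlang:abs/1, List).

%% lists:all/2 checks that the predicate holds for every item.
all_positive(List) ->
    lists:all(fun(X) -> X > 0 end, List).
```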

BIFs

● BIFs = built-in functions, mainly belonging to the erlang module

● Most of them are auto-imported, so they can be called without qualifying them

● BIFs are often seen as part of the language:

– Conversion functions

– Basic list (hd/1, tl/1, length/1) and tuple (tuple_size/1) manipulation

– Date/time

– Process management and communication

Calling a function

● To call a local function or BIF f/N: f(arg1, …, argN)

● To call a function f/N exported by module mymod:

– mymod:f(arg1, …, argN) → fully-qualified call. In lieu of constants, mymod and f could be variables – to provide late binding. If you are used to Python, please note that modules do not need to be imported – but their compiled bytecode must be in the code search path.

– Add the module attribute -import(mymod, [f/N]), then call f as if it were local: f(…)

● Of course, fully-qualified calls are allowed when calling local functions, too → in this case, the ?MODULE preprocessor constant is handy

● apply(Module, Function, ArgumentsList) is a metaprogramming BIF similar to a fully-qualified call – but the arity itself can be unknown at the time of the call, if ArgumentsList is passed as a variable

● To call a lambda function, use (arg1, …, argN) after its expression (between parentheses) or after any variable bound to it

Records

● Records, similarly to Pascal, provide a structured way to access fields by name

● Records are usually declared in include files, via a dedicated attribute:

-record(recordType, {field1 [= defaultValue1], …, fieldN [= defaultValueN]}).

● Record fields have no type declaration, like variables

● The value of a record field can even be set to another record instance

● Records are internally translated into tagged tuples, but they provide a lot more flexibility should one need to add fields

Instantiating records

● To instantiate a record:

#recordType{fieldX = ValueX, …}

– Missing fields will be assigned their default value, or undefined if no default was declared

– Unlike tuple items, fields can be assigned in any order

– Usually, the instance is assigned to a variable: MyVar = #...

Accessing record values

● To access a single field:

RecordInstanceVar#recordType.fieldName

– Again, the expression is usually assigned to a variable

● To access multiple fields, use pattern matching:

#recordType{fieldX = VarA, fieldY = VarB} = RecordInstance

– will bind VarA to fieldX of RecordInstance and VarB to fieldY

● Record patterns are supported in other contexts, such as function heads or case and receive constructs

Copying records

● Records are immutable – but creating a copy of a record having different values for one or more fields is very simple and efficient:

NewRecord = RecordInstance#recordType{ fieldX = newValueX, fieldY = newValueY, …}

● NewRecord will be equal to RecordInstance, except the given fields
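The record operations described so far (declaration, instantiation, field access, pattern matching and copying) can be sketched together. The person record and every function name are assumptions made for the example; the record is declared in the module itself rather than in an include file, to keep the sketch self-contained.

```erlang
-module(record_demo).
-export([new/1, name/1, birthday/1, summary/1]).

%% age has a default value; city has none, so it defaults to undefined.
-record(person, {name, age = 0, city}).

%% Instantiation: only name is given explicitly.
new(Name) -> #person{name = Name}.

%% Single-field access.
name(P) -> P#person.name.

%% Copy with one field changed; P itself is left untouched.
birthday(P) -> P#person{age = P#person.age + 1}.

%% Record pattern in a function head, extracting two fields at once.
summary(#person{name = Name, age = Age}) -> {Name, Age}.
```

Since records are translated into tagged tuples, new("Ada") is the very same term as {person, "Ada", 0, undefined}.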

Records in the shell

● To read all the records defined in a file, use:

rr("fileName")

● To define a record in the shell, use:

rd(…)

which is the shell version of -record(...)

Bit strings

● Bit string = untyped chunk of bits, used for performance reasons or when handling low-layer protocols

● Bit notation: << Value1, Value2, …, ValueN >> where each value:

– Can be an integer or a string

– Can be followed by :size and/or /specifier1[-specifier2-...]

● Empty bit string: <<>>

● Bit strings also support bit string comprehensions, similar to list comprehensions.
Example: << <<(X+1)>> || X <- [1, 2, 3] >> → << 2, 3, 4 >>

Binaries

● Binary = a byte string, that is a bit string containing a number of bits evenly divisible by 8

● Any term can be converted to/from its binary representation via simple BIFs: term_to_binary/1 and binary_to_term/1

● With a list of values:

– term_to_binary/1 returns its binary representation

– list_to_binary/1 returns a binary whose items are the list items

Bit strings and pattern matching

● Bit strings, via bit notation with size declarations and specifiers, support very fine-grained pattern-matching

● Example: <<Higher:5, Lower:3>> = <<2#01010001:8/unsigned>>

– The original value is 2#01010001, that is 81

– Higher is bound to the bits 01010, that is 10

– Lower is bound to the bits 001, that is 1

● Please, refer to Erlang’s documentation for further details

Advanced bit pattern matching

● A striking trait is that size qualifiers can be expressed by variables previously bound in the very same pattern – which greatly simplifies frame and packet analysis via a single pattern.

● Example: << A:3, B:A, _/bits >> = << 2#1110101010101:13 >>

– A is bound to the 3 left-most bits: 2#111 = 7

– B is bound to the following 7 bits, as A is its size: 2#0101010 = 42

– _/bits consumes the remaining bits, ensuring the match
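Both bit-pattern examples can be run from a module. The third function is an extra sketch built on an invented frame layout (one length byte followed by that many payload bytes), meant only to show why variable sizes help when parsing packets.

```erlang
-module(bits_demo).
-export([split_byte/0, variable_size/0, parse_frame/1]).

%% Splits one byte into its 5 high bits and 3 low bits.
split_byte() ->
    <<Higher:5, Lower:3>> = <<2#01010001:8>>,
    {Higher, Lower}.

%% A is bound first, then reused as the size of B in the same pattern.
variable_size() ->
    <<A:3, B:A, Rest/bits>> = <<2#1110101010101:13>>,
    {A, B, bit_size(Rest)}.

%% Hypothetical frame: 1 length byte, then Len payload bytes, then the rest.
parse_frame(<<Len:8, Payload:Len/binary, Rest/binary>>) ->
    {Payload, Rest}.
```

variable_size() returns {7, 42, 3}: 3 + 7 bits are consumed by A and B, so 3 of the 13 bits are left for Rest.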

Bitwise operators

● Erlang defines bitwise infix operators:

– band

– bor

– bxor

– bnot

– bsl → shift left

– bsr → shift right

● Example: 2#1011 band 2#0010 == 2#0010

Writing to the console

● The io module provides utilities for writing to the console:

– nl/0: outputs a newline character

– write/1: outputs the given term – strings are printed as lists

– format/2: similar to C’s printf, with an Erlang-specific format string

● It also provides input utilities:

– read/1 reads a term from stdin, after showing the given prompt → returns {ok, InputTerm} on success

● io includes more functions – have a look at its EDoc

Preprocessor

● EPP = Erlang PreProcessor – fairly similar to C and C++

● To define a compile-time constant, use -define(CONSTANT_NAME, value).

● A constant can be injected via ?CONSTANT_NAME in the source code

– Predefined constants: ?MODULE, ?FILE, ?LINE, …

● -define can also create parametric macros – parametric textual replacements that, unlike functions, can be employed in guards

● -ifdef, -ifndef, -else, -endif enable conditional compilation

● Include files, often containing record definitions and constants:

– Should have .hrl extension

– Can be included via the attribute -include("includeFile.hrl")

Runtime errors

● Defensive programming = trying to foresee and catch every single runtime error. It is not frequent in Erlang: it is more likely to let a process crash, so that a dedicated process will choose what to do

● Common errors: badarith, function_clause, case_clause, if_clause, badmatch, undef

● throw/1 raises a throw carrying the given term (often an atom)

● try/catch and the old-fashioned catch can intercept runtime errors
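A minimal sketch of try/catch, handling both a runtime error and a throw. The module and function names are hypothetical; badarith is the error class listed among the common errors above.

```erlang
-module(errors_demo).
-export([safe_divide/2, lookup/2]).

%% Catches the badarith runtime error raised, for example, by a division by 0.
safe_divide(A, B) ->
    try A / B of
        Result -> {ok, Result}
    catch
        error:badarith -> {error, badarith}
    end.

%% Catches a throw raised by the local helper find/2.
lookup(Key, Pairs) ->
    try find(Key, Pairs)
    catch
        throw:not_found -> undefined
    end.

find(_Key, []) -> throw(not_found);
find(Key, [{Key, Value} | _]) -> Value;
find(Key, [_ | Rest]) -> find(Key, Rest).
```

In idiomatic Erlang such wrappers are used sparingly: as noted above, it is more common to let the process crash and have another process react.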

Benchmarking

● A useful function is:

timer:tc(Module, Function, ArgumentsList)

● It calls a function timing its execution, then returns a tuple:

{ExecutionTimeInMicroSeconds, FunctionResult}

Part 3

Parallelism in Erlang

Parallelism in Erlang

● Erlang processes are not operating system threads – they are much more lightweight, handled by the VM itself

● Every process has a process identifier (pid) and a dedicated mailbox

● Processes do not communicate via shared memory, but by sending messages to each other’s mailbox

Spawning a process

spawn(Module, Function, ArgumentsList)

● spawns a new process and makes it execute the given Function, exported by Module

● spawn/3 always returns a pid (= process identifier): if the given function does not exist or can’t be called, the new process will crash due to a runtime error, but the spawning process won’t know

● ArgumentsList is designed to initialize the new process – for example, passing information about the spawning process

● Processes can even be spawned from lambda functions, using spawn/1

Process lifetime

● A spawned process runs until:

– Its function reaches the end → but, very often, the body is a tail-recursive function, or functions composing a finite-state machine

– A runtime error occurs during its execution

● Multiple processes can run the very same code at once – after all, there are no global structures and everything is immutable (well, almost)

● Therefore, code structure and process dynamics are related but orthogonal dimensions

Basic process functions

● BIFs:

– spawn/3, spawn/1: spawns a new process

– self/0: returns the pid of the running process

– processes/0: returns the list of pids of the processes running on the current VM

● Shell functions:

– flush(): fetches and prints out all the messages from the shell’s mailbox

– i(): lists all the current processes

Registering processes

● Processes designed to act as global services, thus having longer lifetime, can be registered – that is, assigned an alias

● register(Alias, Pid) → Alias is an atom, Pid is a pid – for example, returned by spawn/3. If the alias is already registered, a runtime error occurs

● Any process can register any (self or other) process

● Usually, processes are registered with the very name of the module containing their related code

● registered/0 → list of registered aliases

● whereis/1 → pid of the given alias

● Processes are automatically unregistered on termination

● In the shell, regs() lists the registered processes

Sending messages

● The current process can send a message to any process:

TargetProcess ! Message

– TargetProcess is the pid / alias of the target process

– Message is any Erlang term

● Messages are appended to the mailbox of the target process

● The same message can be sent to many processes:

TargetProcess1 ! … ! TargetProcessN ! Message

Details on message passing

● Messages sent from one process to another are guaranteed to arrive in the order they were sent – of course, messages from different processes may well arrive interleaved

● Message passing is asynchronous:

– The sender immediately continues execution; for synchronous behavior, one must explicitly request an ack

– If the target is an invalid pid, or the target process does not exist, nothing happens

– Errors only occur when sending a message to an alias (for example, a non-registered alias)

● The sender‘s pid is not sent: if you want to pass it, add it to the message (after retrieving it via self/0)

● Tagged tuples or even atoms are common messages in simple situations

Receiving messages

● A process can check its mailbox for messages:

receive
    pattern1 [when guard1] -> expr1.1, …;
    (…)
    patternM [when guardM] -> exprM.1, …
[after Milliseconds|infinity -> timeoutExp1, …]
end

● The last pattern branch and the after branch must not include the trailing “;”

Details on receiving messages

● receive is synchronous; every time, it linearly scans the whole mailbox (in arrival order) and tries to sequentially match each scanned message with the patterns (in declaration order):

– If the scanned message matches a pattern – and the related guard is missing or true - the message is removed from the mailbox, receive stops scanning and the branch expressions are evaluated: the value of the last one becomes receive’s value

– If no message matches, the process is suspended, waiting for the arrival of a matching message

– If after is declared and no matching message arrives, the process is resumed after the given milliseconds and the after branch is evaluated

– after 0 means that receive won’t block after scanning the mailbox

– after infinity is the default behavior described – when after is missing
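Spawning, sending and receiving can be combined in a small echo server. This is only a sketch under assumptions: the module name echo_demo, the message shapes and the 1000 ms timeout are all invented for the example. The reference created by make_ref() lets the caller pick exactly the reply to its own request.

```erlang
-module(echo_demo).
-export([start/0, loop/0, call/2]).

%% Spawns the server; loop/0 must be exported for spawn/3 to call it.
start() -> spawn(echo_demo, loop, []).

%% Tail-recursive server loop: replies to requests until it gets stop.
loop() ->
    receive
        {From, Ref, Msg} ->
            From ! {Ref, Msg},
            loop();
        stop ->
            ok
    end.

%% Sends a request, then waits for the reply tagged with the same reference.
call(Server, Msg) ->
    Ref = make_ref(),
    Server ! {self(), Ref, Msg},
    receive
        {Ref, Reply} -> {ok, Reply}
    after 1000 ->
        {error, timeout}
    end.
```

In call/2, Ref is already bound when the receive is evaluated, so the pattern {Ref, Reply} only matches the reply carrying that very reference; any other message stays in the mailbox.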

Mailbox housekeeping

● As messages arrive at a process mailbox, they must be fetched by a receive – otherwise, they’ll cause:

– Growing memory footprint

– Slower reception time, as receive performs a linear scan

● Generally speaking, it is often a good idea to add a catch-all pattern to the main receive constructs, so as to remove unexpected messages: of course, it might be equally useful to log them, to trace their origin process and cause

● If you must receive N messages from N different processes in a precise order, you must not use a receive having N branches – the simplest correct solution consists in having N subsequent 1-branch receive constructs.

● When using timeouts, beware of stale messages due to previous requests (for example, to a server process) which timed out: you must find a way to flush them – for example, by introducing references and/or timestamps

Functional interfaces

● Most often, in Erlang, access to shared resources is ensured by sending request messages to dedicated processes

● However, it is good practice to hide message passing behind a functional interface – that is, a module whose functions perform the actual requests

● There are several advantages:

– Clients are unaware of the underlying protocol, which can be arbitrarily changed

– The actual location of the server can be changed as well

– Location transparency: in case of a remote server, the client will only notice longer response times and a higher percentage of failures

Creating functional interfaces

● In lieu of a direct message:

resource_service ! {request, Params}

a functional interface provides a function:

resource_service:request(Params)

where resource_service is a process alias in the former case and a module in the latter – they belong to different namespaces.

● There are 2 types of calls:

– Synchronous calls: the client expects a reply. The function is blocking, then usually returns

{ok, Result} or {error, Reason}

– Asynchronous calls: the client is not interested in the result, so the function immediately returns a value, usually ok
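A functional interface might look like the following sketch, where a registered process keeps a counter. Everything here is hypothetical (the counter_service name, the message protocol, the 1000 ms timeout): the point is only that clients call increment/0 and value/0 without ever seeing a message.

```erlang
-module(counter_service).
-export([start/0, increment/0, value/0, init/0]).

%% Spawns the server and registers it under the module's own name.
start() ->
    register(counter_service, spawn(counter_service, init, [])),
    ok.

%% Asynchronous call: sends the request and returns immediately.
increment() ->
    counter_service ! increment,
    ok.

%% Synchronous call: blocks until the reply arrives or the timeout expires.
value() ->
    Ref = make_ref(),
    counter_service ! {value, self(), Ref},
    receive
        {Ref, Value} -> {ok, Value}
    after 1000 ->
        {error, timeout}
    end.

init() -> loop(0).

%% The server state is the argument of the tail-recursive loop.
loop(N) ->
    receive
        increment ->
            loop(N + 1);
        {value, From, Ref} ->
            From ! {Ref, N},
            loop(N)
    end.
```

The protocol (the increment atom and the {value, From, Ref} tuple) could be changed freely without touching any client, as long as the exported functions keep their meaning.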

Parallelism issues

● Being functional and focused on immutability, Erlang is outstandingly more parallelism-oriented than other languages

● However, Erlang programs can still suffer from common issues:– Race conditions

– Deadlocks

– Starvation

● What’s more, Erlang still has mutable aspects – for example, the process alias registry in a VM

Linking processes

● link(Pid) → creates a bidirectional existential link between the current process and the one whose pid is Pid

● link expresses a mutual dependency between the linked processes

● spawn_link/3 atomically spawns and links

● unlink(Pid) is also available, but far less frequent than link

Details on process linking

● By default, if either linked process terminates:

– On normal termination, nothing happens

– On abnormal termination, the other process gets an exit signal {'EXIT', Pid, Reason} and crashes, propagating such signal to its remaining linked processes - after replacing Pid with its own pid

● However, if the non-terminating process – or any process in the propagation chain - had called process_flag(trap_exit, true), it just receives a message in the mailbox: {'EXIT', Pid, Reason} – where Reason can be normal or not.

● Consequently, a process can choose to respawn the crashed linked process, therefore acting as a supervisor – it is fairly common in Erlang to attach spawned processes to fine-grained supervisor trees

● For more info, please also refer to exit/1 and exit/2, as well as erlang:monitor/2
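Exit trapping can be sketched with a function that runs a lambda in a linked process and reports how it ended. The module name, the watch/1 function, the returned atoms and the 1000 ms timeout are assumptions made for the example; it uses spawn_link/1, the lambda-based variant of spawn_link/3.

```erlang
-module(link_demo).
-export([watch/1]).

%% Runs Fun in a linked process and turns its exit signal into a value.
watch(Fun) ->
    %% Trap exits, remembering the previous flag value to restore it later.
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(Fun),
    Result =
        receive
            {'EXIT', Pid, normal} -> finished;
            {'EXIT', Pid, Reason} -> {crashed, Reason}
        after 1000 ->
            still_running
        end,
    process_flag(trap_exit, Old),
    Result.
```

A supervisor-like process would follow the same scheme, but in a loop, spawning a replacement whenever a {'EXIT', Pid, Reason} message with an abnormal Reason shows up.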

Distributed systems

● Erlang node = an executing runtime system

● Alive node = Erlang node having a name – so that it can communicate with other nodes

● A name can usually be assigned on startup:

– Short name: erl -sname <short name> → all the nodes are in the same IP domain

– Long name: erl -name <long name> → nodes can reside in arbitrary IP domains

● node/0 → identifier of the current node, which must be used by other nodes for communication

● net_kernel:start/1 and net_kernel:stop/0 can also be used, as well as erlang:is_alive/0

Internode communication

● Nodes having short(/long) name can only communicate with nodes having short(/long) name

● Furthermore, they must both share the same atom called magic cookie, which can be set:

– When starting the VM: erl -setcookie <cookie>

– Programmatically, via erlang:set_cookie/1

– Writing the atom in the $HOME/.erlang.cookie file

● If none of the above options is chosen, a cookie file containing a random cookie is automatically created, so local nodes can automatically communicate with no setup

● To test inter-node communication, use net_adm:ping(Node), which returns pong on success and pang on failure

Spawning a remote process

spawn(Node, Module, Function, ArgumentsList)

● Spawns a process on the given Node. spawn_link/4 is also available

● Returns a pid → location transparency: when sending a message to a pid, such pid may reside on any node

● On the other hand, sending a message to a registered alias is not transparent:

{Alias, Node} ! Message

● As usual, message passing should be hidden behind functional interfaces

● When spawning a process, or when passing it a lambda function, all the referenced modules must be available in the target node's code path
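
The difference between addressing a pid and addressing a registered alias can be sketched as below. The module remote_echo and all its function names are hypothetical; passing node() as the target makes the example runnable on a single node, while any connected node holding this module in its code path could be used instead.

```erlang
-module(remote_echo).
-export([start/1, init/0, ask_pid/2, ask_alias/2]).

%% Spawns the echo process on Node; the returned pid is location-transparent.
start(Node) ->
    spawn(Node, ?MODULE, init, []).

%% Runs on the target node: registers the alias there, then serves requests.
init() ->
    register(remote_echo, self()),
    loop().

loop() ->
    receive
        {From, Msg} ->
            From ! {echo, Msg},
            loop();
        stop ->
            ok
    end.

%% Functional interface hiding the message passing: addressing by pid.
ask_pid(Pid, Msg) ->
    Pid ! {self(), Msg},
    wait_reply().

%% Addressing by alias: the node must be named explicitly.
ask_alias(Node, Msg) ->
    {remote_echo, Node} ! {self(), Msg},
    wait_reply().

wait_reply() ->
    receive
        {echo, Reply} -> {ok, Reply}
    after 1000 ->
        timeout
    end.
```

Note that the alias is registered by the spawned process itself, so a request sent by alias immediately after start/1 may arrive before the registration has happened.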

Node network

● Erlang nodes having the same name type (short/long) and sharing the same cookie get connected transparently and lazily, when one node references the other for the first time. Moreover, connections are transitive: each node also connects to every node already connected to the other → scalability issues

● To override such behaviour:

– erl -connect_all false → prevents transitive connections

– erl -hidden starts a hidden node – a node to which connections can only be done directly, not transitively

● To list the available nodes:

– nodes() → all the nodes connected to this node, except hidden nodes

– nodes(connected) → all the nodes connected to this node – including hidden nodes

– nodes(hidden) → all the hidden nodes connected to this node

● node/1 → node containing the given pid/reference/...

● To monitor a node: monitor_node(Node, EnableMonitoring)
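
The listing functions above can be combined into a single overview, as in this small sketch (cluster_view and summary/0 are hypothetical names). On a node with no connections, every list is empty.

```erlang
-module(cluster_view).
-export([summary/0]).

%% Hypothetical helper: collects the three node listings into one map.
summary() ->
    #{visible   => nodes(),            % connected nodes, hidden ones excluded
      hidden    => nodes(hidden),      % connected hidden nodes only
      connected => nodes(connected)}.  % all connected nodes
```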

RPC calls

● Calling a function residing on a connected node is simple:

rpc:call(Node, Module, Function, Arguments)

It returns:

– The function result on success

– {badrpc, Reason} in case of failure
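
A caller usually has to tell the two outcomes apart. The sketch below wraps rpc:call/4 in a hypothetical helper (rpc_demo:reverse_on/2 is an illustrative name) that runs lists:reverse/1 on the given node; node() can be passed as the target to try it without a second node.

```erlang
-module(rpc_demo).
-export([reverse_on/2]).

%% Hypothetical wrapper: reverses List on Node and normalizes the result.
reverse_on(Node, List) ->
    case rpc:call(Node, lists, reverse, [List]) of
        %% Unreachable node, or the remote function raised an exception.
        {badrpc, Reason} -> {error, Reason};
        Reversed         -> {ok, Reversed}
    end.
```

One caveat of this convention: a function that legitimately returns a {badrpc, _} tuple cannot be distinguished from a failed call.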

Hot code loading

● Whenever a different version (identified by the -vsn attribute) of a module is loaded - for example, by compiling it with compile:file/1 or the shell’s c/1, or by loading it via code:load_file/1:

1) the new module code is marked current

2) the module code used until now is marked old

3) the module code that was marked old is purged, and processes running its instructions are terminated

● After the update, processes spawned from a function in the old module code keep referencing the old code for local calls. Fully qualified calls (Module:Function(...)), instead, always use the new version of the function.

● To enable a process to auto-update to the latest version of its own module, it is important to make a fully-qualified tail call in its body: as soon as such call is reached (usually, after a receive), the process will call the new version of its body, thus switching to the new module code.
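
The fully-qualified tail call can be sketched with a simple counter process. The module name counter_loop and its messages are hypothetical; the relevant detail is that loop/1 is exported and called as ?MODULE:loop(...), so each iteration enters whichever version of the module is current.

```erlang
-module(counter_loop).
-export([start/0, loop/1]).

%% Spawns the counter process, starting from zero.
start() ->
    spawn(?MODULE, loop, [0]).

%% loop/1 must be exported, otherwise the fully qualified calls below fail.
loop(Count) ->
    receive
        increment ->
            %% Fully qualified tail call: switches to the newest loaded code.
            ?MODULE:loop(Count + 1);
        {get, From} ->
            From ! {count, Count},
            ?MODULE:loop(Count);
        stop ->
            ok
    end.
```

If the calls were written as plain loop(Count), the process would stay in the old code and would be terminated once that version is purged.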

OTP middleware

● Erlang includes a library of middleware templates:

– Library modules provide generic behavior, carefully crafted to support different scenarios and exception cases

– Client code must implement its specific behavior by providing callback modules exposing the interface required by the generic behavior

● For example, OTP provides:

– gen_server → generic server in a client/server relation

– supervisor → supervisor with fine-grained policies
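
To illustrate the split between generic behavior and callback module, here is a minimal gen_server sketch. The module counter_server and its API functions are hypothetical; gen_server supplies the process loop, and the module only provides the callbacks.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Client API (hypothetical names)
-export([start_link/0, increment/1, value/1, stop/1]).
%% Callbacks required by the gen_server behaviour
-export([init/1, handle_call/3, handle_cast/2]).

start_link()      -> gen_server:start_link(?MODULE, 0, []).
increment(Server) -> gen_server:cast(Server, increment).   % asynchronous
value(Server)     -> gen_server:call(Server, value).       % synchronous
stop(Server)      -> gen_server:stop(Server).

%% The server state is just the current count.
init(Initial) ->
    {ok, Initial}.

handle_call(value, _From, Count) ->
    {reply, Count, Count}.

handle_cast(increment, Count) ->
    {noreply, Count + 1}.
```

Clients only use the API functions, so the message passing stays hidden behind a functional interface.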

Part 4

Conclusion

The tip of the iceberg

● Erlang includes ETS and Dets, for in-memory and on-disk caching, as well as Mnesia, a distributed, soft real-time transactional database system packaged as an OTP application

● UDP and TCP packets can be received as Erlang messages via receive

● wxErlang is a GUI toolkit based on wxWidgets (formerly wxWindows), employing Erlang's mailbox system to distribute events

● A visual debugger and a visual process monitor are included

● Erlang evolves over time, introducing new constructs! For example, it now supports maps, which are somewhat similar to records, but more lightweight in terms of syntactic sugar

Final considerations

● Erlang is a mature but modern language, used in real-world and real-time scenarios; it’s also simple, minimalist and elegant, thanks to its functional nature and its well-crafted libraries

● Erlang is an ecosystem as well, fostering brilliant and didactic ideas - such as lightweight processes, message passing, linking and location transparency – which have influenced other languages

● Finally, it is open source and supported by a vibrant community! ^__^

Further references

● http://erlang.org → Official website

● https://twitter.com/erlang_org → Erlang on Twitter

● http://www.tryerlang.org/ → Hands-on tutorial

● Erlang Programming – book’s website

● Learn You Some Erlang for Great Good!

● Elixir - a functional language built on top of the Erlang VM

Thanks for your attention! ^__^