IBM PHP Runtime Team

© 2006 IBM Corporation

Understanding PHP Opcodes

Andy Wharmby

PHP Opcodes

Presentation splits into 3 sections

– Generation of opcodes

• ZEND_COMPILE

– Generation of the Interpreter code

• Interpreter comes in many flavours!!

– Execution of opcodes

• ZEND_EXECUTE

Execution path for a script

php_execute_script()

zend_execute_scripts()

zend_execute()

user call(function/method)

include/require

zend_compile_file()

zend_compile_file

Function pointer that can be overridden to call an alternative compiler, e.g. by the bcompiler extension

By default resolves to a call to compile_file() in zend_language_scanner.c

Compilation is broken down into 2 steps:

– Lexical analysis of source PHP script into tokens

– Parsing of resulting tokens into opcodes

PHP script → Lexical Analyser → tokens → Parser → byte codes

Lexical Analysis

Lexical analyser code in zend_language_scanner.c

– generated from zend_language_scanner.l using “flex”

Exposed to userspace as token_get_all()

<?php
$tokens = token_get_all("<?php echo 'Hello World'; ?>");
foreach ($tokens as $token) {
    if (is_array($token)) {
        printf("%s \t %s\n", token_name($token[0]), $token[1]);
    } else {
        printf("\t'%s'\n", $token);
    }
}
?>

Output of lexical analysis:

T_OPEN_TAG <?php
T_ECHO echo
T_WHITESPACE
T_CONSTANT_ENCAPSED_STRING 'Hello World'
';'
T_CLOSE_TAG ?>

Parsing

Next the tokens are compiled into opcodes

– Parser code in zend_language_parser.c, which is generated from zend_language_parser.y by “Bison”

– Calls code in zend_compile.c to generate opcodes

[Figure: token stream (T_OPEN_TAG, T_ECHO, T_WHITESPACE, T_CONSTANT_ENCAPSED_STRING, ';', T_CLOSE_TAG) → Parser → ZEND_OP, ZEND_OP, ZEND_OP, ZEND_OP]

Non-PHP statements

What does the compiler do with any non-PHP statements in the input script, e.g. HTML?

– All such statements are compiled into ECHO statements

– So at execution time the statements are just output as is

<!-- example for PHP 5.0.0 final release -->

<?php

$domain = "localhost";

$user = "root";#note "MIKE" is unacceptable

$password = "";

$conn = mysql_connect( $domain, $user, $password );

if($conn)

{

$msg = "Congratulations !!!! $user, You connected to MySQL";

}

?>

<html>

<head>

<title>Connecting user</title>

</head>

<body>

<h3>

<?php echo( $msg ); ?>

</h3>

</body>

</html>

Non-PHP statements

line    #  op                     fetch  ext  operands
-------------------------------------------------------------------------------
   3    0  ECHO                               '%3C%21--+example+for+PHP+5.0.0+final+release+--%3E%0D%0A%0D%0A'
   5    1  ASSIGN                             !0, 'localhost'
   6    2  ASSIGN                             !1, 'root'
   7    3  ASSIGN                             !2, ''
   9    4  INIT_FCALL_BY_NAME                 'mysql_connect'
…….. snip ….
       22  ADD_STRING                         ~5, ~5, 'MySQL'
       23  ASSIGN                             !4, ~5
  14   24  JMP                                ->25
  26   25  ECHO                               '%0D%0A%3Chtml%3E%0D%0A%0D%0A+%3Chead%3E%0D%0A++%3Ctitle%3EConnecting+user%3C%2Ftitle%3E%0D%0A+%3C%2Fhead%3E%0D%0A%0D%0A+%3Cbody%3E%0D%0A++%3Ch3%3E+%0D%0A+++'
       26  ECHO                               !4
  30   27  ECHO                               '+%0D%0A++%3C%2Fh3%3E%0D%0A+%3C%2Fbody%3E%0D%0A%0D%0A%3C%2Fhtml%3E'
       28  RETURN                             1
       29  ZEND_HANDLE_EXCEPTION

Opcodes

Each opcode consists of:
– Opcode handler
– 1 or 2 input operands
– Optional result operand
– Optional “Extended value”
• Meaning is opcode dependent, e.g. on a ZEND_CAST it defines the target type
– Line number in original source script
– Opcode. Range 0 - 151.
• All listed in zend_vm_opcodes.h
– Total size of each zend_op is 96 bytes

Some operations consist of 2 opcodes
– e.g. ZEND_ASSIGN_OBJ
– 2nd opcode set to ZEND_OP_DATA

struct _zend_op {
    opcode_handler_t handler;
    znode result;
    znode op1;
    znode op2;
    ulong extended_value;
    uint lineno;
    zend_uchar opcode;
};

znode

One for each operand and result

– each znode is 24 bytes

Type can be as follows:

– IS_CONST (0x1)
• program literal

– IS_TMP_VAR (0x2)
• temporary variable with no name
• intermediate result

– IS_VAR (0x4)
• temporary variable with a name
• defined in symbol table

– IS_UNUSED (0x8)
• operand not specified

– IS_CV (0x10)
• optimized version of VAR

For some opcodes the type of znode is implied, e.g. a JMP opcode’s op1 znode defines the jump target address in “jmp_addr”

EA defines “extended attributes”
– meanings are opcode dependent
– e.g. on a ZEND_UNSET_VAR it defines if variable is static or not

typedef struct _znode {
    int op_type;
    union {
        zval constant;
        zend_uint var;
        zend_uint opline_num;
        zend_op_array *op_array;
        zend_op *jmp_addr;
        struct {
            zend_uint var; /* dummy */
            zend_uint type;
        } EA;
    } u;
} znode;

zend_compile_file()

Returns a pointer to zend_op_array for global scope

– first, it's not an array but a structure

zend_op_array contains a pointer to an array of opcodes, plus much more including:

– pointer to array of compiled variable details. More on these later.

– count of number of temporaries (TMP + VAR) required by opcodes

• i.e. the number of Zend Engine registers used

– pointer to hash table for all statics defined by the function

• the Hashtable is created and populated by the compiler if needed

Compiler produces one zend_op_array for:

– global scope

• this is the one returned to caller of zend_compile_file and is saved in EG(active_op_array)

– each user function

• added to thread’s function table by compiler

– each user class method

• added to function table for class by compiler

zend_compile_file()

Initial opcode array allocated by init_op_array() in zend_opcode.c
– allocated from heap
– sufficient for just 64 opcodes

Reallocated by get_next_op() each time it is full
– Reallocates new array 4 times current size

Storage for opcode array freed by call to destroy_op_array() at request end
– For global scope called from zend_execute_scripts()
– For functions and methods called by Hash Table dtor routine. More later

struct _zend_op_array {
    …….
    zend_uint *refcount;
    zend_op *opcodes;
    zend_uint last, size;
    zend_compiled_variable *vars;
    int last_var, size_var;
    zend_uint T;
    zend_brk_cont_element *brk_cont_array;
    zend_uint last_brk_cont;
    zend_uint current_brk_cont;
    zend_try_catch_element *try_catch_array;
    int last_try_catch;
    /* static variables support */
    HashTable *static_variables;
    ….. etc.
};

[Figure: the “opcodes” field addresses an array of ZEND_OP entries]
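The allocation policy described on this slide (64 slots to start, four times the current size on each refill) can be sketched in plain C. This is only an illustration under stated assumptions: the `toy_` names are hypothetical, `malloc`/`realloc` stand in for the engine's own allocator, and the real `zend_op` is far larger than the one-byte stand-in used here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for zend_op; the real structure is much larger. */
typedef struct toy_op {
    unsigned char opcode;
} toy_op;

/* Hypothetical stand-in for the opcode fields of zend_op_array. */
typedef struct toy_op_array {
    toy_op *opcodes;    /* heap-allocated opcode storage */
    unsigned int last;  /* number of opcodes emitted so far */
    unsigned int size;  /* number of slots currently allocated */
} toy_op_array;

#define TOY_INITIAL_OPS_SIZE 64  /* initial capacity quoted on the slide */

/* Allocate the initial storage; returns 0 on success, -1 on failure. */
int toy_init_op_array(toy_op_array *oa)
{
    assert(oa != NULL);
    oa->opcodes = malloc(TOY_INITIAL_OPS_SIZE * sizeof(toy_op));
    oa->last = 0;
    oa->size = oa->opcodes ? TOY_INITIAL_OPS_SIZE : 0;
    return oa->opcodes ? 0 : -1;
}

/* Return the next free slot, quadrupling the storage when it is full. */
toy_op *toy_get_next_op(toy_op_array *oa)
{
    assert(oa != NULL);
    if (oa->size == 0) {
        return NULL;  /* never initialised, or already destroyed */
    }
    if (oa->last >= oa->size) {
        unsigned int new_size = oa->size * 4;
        toy_op *grown = realloc(oa->opcodes, new_size * sizeof(toy_op));
        if (grown == NULL) {
            return NULL;  /* the old storage is still valid */
        }
        oa->opcodes = grown;
        oa->size = new_size;
    }
    return &oa->opcodes[oa->last++];
}

/* Release the storage, the job destroy_op_array() does in the engine. */
void toy_destroy_op_array(toy_op_array *oa)
{
    assert(oa != NULL);
    free(oa->opcodes);
    oa->opcodes = NULL;
    oa->last = 0;
    oa->size = 0;
}
```

Because a refill may move the whole array, code that needs to remember an opcode while compiling is safer holding an index than a pointer, which fits with jump targets being stored as indexes until the second pass.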

Opcodes

Not all opcode information can be determined as opcodes are generated by the compiler, e.g. the target address for a JMP opcode. So after all opcodes are generated a 2nd pass is made over the opcode array to fill in the missing information:

– set target for all jump opcodes
• during compilation jump targets are opcode array indexes. These are changed to absolute addresses

– set opcode handler as defined by the executor generated:
• CALL: address of handler function
• GOTO: address of label
• SWITCH: identifier (int) for handling CASE block

– for any operands (op1 or op2) which are CONSTANTS modify zval to is_ref=1, refcount=2 to ensure zval copied

– trim opcode array to required size, i.e. free unused storage

– See pass_two() in zend_opcode.c
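The jump-target step of the second pass can be pictured with a small C sketch. It is not the code of pass_two(); the `toy_` types are hypothetical and only mirror the idea that the same union slot holds an index (`opline_num`) during compilation and an address (`jmp_addr`) afterwards.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_NOP = 0, TOY_JMP = 1 };

typedef struct toy_op toy_op;
struct toy_op {
    int opcode;
    union {
        unsigned int opline_num; /* index form, used while compiling */
        toy_op *jmp_addr;        /* absolute form, used by the executor */
    } target;
};

/* Second pass: rewrite every jump target from an array index to a pointer. */
void toy_pass_two(toy_op *opcodes, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (opcodes[i].opcode == TOY_JMP) {
            unsigned int index = opcodes[i].target.opline_num;
            assert(index < count);  /* a jump must land inside the array */
            opcodes[i].target.jmp_addr = &opcodes[index];
        }
    }
}
```

The rewrite is only safe once the array has reached its final size and location, which is one reason to leave it to a pass that runs after all opcodes have been generated.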

Functions and Classes

During MINIT 2 hash tables are built which the compiler uses

– GLOBAL_FUNCTION_TABLE
• Populated with names of all built-in functions, and functions defined by any enabled extensions

– GLOBAL_CLASS_TABLE
• Populated with default classes and any classes defined by enabled extensions

The compiler ADDs to both tables during the compile step
– Any new entries are then removed again at request shutdown

In a non-ZTS environment the compiler updates the GLOBAL hash tables

In a ZTS environment the GLOBAL tables are read-only
– A separate r/w copy of each table is created for each new thread and populated from the GLOBAL table in compiler_globals_ctor()

Function tables

One function table per thread
– Address stored in executor globals (EG)

Function table is a Hashtable mapping function name to “zend_function”
– zend_function structure is itself a union of structures

Populated with built-in functions, extension functions etc. by copying GLOBAL_FUNCTION_TABLE built during MINIT in compiler_globals_ctor()
– type == ZEND_INTERNAL_FUNCTION
– zend_function == zend_internal_function

During each request functions defined by user script are added to function table at compile time
– type == ZEND_USER_FUNCTION
– zend_function == zend_op_array

typedef union _zend_function {
    zend_uchar type;
    struct {
        zend_uchar type; /* never used */
        char *function_name;
        zend_class_entry *scope;
        ………. <SNIP> ………….
        zend_bool pass_rest_by_reference;
        unsigned char return_reference;
    } common;
    zend_op_array op_array;
    zend_internal_function internal_function;
} zend_function;

State of play after compile complete

<?php
function add5($a) { return $a + 5; }
function sub5($a) { return $a - 5; }
$a = add5(10);
$b = sub5(15);
?>

[Figure: state after compiling the script above]

executor_globals
– active_op_array → zend_op_array (op_array for global scope) → opcodes
– symbol_table, active_symbol_table → symbol_table (GLOBALS, _ENV, HTTP_ENV_VARS, …….)
– function_table → function_table
• <internal func> → zend_internal_function
• <internal func> → zend_internal_function
• …….
• add5 → zend_op_array
• sub5 → zend_op_array
– class_table

Function tables

User entries are removed from the global function table during RSHUTDOWN processing by a call to shutdown_executor()

– As user function entries are added after all internal functions, the code uses the zend_hash_reverse_apply() function to traverse the thread's function table entries backwards, removing entries until type != ZEND_USER_FUNCTION

– Removal triggers the HT dtor routine ZEND_FUNCTION_DTOR, which in turn calls destroy_op_array() to free the opcode array and other structures which hang off zend_op_array
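The backwards walk works because user functions always sit after the internal ones. A minimal C sketch of that idea follows, using a plain array in place of the engine's hash table; the `toy_` names and the two-value type enum are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

enum toy_func_type { TOY_INTERNAL_FUNCTION = 1, TOY_USER_FUNCTION = 2 };

typedef struct toy_function {
    enum toy_func_type type;
    const char *name;
} toy_function;

/* Walk the table backwards, dropping user functions, and stop at the first
 * internal function. Returns the number of entries that remain. */
size_t toy_remove_user_functions(const toy_function *table, size_t count)
{
    assert(table != NULL || count == 0);
    while (count > 0 && table[count - 1].type == TOY_USER_FUNCTION) {
        /* In the engine, the hash table destructor would free the
         * function's op array at this point. */
        count--;
    }
    return count;
}
```

Stopping at the first internal entry means the cost of the cleanup depends only on how many user functions the request defined, not on the size of the whole table.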

Class table

One class table per thread

– Address stored in executor globals (EG)

Class table is a Hashtable mapping class name to “zend_class_entry”

Populated with default classes and extension defined classes by copying GLOBAL_CLASS_TABLE built during MINIT in compiler_globals_ctor()

During each request classes defined by user script are added to class table at compile time

– Each class has its own function table and compiler adds an entry for each method defined by a class

State of play after compile complete :

<?php
class Dog
{
    function bark() { print "Woof!"; }
    function sit() { print "Sit!!"; }
}
$pooch = new Dog;
$pooch->bark();
$pooch->sit();
?>

[Figure: state after compiling the script above]

executor_globals
– active_op_array → zend_op_array → opcodes
– symbol_table, active_symbol_table → symbol_table (GLOBALS, _ENV, HTTP_ENV_VARS, …….)
– function_table → function_table
• <internal func> → zend_internal_function
• <internal func> → zend_internal_function
• …….
– class_table → class_table
• <internal class> → function_table (<internal method>, <internal method>, …….) → zend_internal_function
• Dog → function_table
  bark → zend_op_array
  sit → zend_op_array
• …….

Class table

User class entries are removed by shutdown_executor() by traversing the thread's class table backwards, removing all entries until type != ZEND_USER_CLASS

– Removal triggers the HT dtor routine ZEND_CLASS_DTOR, which in turn calls destroy_zend_class()

– destroy_zend_class() calls zend_hash_destroy() on the class’s function_table, which walks the HT and calls dtor ZEND_FUNCTION_DTOR on each entry as described earlier

Static variables

Local scope but value retained across calls

Hashtable allocated by compiler per function or method when first static variable defined

– Referenced by zend_op_array structure

Statics added to Hashtable as found by compiler

Examining compile results

Two tools available for analysing results of compile

– VLD

– Parsekit

Both available from PECL

VLD

Dumps opcodes for a given PHP script

– Written by Derick Rethans

Download from PECL

– http://pecl.php.net/package/vld/0.8.0

Simple configuration

– --enable-vld[=shared]

Invoked via command line switches

– php -dvld.active=[0|1] -dvld.execute=[0|1] -f <php script>

– Can override defaults in php.ini

VLD

No config.w32 file for Windows

ARG_ENABLE("vld", "Enable Vulcan Opcode decoder", "no");

if (PHP_VLD != "no") {

EXTENSION("vld", "vld.c srm_oparray.c");

}

VLD output

<?php $a = 5; $b = 10; $c = $a + $b + 99; echo "c= $c \n"; ?>

php -f test.php -dvld.active=1

line    #  op                     fetch  ext  operands
--------------------------------------------------------------------------
   2    0  ECHO                               '%0A'
   4    1  ASSIGN                             !0, 5
   5    2  ASSIGN                             !1, 10
   6    3  ADD                                ~2, !0, !1
        4  ADD                                ~3, ~2, 99
        5  ASSIGN                             !2, ~3
   8    6  INIT_STRING                        ~5
        7  ADD_STRING                         ~5, ~5, 'c'
        8  ADD_STRING                         ~5, ~5, '%3D+'
        9  ADD_VAR                            ~5, ~5, !2
       10  ADD_STRING                         ~5, ~5, '+'
       11  ADD_CHAR                           ~5, ~5, 10
       12  ECHO                               ~5
  11   13  RETURN                             1
       14  ZEND_HANDLE_EXCEPTION

c= 114

KEY

! == compiler variable

$ == variable

~ == temporary

There are TMPs defined for results here but they are not used and VLD does not list them

Why all these “+” in VLD output for CONST’s ?

<?php

echo "Hello World";

echo "Hello                                World";

echo "Hello + World";

?>

line # op fetch ext operands

-------------------------------------------------------------------------------

2 0 ECHO '%0D%0A+'

4 1 ECHO 'Hello+World'

5 2 ECHO 'Hello++++++++++++++++++++++++++++++++World'

6 3 ECHO 'Hello+%2B+World'

9 4 RETURN 1

5 ZEND_HANDLE_EXCEPTION

Answer: VLD calls php_url_encode() on the CONST to format it before output which amongst other things converts all spaces to “+”. Internally white space is stored as 0x20 as you would expect.
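To see why the constants look the way they do, here is a small C sketch of form-style URL encoding. It is not the source of php_url_encode(); `toy_url_encode` is a hypothetical helper, and the set of characters it leaves untouched (letters, digits, `-`, `_`, `.`) is an assumption chosen so that it reproduces the strings in the listing above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Encode src into dst, whose capacity is dst_size bytes including the
 * terminator. Returns 0 on success, -1 if dst is too small. */
int toy_url_encode(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char) *src;
        if (c == ' ') {
            if (out + 1 >= dst_size) return -1;
            dst[out++] = '+';              /* space becomes '+' */
        } else if (isalnum(c) || c == '-' || c == '_' || c == '.') {
            if (out + 1 >= dst_size) return -1;
            dst[out++] = (char) c;         /* safe character, copied as is */
        } else {
            if (out + 3 >= dst_size) return -1;
            dst[out++] = '%';              /* anything else becomes %XX */
            dst[out++] = hex[c >> 4];
            dst[out++] = hex[c & 0x0F];
        }
    }
    if (dst_size == 0) return -1;
    dst[out] = '\0';
    return 0;
}
```

With this scheme a literal `+` in the source has to become `%2B`, otherwise it could not be told apart from an encoded space, which is what the third ECHO in the listing shows.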

parsekit

PHP opcode analyser written by Sara Golemon

– meant for development and debug only; some code not thread safe

Download from PECL

– http://pecl.php.net/package/parsekit

Simple configuration

– --enable-parsekit[=shared]

Implements 5 functions

– parsekit_compile_string

– parsekit_compile_file

– parsekit_func_arginfo

– parsekit_opcode_flags

– parsekit_opcode_name

parsekit

array parsekit_compile_string ( string phpcode [, array &errors [, int options]] )
– compiles and then analyzes supplied string
• errors: 2 dimensional array of errors encountered during compile
– example of use in parsekit/examples
• options: either PARSEKIT_SIMPLE or PARSEKIT_QUIET
– PARSEKIT_QUIET results in more verbose output

array parsekit_compile_file ( string filename [, array &errors [, int options]] )
– As above but takes name of a .php file as input

array parsekit_func_arginfo ( mixed function )
– Return the arg_info data for a given user defined function/method

long parsekit_opcode_flags ( long opcode )
– Return flags which define return type, operand types etc for an opcode

string parsekit_opcode_name ( long opcode )
– Return name for given opcode

parsekit_compile_string: SIMPLE output

<?php
$oparray = parsekit_compile_string('echo "HelloWorld";', $errors, PARSEKIT_SIMPLE);
var_dump($oparray);
?>

array(5) {
  [0]=> string(36) "ZEND_ECHO UNUSED 'HelloWorld' UNUSED"
  [1]=> string(30) "ZEND_RETURN UNUSED NULL UNUSED"
  [2]=> string(42) "ZEND_HANDLE_EXCEPTION UNUSED UNUSED UNUSED"
  ["function_table"]=> NULL
  ["class_table"]=> NULL
}

parsekit_compile_file: QUIET output

<?php
$oparray = parsekit_compile_string('echo "HelloWorld";', $errors, PARSEKIT_QUIET);
var_dump($oparray);
?>

array(20) {
  ["type"]=> int(2)
  ["type_name"]=> string(18) "ZEND_USER_FUNCTION"
  ["fn_flags"]=> int(0)
  ["num_args"]=> int(0)
  ["required_num_args"]=> int(0)
  ["pass_rest_by_reference"]=> bool(false)
  ["uses_this"]=> bool(false)
  ["line_start"]=> int(0)
  ["line_end"]=> int(0)
  ["return_reference"]=> bool(false)
  ["refcount"]=> int(1)
  ["last"]=> int(3)
  ["size"]=> int(3)
  ["T"]=> int(0)
  ["last_brk_cont"]=> int(0)
  ["current_brk_cont"]=> int(-1)
  ["backpatch_count"]=> int(0)
  ["done_pass_two"]=> bool(true)
  ["filename"]=> string(49) "C:\Testcases\helloWorld.php"
  ["opcodes"]=> array(3) {
    [0]=> array(5) {
      ["opcode"]=> int(40)
      ["opcode_name"]=> string(9) "ZEND_ECHO"
      ["flags"]=> int(768)
      ["op1"]=> array(3) {
        ["type"]=> int(1)
        ["type_name"]=> string(8) "IS_CONST"
        ["constant"]=> &string(11) "Hello World"
      }
      ["lineno"]=> int(3)
      etc…..

parsekit_func_arginfo

<?php
function foo ($a, stdClass $b, &$c) {
}
$oparray = parsekit_func_arginfo('foo');
var_dump($oparray);
?>

array(3) {
  [0]=> array(3) {
    ["name"]=> string(1) "a"
    ["allow_null"]=> bool(true)
    ["pass_by_reference"]=> bool(false)
  }
  [1]=> array(4) {
    ["name"]=> string(1) "b"
    ["class_name"]=> string(8) "stdClass"
    ["allow_null"]=> bool(false)
    ["pass_by_reference"]=> bool(false)
  }
  [2]=> array(3) {
    ["name"]=> string(1) "c"
    ["allow_null"]=> bool(true)
    ["pass_by_reference"]=> bool(true)
  }
}

parsekit_opcode_name and parsekit_opcode_flags

<?php

$opname = parsekit_opcode_name (61);

var_dump($opname);

?>

string(21) "ZEND_DO_FCALL_BY_NAME"

<?php

$opflags = parsekit_opcode_flags (61);

var_dump($opflags);

?>

int(16777218)

flags define whether opcode takes op1 and op2,

defines EA, sets a result etc

Execution path for a script

php_execute_script()

zend_execute_scripts()

zend_execute()

user call(function/method)

include/require

zend_compile_file()

PHP Interpreter

Can be generated in many flavours

– 12 different versions possible

Generated by a chunk of PHP code; zend_vm_gen.php

– You need to understand regular expressions before attempting to read this code

Interpreter generated from

– definition of each opcode in zend_vm_def.h, and

– skeletal interpreter body in zend_vm_execute.skl

Interpreter generation process

[Figure: zend_vm_gen.php reads zend_vm_def.h and zend_vm_execute.skl and generates zend_vm_execute.h and zend_vm_opcodes.h]

zend_vm_execute.skl

{%DEFINES%}

ZEND_API void {%EXECUTOR_NAME%}(zend_op_array *op_array TSRMLS_DC)

{

zend_execute_data execute_data;

{%HELPER_VARS%}

{%INTERNAL_LABELS%}

if (EG(exception)) {

return;

}

/* Initialize execute_data */

EX(fbc) = NULL;

EX(object) = NULL;

EX(old_error_reporting) = NULL;

if (op_array->T < TEMP_VAR_STACK_LIMIT) {

EX(Ts) = (temp_variable *) do_alloca(sizeof(temp_variable) * op_array->T);

} else {

EX(Ts) = (temp_variable *) safe_emalloc(sizeof(temp_variable), op_array->T, 0);

}

…… etc

The {%…%} markers are triggers for zend_vm_gen.php to insert generated code

zend_vm_def.h

ZEND_VM_HANDLER(1, ZEND_ADD, CONST|TMP|VAR|CV, CONST|TMP|VAR|CV)

{

zend_op *opline = EX(opline);

zend_free_op free_op1, free_op2;

add_function(&EX_T(opline->result.u.var).tmp_var,

GET_OP1_ZVAL_PTR(BP_VAR_R),

GET_OP2_ZVAL_PTR(BP_VAR_R) TSRMLS_CC);

FREE_OP1();

FREE_OP2();

ZEND_VM_NEXT_OPCODE();

}

Annotations on the ZEND_VM_HANDLER arguments: opcode, opcode name, types accepted for op1, types accepted for op2

helper function .. although this is just a macro!

triggers to PHP code to replace text

Interpreter generation process

Usage information:

php zend_vm_gen.php [options]

Options:

--with-vm-kind=CALL|SWITCH|GOTO - select threading model (default is CALL)

--without-specializer - disable executor specialization

--with-old-executor - enable old executor

--with-lines - enable #line directives

--with-vm-kind defines execution method
• CALL: Each opcode handler is defined as a function
• SWITCH: Each opcode handler is a case block in one huge switch statement
• GOTO: Label defined for each opcode handler

--without-specializer means only one handler per opcode
• With specializers a handler is generated for each possible combination of operand types
• A reported 20% speedup with specializers enabled over old executor
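The difference between the CALL and SWITCH kinds is easiest to see on a toy interpreter. The sketch below is not Zend code: it is a hypothetical three-instruction stack machine (the Zend Engine uses operands and temporaries rather than a stack), written twice so the two dispatch styles can be compared. The GOTO kind relies on the compiler's "labels as values" extension and is left out to keep the example portable.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_PUSH = 0, TOY_ADD = 1, TOY_HALT = 2 };

typedef struct toy_insn {
    int opcode;
    int operand;
} toy_insn;

typedef struct toy_vm {
    int stack[16];
    int sp;              /* number of values on the stack */
    const toy_insn *ip;  /* current instruction */
    int running;
} toy_vm;

/* CALL style: one C function per opcode, reached through a table. */
typedef void (*toy_handler)(toy_vm *vm);

static void toy_push_handler(toy_vm *vm)
{
    assert(vm->sp < 16);
    vm->stack[vm->sp++] = vm->ip->operand;
    vm->ip++;
}

static void toy_add_handler(toy_vm *vm)
{
    assert(vm->sp >= 2);
    vm->sp--;
    vm->stack[vm->sp - 1] += vm->stack[vm->sp];
    vm->ip++;
}

static void toy_halt_handler(toy_vm *vm)
{
    vm->running = 0;
}

static const toy_handler toy_handlers[] = {
    toy_push_handler, toy_add_handler, toy_halt_handler
};

int toy_run_call(const toy_insn *code)
{
    toy_vm vm = { {0}, 0, code, 1 };
    while (vm.running) {
        assert(vm.ip->opcode >= TOY_PUSH && vm.ip->opcode <= TOY_HALT);
        toy_handlers[vm.ip->opcode](&vm);
    }
    return vm.sp > 0 ? vm.stack[vm.sp - 1] : 0;
}

/* SWITCH style: every handler is a case block in one large switch. */
int toy_run_switch(const toy_insn *code)
{
    toy_vm vm = { {0}, 0, code, 1 };
    while (vm.running) {
        switch (vm.ip->opcode) {
        case TOY_PUSH:
            assert(vm.sp < 16);
            vm.stack[vm.sp++] = vm.ip->operand;
            vm.ip++;
            break;
        case TOY_ADD:
            assert(vm.sp >= 2);
            vm.sp--;
            vm.stack[vm.sp - 1] += vm.stack[vm.sp];
            vm.ip++;
            break;
        default: /* TOY_HALT and anything unknown stop the machine */
            vm.running = 0;
            break;
        }
    }
    return vm.sp > 0 ? vm.stack[vm.sp - 1] : 0;
}
```

In the CALL variant the value stored with each instruction could just as well be the function pointer itself, which is the arrangement the deck describes for the handler field of each opcode.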

Interpreter generation process

--with-old-executor enables runtime decision to call old pre-ZE2 type executor, which is a CALL type executor with no specializers
– zend_vm_use_old_executor() defined to switch executor model
– no current callers though

--with-lines results in addition of #line directives to generated zend_vm_execute.h

#line 28 "C:\PHPDEV\php5.2-200612111130\Zend\zend_vm_def.h"

static int ZEND_ADD_SPEC_CONST_CONST_HANDLER(ZEND_OPCODE_HANDLER_ARGS)

{

zend_op *opline = EX(opline);

add_function(&EX_T(opline->result.u.var).tmp_var,

…. etc

The default interpreter which is checked into CVS is generated as follows:

php zend_vm_gen.php --with-vm-kind=CALL

Specialization

With specialization enabled a handler is generated for each valid combination of input operand types

– Each input operand (op1 and op2) can take 1 of 5 types

• TMP
• VAR
• CV
• CONST
• UNUSED

– This gives a theoretical 25 opcode handlers for each opcode

zend_vm_def.h

ZEND_VM_HANDLER(1, ZEND_ADD, CONST|TMP|VAR|CV, CONST|TMP|VAR|CV)

{

zend_op *opline = EX(opline);

zend_free_op free_op1, free_op2;

add_function(&EX_T(opline->result.u.var).tmp_var,

GET_OP1_ZVAL_PTR(BP_VAR_R),

GET_OP2_ZVAL_PTR(BP_VAR_R) TSRMLS_CC);

FREE_OP1();

FREE_OP2();

ZEND_VM_NEXT_OPCODE();

}

ZEND_ADD without specialization

static int ZEND_ADD_HANDLER(ZEND_OPCODE_HANDLER_ARGS)

{

zend_op *opline = EX(opline);

zend_free_op free_op1, free_op2;

add_function(&EX_T(opline->result.u.var).tmp_var,

get_zval_ptr(&opline->op1, EX(Ts), &free_op1, BP_VAR_R),

get_zval_ptr(&opline->op2, EX(Ts), &free_op2, BP_VAR_R)

TSRMLS_CC);

FREE_OP(free_op1);

FREE_OP(free_op2);

ZEND_VM_NEXT_OPCODE();

}

Handler calls non-type specific routines to get zval * for op1 and op2

ZEND_ADD with specialization

static int ZEND_ADD_SPEC_CONST_CONST_HANDLER

(ZEND_OPCODE_HANDLER_ARGS)

{

zend_op *opline = EX(opline);

add_function(&EX_T(opline->result.u.var).tmp_var,

&opline->op1.u.constant,

&opline->op2.u.constant TSRMLS_CC);

ZEND_VM_NEXT_OPCODE();

}

static int ZEND_ADD_SPEC_CONST_TMP_HANDLER

(ZEND_OPCODE_HANDLER_ARGS)

{

zend_op *opline = EX(opline);

zend_free_op free_op2;

add_function(&EX_T(opline->result.u.var).tmp_var,

&opline->op1.u.constant,

_get_zval_ptr_tmp(&opline->op2, EX(Ts), &free_op2 TSRMLS_CC)

TSRMLS_CC);

zval_dtor(free_op2.var);

ZEND_VM_NEXT_OPCODE();

}

static int ZEND_ADD_SPEC_CONST_VAR_HANDLER

(ZEND_OPCODE_HANDLER_ARGS)

{

zend_op *opline = EX(opline);

zend_free_op free_op2;

add_function(&EX_T(opline->result.u.var).tmp_var,

&opline->op1.u.constant,

_get_zval_ptr_var(&opline->op2, EX(Ts), &free_op2 TSRMLS_CC)

TSRMLS_CC);

if (free_op2.var) {zval_ptr_dtor(&free_op2.var);};

ZEND_VM_NEXT_OPCODE();

}

…. and 13 other handlers

Handlers call type specific routines to get zval * for op1 and op2

zend_vm_gen.php

$op1_get_zval_ptr = array(
    "ANY"    => "get_zval_ptr(&opline->op1, EX(Ts), &free_op1, \\1)",
    "TMP"    => "_get_zval_ptr_tmp(&opline->op1, EX(Ts), &free_op1 TSRMLS_CC)",
    "VAR"    => "_get_zval_ptr_var(&opline->op1, EX(Ts), &free_op1 TSRMLS_CC)",
    "CONST"  => "&opline->op1.u.constant",
    "UNUSED" => "NULL",
    "CV"     => "_get_zval_ptr_cv(&opline->op1, EX(Ts), \\1 TSRMLS_CC)",
);

$op2_get_zval_ptr = array(
    "ANY"    => "get_zval_ptr(&opline->op2, EX(Ts), &free_op2, \\1)",
    "TMP"    => "_get_zval_ptr_tmp(&opline->op2, EX(Ts), &free_op2 TSRMLS_CC)",
    "VAR"    => "_get_zval_ptr_var(&opline->op2, EX(Ts), &free_op2 TSRMLS_CC)",
    "CONST"  => "&opline->op2.u.constant",
    "UNUSED" => "NULL",
    "CV"     => "_get_zval_ptr_cv(&opline->op2, EX(Ts), \\1 TSRMLS_CC)",
…..<snip>

function gen_code(….)
……
    $code = preg_replace(
        array(
            .........
            "/GET_OP1_ZVAL_PTR\(([^)]*)\)/",
            "/GET_OP2_ZVAL_PTR\(([^)]*)\)/",
            ........
        ),
        array(
            .......
            .......
            $op1_get_zval_ptr[$op1],
            $op2_get_zval_ptr[$op2],
            .......
        ),
        $code);

Generated code not always the best !!

Input: zend_vm_def.h

ZEND_VM_HANDLER(71, ZEND_INIT_ARRAY, CONST|TMP|VAR|UNUSED|CV, CONST|TMP|VAR|UNUSED|CV)
{
    zend_op *opline = EX(opline);

    array_init(&EX_T(opline->result.u.var).tmp_var);
    if (OP1_TYPE == IS_UNUSED) {
        ZEND_VM_NEXT_OPCODE();
#if !defined(ZEND_VM_SPEC) || OP1_TYPE != IS_UNUSED
    } else {
        ZEND_VM_DISPATCH_TO_HANDLER(ZEND_ADD_ARRAY_ELEMENT);
#endif
    }
}

Output: zend_vm_execute.h

static int ZEND_INIT_ARRAY_SPEC_CONST_CONST_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    zend_op *opline = EX(opline);

    array_init(&EX_T(opline->result.u.var).tmp_var);
    if (IS_CONST == IS_UNUSED) {
        ZEND_VM_NEXT_OPCODE();
#if 0 || IS_CONST != IS_UNUSED
    } else {
        return ZEND_ADD_ARRAY_ELEMENT_SPEC_CONST_CONST_HANDLER(ZEND_OPCODE_HANDLER_ARGS_PASSTHRU);
#endif
    }
}

Mapping opcode to a handler

Generated zend_vm_execute.h contains an array to map opcodes to handlers

– without specializers array has just 151 entries

– with specializers 3775 (151 * 25) entries

zend_execute.c defines a function to enable compiler to determine correct handler for a given opcode

– zend_vm_set_opcode_handler(zend_op *op)

– Decodes type information for op1 and op2 in supplied “zend_op” and picks appropriate handler from array of handlers. Handler returned will be either:

• function pointer for handler when CALL
• id of handler routine for SWITCH
• address of handler's label for GOTO

Mapping performed at compile time

– pass_two() of compiler calls zend_vm_set_opcode_handler() to patch handler into all generated opcodes
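The 151 * 25 figure suggests a flat table with 25 consecutive slots per opcode, indexed by opcode and by the two operand types. The C sketch below shows one way such an index could be computed. It is an assumption-laden illustration: the `TOY_` names are hypothetical, the type flags are the values from the znode slide, and the order in which the five types are numbered 0 to 4 is a guess rather than something the deck states.

```c
#include <assert.h>

/* Operand type flags as listed on the znode slide. */
#define TOY_IS_CONST   0x01
#define TOY_IS_TMP_VAR 0x02
#define TOY_IS_VAR     0x04
#define TOY_IS_UNUSED  0x08
#define TOY_IS_CV      0x10

#define TOY_TYPES_PER_OPERAND 5

/* Map a type flag to a dense code 0..4; returns -1 for anything else.
 * The numbering order is an assumption made for this sketch. */
int toy_type_code(int op_type)
{
    switch (op_type) {
    case TOY_IS_CONST:   return 0;
    case TOY_IS_TMP_VAR: return 1;
    case TOY_IS_VAR:     return 2;
    case TOY_IS_UNUSED:  return 3;
    case TOY_IS_CV:      return 4;
    default:             return -1;
    }
}

/* Index into a flat table holding 25 handler slots per opcode.
 * Returns -1 when the opcode or either operand type is invalid. */
int toy_handler_index(int opcode, int op1_type, int op2_type)
{
    int c1 = toy_type_code(op1_type);
    int c2 = toy_type_code(op2_type);

    if (opcode < 0 || c1 < 0 || c2 < 0) {
        return -1;
    }
    return opcode * TOY_TYPES_PER_OPERAND * TOY_TYPES_PER_OPERAND
         + c1 * TOY_TYPES_PER_OPERAND
         + c2;
}
```

Under this layout the last opcode of 151 with two CV operands lands on index 3774, the final slot of a 3775-entry table. Combinations an opcode does not accept would still occupy a slot, which is why the slide on specialization calls 25 handlers per opcode a theoretical figure.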

zend_execute

By default the zend_execute function pointer addresses the generated execute() routine in zend_vm_execute.h

– This is called by zend_execute_scripts() with:
• a pointer to the zend_op_array for global scope, and
• if ZTS enabled the tsrm_ls pointer

Executor keeps state data for current user function in zend_execute_data structure which is allocated in execute() stack frame

– Address of currently executing function's zend_execute_data stored in EG

struct _zend_execute_data {

struct _zend_op *opline;

zend_function_state function_state;

zend_function *fbc; /* Function Being Called */

zend_op_array *op_array;

zval *object;

union _temp_variable *Ts;

zval ***CVs;

zend_bool original_in_execution;

HashTable *symbol_table;

struct _zend_execute_data *prev_execute_data;

zval *old_error_reporting;

};

execute()

On entry acquire storage for
– Temporary variables
• Number of temporary variables used by function stored in “T” field of zend_op_array
• Storage allocated on stack if alloca() available and T < 2000
• If alloca not available or 2000+ temporaries then allocated by emalloc from heap
– CV cache
• Number of compiled variables used stored in “last_var” field of zend_op_array
• Allocated on stack regardless of size if alloca available or emalloc otherwise

Initialize zend_execute_data
– Initialize EX(opline) to address first opcode to execute
– EX(symbol_table) = EG(active_symbol_table)
– EX(prev_execute_data) = EG(current_execute_data);
– EG(current_execute_data) = &execute_data;

<?php function foo() { … } …… foo();

[Figure: current_execute_data in executor_globals addresses the zend_execute_data for foo(), which links back to the zend_execute_data for global scope, which links to null]

Operand Types

Operands op1 and op2 can be either:

– VAR ($)
• Temporary variable into which interpreter caches zval * and zval ** for a defined symbol.

– TMP (~)
• Temporary variable where interpreter keeps an intermediate result.
• For example $a = $b + $c, the sum of $b and $c will be stored in a TMP before being assigned to $a

– CV (!)
• Compiled variable.
• Optimized version of a VAR. More to follow shortly

– CONSTANT
• Program literal, e.g. $a = "hello"
• Symbols are also constants
• ZVAL allocated by compiler
– ZVAL has is_ref=1 refcount=2 to force split on assignment

– UNUSED
• Operand not defined for opcode

Result operand can be VAR, TMP or CV

Temporary Variables: VAR and TMP

struct _zend_execute_data {

struct _zend_op *opline;

…….

union _temp_variable *Ts;

….. etc

};

typedef union _temp_variable {
    zval tmp_var;
    struct {
        zval **ptr_ptr;
        zval *ptr;
        zend_bool fcall_returned_reference;
    } var;
    struct {
        zval **ptr_ptr;
        zval *ptr;
        zend_bool fcall_returned_reference;
        zval *str;
        zend_uint offset;
    } str_offset;
    zend_class_entry *class_entry;
} temp_variable;

“Ts” field of zend_execute_data addresses an array of temp_variables
– Size of array based on information gathered by compiler.
• The “var” field in the operand's znode contains the offset into the “temp_variables” array
• T and EX_T macros provided to do this
• Temporaries are each 24 bytes
• Temporary variables are NOT re-used by compiler

typedef struct _znode {
    int op_type;
    union {
        zval constant;
        zend_uint var;
        zend_uint opline_num;
        zend_op_array *op_array;
        zend_op *jmp_addr;
        struct {
            zend_uint var; /* dummy */
            zend_uint type;
        } EA;
    } u;
} znode;

VAR variables

FETCH_W  $0, 'a'      /* Retrieve the $a variable for writing */
ASSIGN   $0, 123      /* Assign the numeric value 123 to retrieved variable 0 */
FETCH_W  $2, 'b'      /* Retrieve the $b variable for writing */
ASSIGN   $2, 456      /* Assign the numeric value 456 to retrieved variable 2 */
FETCH_R  $5, 'a'      /* Retrieve the $a variable for reading */
FETCH_R  $6, 'b'      /* Retrieve the $b variable for reading */
ADD      ~7, $5, $6   /* Add the retrieved variables (5 & 6) together and store the result in 7 */
FETCH_W  $4, 'c'      /* Retrieve the $c variable for writing */
ASSIGN   $4, ~7       /* Assign the value in temporary variable 7 into retrieved variable 4 */
FETCH_R  $9, 'c'      /* Retrieve the $c variable for reading */
ECHO     $9           /* Echo the retrieved variable 9 */

<?php $a = 123; $b = 456; $c = $a + $b; echo $c; ?>

Note: Each time $a is accessed we look it up in symbol table and store result in a different VAR

VAR variables

typedef union _temp_variable {
    zval tmp_var;
    struct {
        zval **ptr_ptr;
        zval *ptr;
        zend_bool fcall_returned_reference;
    } var;
    … etc
} temp_variable;

[Figure: two diagrams relating the “var” member of temp_variable (ptr_ptr, ptr) to the ZVAL and to the pDataPtr slot in the symbol_table, one labelled “After FETCH_R” and one labelled “After FETCH_RW or FETCH_W”]

Compiled Variables

Introduced in PHP 5.1

Avoids need for expensive symbol table lookup EVERY TIME a symbol is referenced

The “var” field in the operand's znode contains the index into the CV cache

When variable is initialized at runtime engine looks up symbol in symbol table and stores zval ** in a CV cache addressed from zend
– Hash value of variable calculated at compile time which allows “quick” HT functions to be used at runtime
– Subsequent uses of CV avoid symbol table lookup

All references to same symbol by a function/method refer to same CV
– Unlike temporary variables

Only supported for simple variables
– i.e. not object properties, auto globals or “this” pointer
– For more information see Sara Golemon’s blog on the subject: http://blog.libssh2.org/index.php?/archives/21-Compiled-Variables.html

Compiled Variables – compile time processing

An array of eligible variables is constructed at compile time by lookup_CV()
– Address of array stored in “vars” field of zend_op_array
– For any variable eligible to be a CV the compiler walks the current “vars” array to check for a match.
• If found, the index is returned.
• If not found then it is added in the next free slot
– Name and length of symbol
– Hash code
• Array allocated from heap. When the array fills it is extended by 16 entries by erealloc

struct _zend_op_array {
    ……
    zend_compiled_variable *vars;
    int last_var, size_var;
    … etc.
};

last_var contains index of last slot used, size_var last available index

typedef struct zend_compiled_variable {
    char *name;
    int name_len;
    ulong hash_value;
} zend_compiled_variable;
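The compile-time lookup described above amounts to "find or append" on a small array. The following C sketch shows that shape only; it is not lookup_CV() itself. The `toy_` names are hypothetical, `realloc` stands in for erealloc, the counters are kept as plain counts rather than the index convention noted on the slide, and the times-33 string hash is a stand-in for whatever hash function the engine's hash table uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct toy_compiled_variable {
    char *name;
    size_t name_len;
    unsigned long hash_value;  /* computed once, at compile time */
} toy_compiled_variable;

typedef struct toy_cv_table {
    toy_compiled_variable *vars;
    int last_var;  /* number of slots in use */
    int size_var;  /* number of slots allocated */
} toy_cv_table;

/* Stand-in string hash (times-33); an assumption for this sketch. */
unsigned long toy_hash(const char *s, size_t len)
{
    unsigned long h = 5381;
    for (size_t i = 0; i < len; i++) {
        h = h * 33 + (unsigned char) s[i];
    }
    return h;
}

/* Return the CV index for name, appending a new slot if it is not yet
 * known. Returns -1 if memory cannot be allocated. */
int toy_lookup_cv(toy_cv_table *t, const char *name)
{
    size_t len = strlen(name);
    unsigned long hash = toy_hash(name, len);
    char *copy;

    for (int i = 0; i < t->last_var; i++) {
        if (t->vars[i].hash_value == hash &&
            t->vars[i].name_len == len &&
            memcmp(t->vars[i].name, name, len) == 0) {
            return i;  /* already a compiled variable */
        }
    }
    if (t->last_var == t->size_var) {
        /* Array is full: extend it by 16 entries, as the slide describes. */
        int new_size = t->size_var + 16;
        toy_compiled_variable *grown =
            realloc(t->vars, (size_t) new_size * sizeof(toy_compiled_variable));
        if (grown == NULL) {
            return -1;
        }
        t->vars = grown;
        t->size_var = new_size;
    }
    copy = malloc(len + 1);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, name, len + 1);
    t->vars[t->last_var].name = copy;
    t->vars[t->last_var].name_len = len;
    t->vars[t->last_var].hash_value = hash;
    return t->last_var++;
}

/* Free every stored name and the array itself. */
void toy_cv_table_free(toy_cv_table *t)
{
    for (int i = 0; i < t->last_var; i++) {
        free(t->vars[i].name);
    }
    free(t->vars);
    t->vars = NULL;
    t->last_var = 0;
    t->size_var = 0;
}
```

A linear walk is a reasonable fit here because a single function rarely names more than a handful of variables, and the walk happens once per reference at compile time rather than on every execution.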

Compiled Variables – At runtime To access a CV, we extract CV number from the znode and use as index into CV cache.

– If CV cache slot non-zero then you have the zval **!! No symbol table lookup– If CV cache slot is zero then it’s the first reference to X so:

• Lookup X in symbol table using pre-computed hash, i.e. zend_hash_quick_find()• First lookup of a symbol in a function will also fail so symbol is also added to symbol

table at this point if lookup for W or RW using information in “vars” array. Uses zend_hash_quick_update() using pre-computed hash again.

• If lookup is for R then CV cache set to address EG(uninitialized_zval)• Save returned zval** for X saved in CV cache

– On userspace “unset” the CV cache is entry is set to NULL to force ht lookup on next reference to symbol
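The fast path and slow path above can be sketched as follows. This is a toy model under stated assumptions: the symbol table is a small linear array rather than a HashTable, symtab_find() and symtab_add() are hypothetical helpers standing in for zend_hash_quick_find() and zend_hash_quick_update(), and the lookups counter exists only to make the saved lookups visible.

```c
#include <stddef.h>
#include <string.h>

typedef struct { long value; } zval;   /* toy zval */

/* Toy bucket: ptr plays the role of the pDataPtr slot. */
typedef struct { const char *name; zval *ptr; } bucket;

#define MAX_SYMS 8
typedef struct {
    bucket slots[MAX_SYMS];
    int used;
    int lookups;   /* counts symbol table lookups, for illustration only */
} symtab;

/* Stand-ins for EG(uninitialized_zval) and a pointer to it. */
static zval uninitialized_zval = { 0 };
static zval *uninitialized_zval_ptr = &uninitialized_zval;

enum fetch_mode { FETCH_R, FETCH_W };

/* Hypothetical helper: find a symbol, returning its zval ** or NULL. */
static zval **symtab_find(symtab *st, const char *name)
{
    st->lookups++;
    for (int i = 0; i < st->used; i++) {
        if (strcmp(st->slots[i].name, name) == 0) {
            return &st->slots[i].ptr;
        }
    }
    return NULL;
}

/* Hypothetical helper: add a symbol, returning its zval ** (NULL if full). */
static zval **symtab_add(symtab *st, const char *name, zval *z)
{
    if (st->used == MAX_SYMS) {
        return NULL;
    }
    st->slots[st->used].name = name;
    st->slots[st->used].ptr = z;
    return &st->slots[st->used++].ptr;
}

/* Fetch CV number var: use the cache slot if set, else consult the
 * symbol table once and remember the resulting zval **. */
static zval **get_cv(zval ***cvs, const char **names, int var,
                     symtab *st, enum fetch_mode mode)
{
    if (cvs[var] != NULL) {
        return cvs[var];               /* no symbol table lookup */
    }
    zval **slot = symtab_find(st, names[var]);
    if (slot == NULL) {
        if (mode == FETCH_R) {
            slot = &uninitialized_zval_ptr;   /* read of an unset variable */
        } else {
            slot = symtab_add(st, names[var], &uninitialized_zval);
        }
    }
    cvs[var] = slot;
    return slot;
}

/* Userspace unset: clear the cache slot to force a lookup next time. */
static void unset_cv(zval ***cvs, int var)
{
    cvs[var] = NULL;
}
```

In this model a second get_cv() call for the same CV returns the cached zval ** without touching the symbol table, and only unset_cv() makes the next access pay for a lookup again.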

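[Diagram: the “zend_uint var” field of the znode indexes the CV cache (zval *** CVs, addressed from execute_data); each CV cache entry is a zval ** pointing at the pDataPtr slot of a HT bucket in the symbol table, which holds the zval * for the ZVAL]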

Compiled variables: with regular variables

<?php
$a = 123;
$b = 456;
$c = $a + $b;
echo $c;
?>

Without CV

FETCH_W  $0, 'a'      /* Retrieve the $a variable for writing */
ASSIGN   $0, 123      /* Assign the numeric value 123 to retrieved variable 0 */
FETCH_W  $2, 'b'      /* Retrieve the $b variable for writing */
ASSIGN   $2, 456      /* Assign the numeric value 456 to retrieved variable 2 */
FETCH_R  $5, 'a'      /* Retrieve the $a variable for reading */
FETCH_R  $6, 'b'      /* Retrieve the $b variable for reading */
ADD      ~7, $5, $6   /* Add the retrieved variables (5 & 6) together and store the result in 7 */
FETCH_W  $4, 'c'      /* Retrieve the $c variable for writing */
ASSIGN   $4, ~7       /* Assign the value in temporary variable 7 into retrieved variable 4 */
FETCH_R  $9, 'c'      /* Retrieve the $c variable for reading */
ECHO     $9           /* Echo the retrieved variable 9 */

With CV

ASSIGN   !0, 123      /* Assign the numeric value 123 to compiled variable 0 */
ASSIGN   !1, 456      /* Assign the numeric value 456 to compiled variable 1 */
ADD      ~2, !0, !1   /* Add compiled variable 0 to compiled variable 1 */
ASSIGN   !2, ~2       /* Assign the value of temporary variable 2 to compiled variable 2 */
ECHO     !2           /* Echo the value of compiled variable 2 */

Compiled variables: with object variables

<?php
$f->a = 123;
$f->b = 456;
$f->c = $f->a + $f->b;
echo $f->c;
?>

With CV

ASSIGN_OBJ   !0, 'a'       /* Assign the numeric value 123 to property 'a' of compiled variable 0 object */
OP_DATA      123           /* Additional data for ASSIGN_OBJ opcode */
ASSIGN_OBJ   !0, 'b'       /* Assign the numeric value 456 to property 'b' of compiled variable 0 object */
OP_DATA      456           /* Additional data for ASSIGN_OBJ opcode */
FETCH_OBJ_R  $3, !0, 'a'   /* Retrieve property 'a' from compiled variable 0 object */
FETCH_OBJ_R  $4, !0, 'b'   /* Retrieve property 'b' from compiled variable 0 object */
ADD          ~5, $3, $4    /* Add those values and store the result in temp var 5 */
ASSIGN_OBJ   !0, 'c'       /* Assign the ADD result to property 'c' of compiled variable 0 object */
OP_DATA      ~5            /* Additional data for ASSIGN_OBJ opcode */
FETCH_OBJ_R  $6, !0, 'c'   /* Retrieve property 'c' from compiled variable 0 object */
ECHO         $6            /* Echo the value */

Note: Properties are re-fetched every time a read or write is performed on them. This cannot be avoided because the magic methods __get(), __set() etc. can return a different variable on different fetches.

Symbol tables

One for global scope created during RINIT processing by call to init_executor()
– Add “GLOBALS” entry to new symbol table, which is a recursive reference
– Populated with requested super globals by call to php_hash_environment()
  • _POST, _GET, and _COOKIE
  • if INI register_argc_argv=on specified then argv and argc added
  • if INI auto_globals_jit=off specified then _ENV, _SERVER and _REQUEST added
    – If auto_globals_jit=on they are added by compiler if reference found; see zend_is_auto_global()
  • if INI register_long_arrays=on specified, long versions of _ENV, _POST etc. added
  • if INI register_globals=yes then adds all globals to symbol table

Symbol tables created for each called function/method at time of call
– created and freed in zend_do_fcall_common_helper()
– Hashtable for symbol table allocated from a cache if any available
  • Up to 32 symbol table HashTables cached
  • Hashtables are cleared before being added to cache so no need to initialize on allocation from cache
– Otherwise allocate and init a new HashTable

Symbols added to symbol table at runtime on first reference to a symbol
– On reference to a symbol we do a Hashtable lookup; if lookup fails we add it
  • Initially it is set to reference a special zval, uninitialized_zval, until it is assigned to
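The per-call symbol table cache can be pictured as a small free list. The sketch below is an assumption-laden toy: hash_table is a stand-in for HashTable, and symtable_acquire() and symtable_release() are hypothetical names, not functions in the engine. Only the limit of 32 and the clear-before-caching rule come from the slide.

```c
#include <stdlib.h>

#define SYMTABLE_CACHE_SIZE 32   /* limit quoted on the slide */

typedef struct { int num_elements; } hash_table;   /* toy HashTable */

typedef struct {
    hash_table *slots[SYMTABLE_CACHE_SIZE];
    int count;   /* number of tables currently cached */
} symtable_cache;

/* At call time: reuse a cached table if there is one, otherwise
 * allocate and initialise a new one. Returns NULL on allocation failure. */
static hash_table *symtable_acquire(symtable_cache *c)
{
    if (c->count > 0) {
        return c->slots[--c->count];   /* already cleared, no init needed */
    }
    hash_table *ht = malloc(sizeof *ht);
    if (ht != NULL) {
        ht->num_elements = 0;
    }
    return ht;
}

/* At return time: clear the table and cache it, or free it when the
 * cache is already full. */
static void symtable_release(symtable_cache *c, hash_table *ht)
{
    if (c->count < SYMTABLE_CACHE_SIZE) {
        ht->num_elements = 0;          /* clear before adding to cache */
        c->slots[c->count++] = ht;
    } else {
        free(ht);
    }
}
```

Clearing on release rather than on acquire keeps the call path short, which matches the slide's remark that nothing needs initialising when a table comes out of the cache.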

Which symbol table to use?

EG(active_symbol_table) contains reference to current function/method symbol table

However, as super/auto globals are only stored in the global scope symbol table, what happens if a function references one?
– this is where the EA attribute of znode comes into play !!
  • EA for op2 set by compiler to direct interpreter to FETCH from global symbol table EG(symbol_table) rather than current symbol table EG(active_symbol_table) when it finds a reference to an auto_global
    – compiler checks every symbol against auto_globals hash table
    – see fetch_simple_variable_ex()
– different flag set in EA for static variable, class static etc.
  • see zend_get_target_symbol_table()

<?php
function foo() {
    $a = $_ENV;
}
foo();
?>

line  #  op                     fetch   ext  operands
-------------------------------------------------------------------------------
9     0  FETCH_R                global       $0, '_ENV'
      1  ASSIGN                              !0, $0
10    2  RETURN                              null
      3  ZEND_HANDLE_EXCEPTION

Static Variables

Hashtable referenced by zend_op_array details all statics, if any, defined for a function/method
– When a static is accessed at runtime an entry is added to local symbol table (EG(active_symbol_table))
– Entry made to reference same zval referenced by static_variables Hashtable in a “Change on Write” set
  • just like a parameter passed by reference

Static Variables

<?php
function foo() {
    static $count = 0;
    $count++;
}
foo();
foo();
foo();
?>

vld output for foo()

line  #  op                     fetch   ext  operands
-------------------------------------------------------------------------------
5     0  FETCH_W                static       $0, 'count'
      1  ASSIGN_REF                          !0, $0
6     2  POST_INC                            ~1, !0
      3  FREE                                ~1
8     4  RETURN                              null
      5  ZEND_HANDLE_EXCEPTION

[Diagram: the “count” entries in static_variables and EG(active_symbol_table) referencing one zval. Successive states shown: value=0 is_ref=0 refcount=1; value=0 is_ref=1 refcount=2; value=1 is_ref=1 refcount=2]

Function calling

Opcode sequence depends on whether target is known at compile time
– Target function name is not known at compile time if
  • Call site before user function definition
  • Conditional functions used
– Referred to in code as “dynamic function call”

If compiler cannot verify function name at compile time then sequence is
– INIT_FCALL_BY_NAME
  • performs a runtime check on function name
– SEND_* for each argument
  • EA set to force arg checks for “pass by ref” at runtime
– DO_FCALL_BY_NAME

If compiler can verify function name then sequence is just
– SEND_* for each argument
– DO_FCALL

Example: foo(&$a, $b, 100) generates SEND_REF, SEND_VAR and SEND_VAL for its three arguments respectively

Function calling

<?php
$a = 10;
$b = 5;
foo($a, $b);
function foo(&$x, &$y) {
    echo "foo called with: $x $y";
}
foo($a, $b);
?>

opcodes for global scope

line  #   op                     fetch  ext  operands
-------------------------------------------------------------------------------
2     0   ECHO                               '%0D%0A+'
4     1   ASSIGN                             !0, 10
5     2   ASSIGN                             !1, 5
6     3   INIT_FCALL_BY_NAME                 'foo'
      4   SEND_VAR                           !0
      5   SEND_VAR                           !1
      6   DO_FCALL_BY_NAME              2    0
8     7   NOP
14    8   SEND_REF                           !0
      9   SEND_REF                           !1
      10  DO_FCALL                      2    'foo', 0
17    11  RETURN                             1
      12  ZEND_HANDLE_EXCEPTION

SEND_VAR opcode’s extended_value is set when FCALL_BY_NAME is used, to force SEND_VAR handler to check expected args at RUNTIME; if call by REF is expected it re-dispatches to the SEND_REF handler

Uses EX(fbc) set by INIT_FCALL_BY_NAME to access required arg info.

extended_value on FCALL opcode is number of arguments passed

Calling other user functions

When user space function calls another user function or method then arguments are pushed to a LIFO argument stack
– Address in EG(argument_stack)
– Initial argument stack of 64 slots is allocated by init_executor()
– When full a new stack twice the size is allocated

ZEND_SEND_* opcodes for each argument
– pushes a zval * to argument stack
– zvals are split when necessary
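A growable LIFO pointer stack of this shape can be sketched as below. The names (arg_stack, arg_stack_push and so on) are hypothetical and the sketch uses malloc/realloc where the engine uses its own allocator; only the initial size of 64 slots and the doubling policy are taken from the slide.

```c
#include <stdlib.h>

#define ARG_STACK_INITIAL 64   /* initial slot count quoted on the slide */

typedef struct {
    void **elements;
    int top;   /* number of slots in use */
    int max;   /* number of slots allocated */
} arg_stack;

/* Allocate the initial stack. Returns 1 on success, 0 on failure. */
static int arg_stack_init(arg_stack *s)
{
    s->elements = malloc(ARG_STACK_INITIAL * sizeof *s->elements);
    s->top = 0;
    s->max = s->elements ? ARG_STACK_INITIAL : 0;
    return s->elements != NULL;
}

/* Push one pointer (a zval * in the engine), doubling the stack when full.
 * Returns 1 on success, 0 on allocation failure. */
static int arg_stack_push(arg_stack *s, void *ptr)
{
    if (s->top == s->max) {
        int new_max = s->max ? s->max * 2 : ARG_STACK_INITIAL;
        void **grown = realloc(s->elements, (size_t)new_max * sizeof *grown);
        if (grown == NULL) {
            return 0;
        }
        s->elements = grown;
        s->max = new_max;
    }
    s->elements[s->top++] = ptr;
    return 1;
}

/* Pop the most recently pushed pointer, or NULL if the stack is empty. */
static void *arg_stack_pop(arg_stack *s)
{
    return s->top > 0 ? s->elements[--s->top] : NULL;
}

/* Release the stack storage. */
static void arg_stack_destroy(arg_stack *s)
{
    free(s->elements);
    s->elements = NULL;
    s->top = 0;
    s->max = 0;
}
```

Pushing a 65th pointer in this model grows the stack from 64 to 128 slots, and pops return pointers in reverse push order.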

Example 1: Passing arguments by value without splitting

<?php
function foo($x, $y) {
    echo "foo called with: $x $y";
}
$a = 10;
$b = 5;
$c = $a;
foo($a, $b);
?>

opcodes for global scope

line  #  op                     ext  operands
--------------------------------------------------
3     0  NOP
8     1  ASSIGN                      !0, 10
9     2  ASSIGN                      !1, 5
10    3  ASSIGN                      !2, !0
12    4  SEND_VAR                    !0
      5  SEND_VAR                    !1
      6  DO_FCALL               2    'foo', 0
15    7  RETURN                      1
      8  ZEND_HANDLE_EXCEPTION

[Diagram, after opcode #5: argument_stack slots labelled NULL, NULL, arg1, arg2, 2 (number of args); symbol_table entries a, b, c; zvals shown: value=10 refcount=3 is_ref=0 and value=5 refcount=2 is_ref=0]

No need to split zval for $a as it is part of a “copy on write” set

Example 2: Passing arguments by value when splitting required

<?php
function foo($x, $y) {
    echo "foo called with: $x $y";
}
$a = 10;
$b = 5;
$c = &$a;
foo($a, $b);
?>

opcodes for global scope

line  #  op                     ext  operands
--------------------------------------------------
3     0  NOP
8     1  ASSIGN                      !0, 10
9     2  ASSIGN                      !1, 5
10    3  ASSIGN                      !2, !0
12    4  SEND_VAR                    !0
      5  SEND_VAR                    !1
      6  DO_FCALL               2    'foo', 0
15    7  RETURN                      1
      8  ZEND_HANDLE_EXCEPTION

[Diagram, after opcode #5: argument_stack slots labelled NULL, NULL, arg1, arg2, 2; symbol_table entries a, b, c; zvals shown: value=10 refcount=2 is_ref=1, value=5 refcount=2 is_ref=0, and value=10 refcount=1 is_ref=0]

We have to split zval for $a as it is part of a “change on write” set

Example 3: Passing arguments by reference when splitting required

<?php
function foo($x, $y) {
    echo "foo called with: $x $y";
}
$a = 10;
$b = 5;
$c = $a;
foo(&$a, &$b);
?>

line  #  op                     ext  operands
--------------------------------------------------
3     0  NOP
8     1  ASSIGN                      !0, 10
9     2  ASSIGN                      !1, 5
10    3  ASSIGN                      !2, !0
11    4  SEND_REF                    !0
      5  SEND_REF                    !1
      6  DO_FCALL               2    'foo', 0
14    7  RETURN                      1
      8  ZEND_HANDLE_EXCEPTION

CV is actually in op1 not result operand

[Diagram, after opcode #5: argument_stack slots labelled NULL, NULL, arg1, arg2, 2; symbol_table entries a, b, c; zvals shown: value=10 refcount=1 is_ref=0, value=10 refcount=2 is_ref=1, and value=10 refcount=2 is_ref=1]

Here we have to split zval for $a as it is part of a “copy on write” set
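The splitting decisions in Examples 1 to 3 can be modelled with a toy zval. This is a sketch under assumptions, not the engine's handlers: the zval carries only an integer value, and send_by_value() and send_by_ref() are hypothetical names for what the SEND_VAR and SEND_REF handlers do to refcount and is_ref before pushing the zval * onto the argument stack.

```c
#include <stdlib.h>

typedef struct {
    long value;
    int refcount;
    int is_ref;
} zval;   /* toy zval: integer values only */

/* Duplicate a zval; the copy starts with refcount 0 and is_ref 0. */
static zval *zval_copy(const zval *src)
{
    zval *z = malloc(sizeof *z);
    if (z != NULL) {
        z->value = src->value;
        z->refcount = 0;
        z->is_ref = 0;
    }
    return z;
}

/* Pass by value: returns the zval to push on the argument stack.
 * A zval in a "copy on write" set is shared (Example 1); a zval in a
 * "change on write" set must be split so the callee gets its own copy
 * (Example 2). Returns NULL on allocation failure. */
static zval *send_by_value(zval *var)
{
    zval *arg = var;
    if (var->is_ref) {
        arg = zval_copy(var);
        if (arg == NULL) {
            return NULL;
        }
    }
    arg->refcount++;
    return arg;
}

/* Pass by reference: slot is the caller's symbol table entry.
 * A zval shared by a "copy on write" set is split first so that only the
 * passed variable becomes a reference (Example 3).
 * Returns NULL on allocation failure. */
static zval *send_by_ref(zval **slot)
{
    zval *var = *slot;
    if (!var->is_ref && var->refcount > 1) {
        zval *own = zval_copy(var);
        if (own == NULL) {
            return NULL;
        }
        var->refcount--;     /* the other variables keep the original */
        own->refcount = 1;   /* the passed variable owns the copy */
        *slot = own;
        var = own;
    }
    var->is_ref = 1;
    var->refcount++;         /* the argument stack now references it too */
    return var;
}
```

Fed the starting states from the three examples, the model reproduces the zval states shown in the diagrams: refcount 3 with no split in Example 1, a fresh refcount=1 is_ref=0 copy in Example 2, and in Example 3 an original left at refcount=1 is_ref=0 alongside a new refcount=2 is_ref=1 zval.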

Example 4: Passing arguments by reference (compile time)

<?php
$a = 10;
$b = 5;
$c = $a;
foo($a, $b);
function foo(&$x, &$y) {
    echo "foo called with: $x $y";
}
?>

line  #  op                     ext  operands
----------------------------------------------------
5     0  ASSIGN                      !0, 10
6     1  ASSIGN                      !1, 5
7     2  ASSIGN                      !2, !0
8     3  INIT_FCALL_BY_NAME          'foo'
      4  SEND_VAR                    !0
      5  SEND_VAR                    !1
      6  DO_FCALL_BY_NAME       2    0
10    7  NOP
16    8  RETURN                      1
      9  ZEND_HANDLE_EXCEPTION

As it is FCALL_BY_NAME, extra checks in SEND_VAR kick in to check arg info for compile time call by ref. If so, redispatch to SEND_REF to do the right thing!

Not shown, but op2 znode is used to save argument number, which is used to index into arg info structure

[Diagram, after opcode #5: argument_stack slots labelled NULL, NULL, arg1, arg2, 2; symbol_table entries a, b, c; zvals shown: value=10 refcount=1 is_ref=0, value=10 refcount=2 is_ref=1, and value=10 refcount=2 is_ref=1]

Receiving arguments passed by value

<?php
function foo($x, $y) {
    echo "foo called with: $x $y";
}
$a = 10;
$b = 5;
$c = $a;
foo($a, $b);
?>

line  #  op                     ext  operands
--------------------------------------------------
3     0  RECV                        1
      1  RECV                        2
4     2  INIT_STRING                 ~0
      3  ADD_STRING                  ~0, ~0, 'foo'
      4  ADD_STRING                  ~0, ~0, '+'
      5  ADD_STRING                  ~0, ~0, 'called'
         e.t.c

The RECV operand is the argument number; result op is CV for arg

[Diagram, after opcode #1: argument_stack slots labelled NULL, NULL, arg1, arg2, 2; caller's symbol_table entries a, b, c; callee symbol_table entries x, y; zvals shown: value=10 refcount=4 is_ref=0 and value=10 refcount=3 is_ref=0]

Receiving arguments passed by reference

<?php
function foo($x, $y) {
    echo "foo called with: $x $y";
}
$a = 10;
$b = 5;
$c = $a;
foo(&$a, &$b);
?>

line  #  op                     ext  operands
--------------------------------------------------
3     0  RECV                        1
      1  RECV                        2
4     2  INIT_STRING                 ~0
      3  ADD_STRING                  ~0, ~0, 'foo'
      4  ADD_STRING                  ~0, ~0, '+'
      5  ADD_STRING                  ~0, ~0, 'called'
         e.t.c

[Diagram, after opcode #1: argument_stack slots labelled NULL, NULL, arg1, arg2, 2; caller's symbol_table entries a, b, c; callee symbol_table entries x, y; zvals shown: value=10 refcount=1 is_ref=0, value=10 refcount=3 is_ref=1, and value=10 refcount=3 is_ref=1]

Function binding

<?php
function a() {
    echo "called function a";
}
function b() {
    echo "called function b";
}
a();
b();
?>

line  #  op                     fetch  ext  operands
-------------------------------------------------------------------------------
3     0  NOP
6     1  NOP
10    2  DO_FCALL                      0    'a', 0
11    3  DO_FCALL                      0    'b', 0
14    4  RETURN                             1
      5  ZEND_HANDLE_EXCEPTION

What are these NOPs?

Function binding

They are artefacts of “function binding”

When compiler encounters a function declaration in a script it generates a “ZEND_DECLARE_FUNCTION” opcode in current opcode array
– op1 is long function name
  • \0<function name><file name><address>, e.g. “ fooC:\Testcases\tes.php0012CD38”
  • where address is character position of last char of function prototype in script's buffer
– op2 is short name, i.e. just “foo”
– a function table entry is added by compiler for long name

After parsing function body and generating its zend_op_array compiler then performs “early binding” for the unconditional functions
– Effectively executes opcode at compile time. See zend_do_early_binding().
– Opcode checks for duplicate function names
  • Looks up function table entry for long name. This should always be successful !!
  • Attempts to add a function entry with short name using zend_function_entry just retrieved. If this fails we have a duplicate function name and an error message is produced detailing filename and line number of previous declaration.
– If no duplicate then the ZEND_DECLARE_FUNCTION opcode is converted to a NOP
  • opcode set to ZEND_NOP
  • op1 and op2 set to UNUSED and zvals for name strings freed
– Deletes function table entry for “long name”

Conditional Functions

Same function name can be defined multiple times with different content and/or signature
– A zend_op_array generated for each different version of a conditional function

Which function gets executed not known until runtime
– So function binding delayed until runtime
– ZEND_DECLARE_FUNCTION persists in compiler output

<?php
$a = 10;
if (a > 10) {
    function foo() {
        echo "foo has no parms";
    }
} else {
    function foo($a) {
        echo "foo has 1 parm";
    }
}
if (a > 10) {
    foo();
} else {
    foo($a);
}

Conditional Functions

line  #   op                     fetch  ext  operands
-------------------------------------------------------------------------------
2     0   ECHO                               '%0D%0A+'
5     1   ASSIGN                             !0, 10
7     2   FETCH_CONSTANT                     ~1, 'a'
      3   IS_SMALLER                         ~2, 10, ~1
      4   JMPZ                               ~2, ->7
8     5   ZEND_DECLARE_FUNCTION              '%00fooC%3A%5CEclipse-PHP%5Cworkspace%5CTestcases%5Ctest.php0140E973', 'foo'
12    6   JMP                                ->8
13    7   ZEND_DECLARE_FUNCTION              '%00fooC%3A%5CEclipse-PHP%5Cworkspace%5CTestcases%5Ctest.php0140E9B9', 'foo'
20    8   FETCH_CONSTANT                     ~3, 'a'
      9   IS_SMALLER                         ~4, 10, ~3
      10  JMPZ                               ~4, ->14
. . . e.t.c . . .

Conditional Functions

zend_op_array for foo()

line  #  op                     fetch  ext  operands
-------------------------------------------------------------------------------
10    0  ECHO                               'foo+has+no+parms'
11    1  RETURN                             null
      2  ZEND_HANDLE_EXCEPTION

zend_op_array for foo($a)

line  #  op                     fetch  ext  operands
-------------------------------------------------------------------------------
13    0  RECV                               1
15    1  ECHO                               'foo+has+1+parm'
16    2  RETURN                             null
      3  ZEND_HANDLE_EXCEPTION

Exception Handling

<?php
function foo($x) {
    if ($x > 1) {
        throw new Exception;
    }
}
try {
    foo(1);
} catch (Exception $e) {
    echo "exception 1";
}
try {
    foo(2);
} catch (Exception $e) {
    echo "exception 2";
}
?>

line  #   op                     fetch  ext  operands
-------------------------------------------------------------------------------
3     0   NOP
11    1   SEND_VAL                           1
      2   DO_FCALL                      1    'foo', 0
13    3   ZEND_FETCH_CLASS                   :1, 'Exception'
      4   ZEND_CATCH                         null, 'e'
14    5   ECHO                               'exception+1'
18    6   SEND_VAL                           2
      7   DO_FCALL                      1    'foo', 0
20    8   ZEND_FETCH_CLASS                   :3, 'Exception'
      9   ZEND_CATCH                         null, 'e'
21    10  ECHO                               'exception+2'
25    11  RETURN                             1
      12  ZEND_HANDLE_EXCEPTION

line  #   op                     fetch  ext  operands
-------------------------------------------------------------------------------
3     0   RECV                               1
5     1   IS_SMALLER                         ~0, 1, !0
      2   JMPZ                               ~0, ->8
6     3   ZEND_FETCH_CLASS                   :1, 'Exception'
      4   NEW                                $2, :1
      5   DO_FCALL_BY_NAME              0    0
      6   ZEND_THROW                         $2
7     7   JMP                                ->8
8     8   RETURN                             null
      9   ZEND_HANDLE_EXCEPTION

Not shown by VLD, but the extended value of the CATCH opcode contains the opcode number to branch to if exception not thrown.

If no exception is thrown during the TRY block then we actually execute the ZEND_FETCH_CLASS and ZEND_CATCH opcodes. ZEND_CATCH, on finding no exception thrown, dispatches the first opcode after end of catch block, i.e. 6 in this case

Exception handling

When ZEND_THROW opcode executes it sets EG(exception) and EG(opline_before_exception) before dispatching the ZEND_HANDLE_EXCEPTION opcode at end of current op array
– see zend_throw_exception_internal()

ZEND_HANDLE_EXCEPTION opcode handler checks all try/catch blocks in current scope to see if the range they cover includes the last opcode executed in current scope before exception.
– If any, dispatches the first opcode of the catch block, which will be ZEND_FETCH_CLASS
  • uses an array built by compiler which defines scope of each try/catch block
  • array records scope in terms of opcode number of first try block opcode and opcode number of first catch block opcode
– If none then return to caller
  • on seeing EG(exception) still set, return processing sets EG(opline_before_exception) to the last opcode executed in the caller, i.e. the FCALL opcode, and then sets next opcode in caller to the ZEND_HANDLE_EXCEPTION opcode at end of caller's opcode array
– Repeat until a catch block is found or we return from global scope, at which point if EG(exception) is set an “uncaught exception” error msg is produced

Exception handling

struct _zend_op_array {
    …….
    zend_try_catch_element *try_catch_array;
    int last_try_catch;
    ….. etc
};

array is reallocated as each try/catch block is found by compiler

typedef struct _zend_try_catch_element {
    zend_uint try_op;
    zend_uint catch_op; /* ketchup! */
} zend_try_catch_element;

contains opcode number of first opcode of try and catch blocks

struct _zend_execute_data {
    struct _zend_op *opline;
    …….
    zend_op_array *op_array;
    ….. etc
};
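The range check that ZEND_HANDLE_EXCEPTION performs over try_catch_array can be sketched as a small search function. This is an illustration only: find_catch_op() is a hypothetical name, the element type is a toy copy of zend_try_catch_element, and the early exit assumes the compiler records blocks in increasing try_op order.

```c
typedef unsigned int zend_uint;   /* toy stand-in for the engine typedef */

typedef struct {
    zend_uint try_op;     /* opcode number of first opcode of try block */
    zend_uint catch_op;   /* opcode number of first opcode of catch block */
} try_catch_element;

/* Given the number of the last opcode executed before the exception,
 * return the catch_op of the innermost try block covering it, or -1 if
 * no block in this op array covers it (so we must return to the caller). */
static long find_catch_op(const try_catch_element *blocks,
                          int last_try_catch, zend_uint op_num)
{
    long target = -1;
    for (int i = 0; i < last_try_catch; i++) {
        if (blocks[i].try_op > op_num) {
            break;   /* assumption: later blocks start even further on */
        }
        if (op_num >= blocks[i].try_op && op_num < blocks[i].catch_op) {
            target = (long)blocks[i].catch_op;   /* later match is more nested */
        }
    }
    return target;
}
```

With entries {try_op = 1, catch_op = 3} and {try_op = 6, catch_op = 8}, an exception raised while executing opcode 7 resolves to opcode 8, while one raised at opcode 5 finds no covering block and propagates to the caller.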

Exception Handling

zend_op_array for global scope: last_try_catch = 2, try_catch_array entries:
  try_op = 1, catch_op = 3
  try_op = 6, catch_op = 8

line  #   op                     fetch  ext  operands
-------------------------------------------------------------------------------
3     0   NOP
11    1   SEND_VAL                           1
      2   DO_FCALL                      1    'foo', 0
13    3   ZEND_FETCH_CLASS                   :1, 'Exception'
      4   ZEND_CATCH                         null, 'e'
14    5   ECHO                               'exception+1'
18    6   SEND_VAL                           2
      7   DO_FCALL                      1    'foo', 0
20    8   ZEND_FETCH_CLASS                   :3, 'Exception'
      9   ZEND_CATCH                         null, 'e'
21    10  ECHO                               'exception+2'
25    11  RETURN                             1
      12  ZEND_HANDLE_EXCEPTION

zend_op_array for foo(): last_try_catch = 0, try_catch_array = null

line  #   op                     fetch  ext  operands
-------------------------------------------------------------------------------
3     0   RECV                               1
5     1   IS_SMALLER                         ~0, 1, !0
      2   JMPZ                               ~0, ->8
6     3   ZEND_FETCH_CLASS                   :1, 'Exception'
      4   NEW                                $2, :1
      5   DO_FCALL_BY_NAME              0    0
      6   ZEND_THROW                         $2
7     7   JMP                                ->8
8     8   RETURN                             null
      9   ZEND_HANDLE_EXCEPTION