PHP’s Guts

Talk given at True North PHP 2014

Why you should know how PHP works

Architecture

Yo, I heard you like modules in your modules, so we gave you three types!

Engine

Lexer, Parser, Compiler, Executor

Core

Streams

Request Management

Variables

Network

SAPIs

Server API

CLI, CGI, mod_php, phpdbg, embed

input args

output, flushing, file descriptors, interruptions, system user info

input filtering and optionally headers, post data, http specific stuff
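
A quick way to see which SAPI is actually running your code is to ask PHP at runtime. This is a minimal sketch; `describe_sapi()` is a made-up helper name, and the split into "command line" versus everything else is a simplification for illustration.

```php
<?php
// Report which SAPI (Server API) is running this script.
// describe_sapi() is a hypothetical helper, not part of PHP.
function describe_sapi(): string
{
    $sapi = php_sapi_name(); // same value as the PHP_SAPI constant

    // CLI-style SAPIs take their input from command-line args;
    // web SAPIs deal with headers and POST data instead.
    $isCommandLine = in_array($sapi, ['cli', 'phpdbg'], true);

    return sprintf('%s (%s)', $sapi, $isCommandLine ? 'command line' : 'web or embedded');
}

echo describe_sapi(), PHP_EOL;
```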

Extensions

Talk to a C library

do stuff faster than PHP

make the engine funny

Lifecycle

MINIT

RINIT

GINIT

GINIT

MINIT

RINIT

SCRIPT

RSHUTDOWN

MSHUTDOWN

GSHUTDOWN

Who Cares?

Pick the right SAPI

Fewer extensions = better

Static extensions = better

Lifecycle is important for sharing stuff

Newer PHP = better faster stronger

turn on the “go_fast” ini setting

Thread, fork, async, very wow

Threading

Thread safe != reentrant

Thread safe != parallel

Thread safe != async

Thread safe != concurrent

Thread safe == two threads running at the same time won’t stomp on each other’s data

yes really, that’s all it means

Reentrant

Let’s quit this, and run it again

and it will be like we never ran it

Async

I’m gonna work on this stuff

But I’m not going to block you if you have important stuff to do

Parallel … Concurrent

Concurrent – two things at the same time that need communication

Parallel – two things at the same time

TSRM

Thread safe resource manager

global data in extensions

making some C re-entrant

thread safety

Why do I care?

react-php (parallel)

pecl event (async)

pthreads (concurrent)

pcntl (fork and pray)

proc_open/popen (subprocessing)

queues and jobs and workers

native TLS RFC
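
Of the options above, subprocessing with proc_open is the one you can try with nothing extra installed. This is a rough sketch, assuming the CLI SAPI (where `PHP_BINARY` points at the running interpreter) and a child that writes very little to stderr; `run_in_subprocess()` is a hypothetical helper name.

```php
<?php
// Run a snippet of PHP in a child process and capture its stdout.
// run_in_subprocess() is a hypothetical helper; it assumes the CLI SAPI.
function run_in_subprocess(string $code): string
{
    $command = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($code);
    $pipes = [];
    $spec = [
        0 => ['pipe', 'r'], // child's stdin
        1 => ['pipe', 'w'], // child's stdout
        2 => ['pipe', 'w'], // child's stderr
    ];

    $process = proc_open($command, $spec, $pipes);
    if (!is_resource($process)) {
        throw new RuntimeException('could not start child process');
    }

    fclose($pipes[0]);                         // no input for the child
    $output = stream_get_contents($pipes[1]);  // blocks until the child closes stdout
    fclose($pipes[1]);
    fclose($pipes[2]);
    proc_close($process);                      // wait for the child to exit

    return $output;
}

echo run_in_subprocess('echo 6 * 7;'), PHP_EOL;
```

The parent blocks while it reads, so this is subprocessing, not async; a worker queue or an event loop is what you reach for when the parent has other things to do.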

Welcome to the Engine

Lexers and Parsers and Opcodes OH MY!

Lexer

checks PHP’s spelling

turns into tokens

see token_get_all for what PHP sees
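
To look at what the lexer sees, feed a tiny script to token_get_all. A minimal sketch, assuming the tokenizer extension is loaded (it usually is); `token_names()` is a hypothetical helper.

```php
<?php
// List the tokens the lexer produces for a tiny script.
// token_names() is a hypothetical helper built on token_get_all().
function token_names(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        // Multi-character tokens are arrays of [id, text, line];
        // single-character tokens such as ";" or "=" are plain strings.
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

print_r(token_names('<?php $answer = 42;'));
```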

Parser

checks PHP’s grammar

E_PARSE means “bad phpish”

creates opcodes (or AST)

Compiler

Only with AST

Turns AST into Opcodes

Allows for fancier grammar

Opcodes

dump with http://derickrethans.nl/projects.html

machine readable language the runtime understands

Opcache (and AST)

cache opcodes – skip lexing and parsing

https://support.cloud.engineyard.com/entries/26902267-PHP-Performance-I-Everything-You-Need-to-Know-About-OpCode-Caches

https://wiki.php.net/rfc/abstract_syntax_tree

Engine (Virtual Machine)

reads opcode

does something

???

PROFIT

Why do I care?

Use an opcode cache

If you don’t, you’re crazy, stupid, or lazy

Upgrade to get cooler stuff
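
Not sure whether you are running an opcode cache? A small sketch that checks for OPcache, assuming it is the cache in question; `opcache_summary()` is a hypothetical helper, and other opcode caches would need their own checks.

```php
<?php
// Check whether OPcache is available and switched on.
// opcache_summary() is a hypothetical helper, not a PHP function.
function opcache_summary(): string
{
    if (!extension_loaded('Zend OPcache')) {
        return 'OPcache is not loaded';
    }

    // opcache.enable covers web SAPIs; the CLI has its own switch.
    $setting = PHP_SAPI === 'cli' ? 'opcache.enable_cli' : 'opcache.enable';
    $enabled = filter_var(ini_get($setting), FILTER_VALIDATE_BOOLEAN);

    return sprintf('OPcache is loaded, %s is %s', $setting, $enabled ? 'on' : 'off');
}

echo opcache_summary(), PHP_EOL;
```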

Variables

PHP is a C types wrapper

Zvals

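    /* PHP 5: the value half of a zval */
    typedef union _zvalue_value {
        long lval;                 /* integers and booleans */
        double dval;               /* floats */
        struct {
            char *val;
            int len;
        } str;                     /* strings: byte buffer plus length */
        HashTable *ht;             /* arrays */
        zend_object_value obj;     /* objects */
    } zvalue_value;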

    /* PHPNG (PHP 7): the value half of a zval */
    typedef union _zend_value {
        zend_long lval;            /* integers */
        double dval;               /* floats */
        zend_refcounted *counted;
        zend_string *str;
        zend_array *arr;
        zend_object *obj;
        zend_resource *res;
        zend_reference *ref;
        zend_ast_ref *ast;
        zval *zv;
        void *ptr;
        zend_class_entry *ce;
        zend_function *func;
    } zend_value;

Numbers

Booleans are unsigned char

Integers are really signed long integers

Longs are platform dependent

Floats and doubles are doubles not floats
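
The C types underneath show through as soon as you push on them. A short sketch of the two classic surprises: integers that run out of room, and floats that are really binary doubles.

```php
<?php
// Integers are C integers underneath, so they have a hard upper bound.
$max = PHP_INT_MAX;

// Going past the bound does not wrap around: PHP converts to a double.
$overflowed = $max + 1;
var_dump(is_int($max));          // the largest integer is still an int
var_dump(is_float($overflowed)); // one more and it becomes a float

// "float" and "double" are the same type in PHP: a C double,
// with the usual binary rounding behaviour.
var_dump(0.1 + 0.2 === 0.3);     // false
var_dump(gettype(1.5));          // PHP still calls it "double"
```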

64 Bit Madness

LLP64 (Windows), sizes in bits

short = 16

int = 32

long = 32

long long = 64

pointer = 64

LP64 (Unices), sizes in bits

short = 16

int = 32

long = 64

long long = 64

pointer = 64
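
You can ask a PHP build how wide its integers are instead of guessing from the OS. A minimal sketch; `integer_bits()` is a hypothetical helper. On PHP 5 an LLP64 build reports 32 bits even on 64-bit Windows, because the integer was a plain C long.

```php
<?php
// Report the integer width of the running PHP build.
// integer_bits() is a hypothetical helper.
function integer_bits(): int
{
    return PHP_INT_SIZE * 8; // PHP_INT_SIZE is in bytes
}

printf("PHP integers are %d bits wide on %s\n", integer_bits(), PHP_OS);

// The largest integer follows from the width: 2^(bits - 1) - 1.
// Compare as strings to avoid any float rounding.
$expected = integer_bits() === 64 ? '9223372036854775807' : '2147483647';
var_dump((string) PHP_INT_MAX === $expected);
```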

Strings

char *

Translated to what we see by an algorithm

ASCII, UTF-8, binary – EVERYTHING has a codepage

wchar? screw you
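
Since a string is just bytes plus a length, the string functions count bytes. A small sketch using explicit UTF-8 bytes so the result does not depend on how this file is saved.

```php
<?php
// A PHP string is a byte buffer plus a length; it has no idea what encoding it holds.
$word = "caf\xC3\xA9"; // "café" written as explicit UTF-8 bytes

var_dump(strlen($word));             // 5: counts bytes, not characters
var_dump(bin2hex(substr($word, 3))); // "c3a9": the two bytes that encode "é"

// Because the length is stored separately, NUL bytes are fine too.
$binary = "a\0b";
var_dump(strlen($binary));           // 3
```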

Arrays

they’re not

hashtables

and doubly linked lists
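
What "they’re not arrays" looks like from userland: keys can be anything, and order is insertion order. A short sketch.

```php
<?php
// PHP "arrays" are ordered hash tables, not contiguous C arrays.
$items = [];
$items[10] = 'ten';    // keys do not have to start at 0
$items['name'] = 'x';  // or even be integers
$items[2] = 'two';     // and insertion order is what gets remembered

// Iteration follows insertion order, not key order.
$keys = array_keys($items);
print_r($keys);

// Numeric-looking string keys are normalised to integers.
$items['10'] = 'TEN';
var_dump(count($items)); // still three entries: "10" overwrote 10
```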

Resources

stores random opaque C data

in a giant list of doom

sigh
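
From PHP’s side a resource is only a type name and a slot number. A minimal sketch using an in-memory stream so nothing touches the disk.

```php
<?php
// A resource is an opaque handle: PHP only knows its type and its slot number.
$handle = fopen('php://memory', 'w+');

var_dump(is_resource($handle));
$type = get_resource_type($handle); // the kind of C data behind the handle
var_dump($type);
$id = (int) $handle;                // the handle's number in the resource list
var_dump($id > 0);

fclose($handle);
// After closing, the variable still exists but the handle is dead.
$stillOpen = is_resource($handle);
var_dump($stillOpen);
```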

Objects

handlers

property tables

magic storage

Why do I care?

Know the limitations of your data types

Remember that arrays aren’t arrays

Beware of many many resources

Beware of many many objects

64 bit can be broken in strange ways

C Moar

Implementation WTFeries and other fun

Stack? Heap?

Stack = scratch space for thread of execution

can overflow!

slightly faster

size determined at thread start

Heap = space for dynamic allocation

managed by program

can fragment

leaky!

Zend Memory Manager

Internal Heap Allocator

frees yo memory (leak management)

preallocates blocks in set sizes that PHP uses

caches allocations to avoid fragmentation

allows monitoring of memory usage
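
The monitoring part is available from userland. A rough sketch that allocates about 1 MiB and watches the counters move; exact numbers vary by PHP version and build.

```php
<?php
// Watch the Zend Memory Manager's accounting from userland.
$size = 1024 * 1024;

$before = memory_get_usage();      // bytes handed out to the script
$reserved = memory_get_usage(true); // bytes reserved from the OS, in large blocks

$data = str_repeat('x', $size);    // allocate about 1 MiB
$during = memory_get_usage();

unset($data);                      // refcount hits 0, memory goes back to the allocator
$after = memory_get_usage();

printf("before: %d, during: %d, after: %d\n", $before, $during, $after);
printf("reserved from the OS: %d, peak: %d\n", $reserved, memory_get_peak_usage());
```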

COW (not moo)

Copy On Write

1 zval, many variables

each variable increases refcount

destroy when refcount hits 0

Oh no, a change! copy
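
You can watch the copy happen by measuring memory around the assignment and around the first write. A sketch under the assumption that nothing else allocates in between; `cow_deltas()` is a hypothetical helper and the byte counts differ between PHP versions.

```php
<?php
// Measure when a copy actually happens under copy-on-write.
// cow_deltas() is a hypothetical helper; exact byte counts vary by PHP version.
function cow_deltas(int $size): array
{
    $original = range(1, $size);

    $start = memory_get_usage();
    $alias = $original;            // second variable, same underlying array
    $afterAssign = memory_get_usage();

    $alias[] = 0;                  // first write: PHP separates the two arrays
    $afterWrite = memory_get_usage();

    return [
        'assign'   => $afterAssign - $start,       // cost of the assignment
        'write'    => $afterWrite - $afterAssign,  // cost of the first write
        'intact'   => count($original) === $size,  // the original never saw the write
        'copy_len' => count($alias),
    ];
}

print_r(cow_deltas(100000));
```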

Refcounts, GC, and PHPNG

Sometimes you have a refcount but no var to reference it

This is a circular reference, and it sucks (ask Doctrine)

GC checks for this periodically and cleans up

PHPNG
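
A circular reference is easy to build on purpose, which makes the collector easy to watch. A minimal sketch, assuming the cycle collector is enabled (the default); `make_cycle()` is a hypothetical helper.

```php
<?php
// Build a reference cycle, drop every variable that points at it,
// and let the cycle collector reclaim it.
// make_cycle() is a hypothetical helper.
function make_cycle(): void
{
    $parent = new stdClass;
    $child = new stdClass;
    $parent->child = $child;   // parent holds child
    $child->parent = $parent;  // child holds parent: refcounts can never reach 0
}   // both variables go out of scope here, but the objects keep each other alive

gc_enable();
gc_collect_cycles();           // start from a clean slate
make_cycle();
$collected = gc_collect_cycles();

printf("cycle collector freed %d values\n", $collected);
```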

References are not Pointers

PHP is smarter than you are

access the same variable content by different names

using symbol table aliases

variable name != variable content
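
The alias idea in a few lines: two names, one piece of content, and removing a name does not remove the content.

```php
<?php
// Two names, one value: a reference is a symbol-table alias, not a pointer.
$a = 1;
$b = &$a;      // $b is now another name for the same content

$b = 2;        // writing through either name changes the shared value
var_dump($a);  // int(2)

unset($b);     // removes the name $b only; the content lives on under $a
$gone = !isset($b);
var_dump($gone);
var_dump($a);  // still int(2)
```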

Side Track – Objects are not References

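    <?php
    $a = new stdClass;
    $b = $a;            // copies the object handle, not the object

    $a->foo = 'bar';    // both handles point at the same object
    var_dump($b);       // shows foo => 'bar'

    $a = 'baz';         // rebinds $a only; a true reference would change $b too
    var_dump($b);       // still the object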

Places to Learn More

http://www.phpinternalsbook.com

http://php.net

http://lxr.php.net

http://wiki.php.net

http://nikic.github.io/

http://blog.krakjoe.ninja/

About Me

http://emsmith.net

@auroraeosrose

That’s Aurora Eos Rose

auroraeosrose@gmail.com

On freenode in #phpmentoring, #phpwomen, #phpinternals