a smalltalk on Object and Protocol in CPython
shiyao.ma <[email protected]>
May. 4th


Why this

‣ Python is great for carrying out research experiments.
  (This should lay the foundation for why I discuss Python.)

‣ Life is short. You need Python.
  (This should lay the foundation for why people like Python.)

Life is neither short nor long, just a (signed) int, 31 bits at most, I say.

Takeaway

‣ Understand the inter-relation among {.py, .pyc, .c} files.

‣ Understand that everything in Python is an object.

‣ Understand how functions on TypeObject affect InstanceObject.

CPython Overview

‣ First implemented in Dec. 1989 by GvR, the BDFL

‣ Serving as the reference implementation. Other implementations:
    ‣ IronPython (clr)
    ‣ Jython (jvm)
    ‣ Brython (v8, spider) [no kidding]

‣ Written in ANSI C.
    ‣ flexible language binding
    ‣ embedding (libpython), e.g., openwrt, etc.

CPython Overview

‣ Code maintained by Mercurial.

‣ source: https://hg.python.org/cpython/

‣ Build toolchain is autoconf (on *nix)

    ./configure --with-pydebug && make -j2

CPython Overview

‣ Structure

    cpython/
        configure.ac
        Doc/
        Grammar/
        Include/
        Lib/
        Mac/
        Modules/
        Objects/
        Parser/
        Programs/
        Python/

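CPython Overview

‣ execution lifetime

[Figure: PY -> Parser -> PY[CO] -> VM. For the source line x = 1 + 2, the VM runs the bytecode LOAD 1, LOAD 2, ADD, LOAD X, STORE X against a STACK that holds 1, then 1 and 2, then 3, which is finally stored into x.]

takeaway: py/pyc/c inter-relation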
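The pipeline in the figure can be poked at from Python itself. The sketch below is only an illustration: it compiles one line to a code object (the thing a .pyc file stores) and lists the opcodes with the standard dis module. It uses a + b rather than 1 + 2 because the compiler would otherwise fold the constants, and the exact opcode names vary between CPython versions, so none are hard-coded. The names source, code, opnames, and namespace are made up for this example.

```python
import dis

# Variables keep the compiler from folding the addition into one constant.
source = "x = a + b"

# Source text -> code object; this is what gets cached in a .pyc file.
code = compile(source, "<demo>", "exec")

# Collect the opcode names the VM would execute for this line.
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)

# Executing the code object drives the VM's evaluation stack:
# load a, load b, add them, store the result into x.
namespace = {"a": 1, "b": 2}
exec(code, namespace)
print(namespace["x"])
```

The loads push operands onto the stack, the binary opcode pops two values and pushes their sum, and the store pops the result into the name x.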

object and protocol:

the objects

Object

object: memory of a C structure with a common header

PyObject:    ob_refcnt, ob_type
PyVarObject: ob_refcnt, ob_type, ob_size

Concrete objects: PyListObject PyDictObject PyTupleObject PySyntaxErrorObject PyImportErrorObject …

takeaway: everything is an object
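A quick way to see the common header from the Python side, as a rough sketch: every value answers to isinstance(..., object) and has a type (ob_type), and sys.getrefcount exposes the reference count (ob_refcnt). The helper double and the variable names are invented for this example, and the exact refcount values are implementation details, so only the difference is inspected.

```python
import sys

def double(n):
    return n * 2

# Numbers, containers, functions, classes, modules, None, exceptions:
# all of them are objects with a type.
samples = [42, "text", [1, 2], {"k": 1}, (1,), double, int, sys, None, SyntaxError("x")]
all_objects = all(isinstance(s, object) for s in samples)
type_names = [type(s).__name__ for s in samples]
print(all_objects, type_names)

# ob_refcnt goes up when another name refers to the same object.
marker = object()
before = sys.getrefcount(marker)
alias = marker
after = sys.getrefcount(marker)
print(after - before)
```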

Object Structure

Will PyLongObject overflow?

The answer: chunk-chunk

PyLongObject: ob_refcnt, ob_type, ob_size, digit ob_digit[1], followed by digit[2], digit[3], …, digit[n]

    typedef PY_UINT32_T digit;

    result = PyObject_MALLOC(offsetof(PyLongObject, ob_digit) + size*sizeof(digit));

    n = 2 ** 64  # more bits than a word
    assert type(n) is int and n > 0
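The chunked layout is visible from Python, at least indirectly. The following sketch assumes a standard CPython build; it reads sys.int_info for the digit width and uses sys.getsizeof to show that a bigger integer simply occupies more digits. The variable names are chosen for this example, and the concrete byte sizes are deliberately not asserted since they differ between builds.

```python
import sys

# How many bits each "digit" chunk carries (15 or 30 depending on the build).
bits = sys.int_info.bits_per_digit
print(bits)

small = 1
big = 2 ** 64      # no longer fits in one machine word
huge = 2 ** 1000   # still an ordinary int, just with more digits

# The object grows with the number of digits instead of overflowing.
sizes = [sys.getsizeof(v) for v in (small, big, huge)]
print(sizes)
print(type(huge), huge.bit_length())
```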

Object Structure

Why won't my multi-dimensional array work?

The answer: indirection, mutability

PyListObject: ob_refcnt, ob_type, ob_size, ob_item, allocated

[Figure: ob_item points to an array of PyObject* slots. Every slot of the outer list points to the same inner PyListObject, whose own slots point to None and, after PyList_SetItem, to 42.]

    m, n = 4, 2
    arr = [[None] * n] * m
    arr[1][1] = 42
    # [[None, 42], [None, 42], [None, 42], [None, 42]]
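To make the indirection concrete, here is a small sketch that checks how many distinct row objects the repeated list really holds, and shows one common way to get independent rows (a list comprehension). The names shared, fixed, and distinct_ids are illustrative only.

```python
m, n = 4, 2

# Repetition copies the pointer to the one inner list m times.
shared = [[None] * n] * m
distinct_ids = len({id(row) for row in shared})
print(distinct_ids)

# A comprehension evaluates [None] * n once per row, so each row is a new list.
fixed = [[None] * n for _ in range(m)]
fixed[1][1] = 42
print(fixed)
```

Repeating [None] * n is harmless because None is immutable; the surprise only appears when the repeated element is itself mutable.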

Object Structure

what is the ob_type?

The answer: flexible type system

    class Machine(type): pass
    # Toy = Machine(foo, bar, hoge)
    class Toy(metaclass=Machine): pass
    toy = Toy()
    # Toy, Machine, type, type
    print(type(toy), type(Toy), type(Machine), type(type))

[Figure: every object starts with ob_refcnt and ob_type; the ob_type pointers chain toy -> Toy -> Machine -> Type.]
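The ob_type chain can be walked from Python by calling type() repeatedly. The helper type_chain below is a hypothetical name introduced for this sketch; it stops once it reaches a type that is its own type, which in CPython is type itself.

```python
def type_chain(obj):
    """Follow type() (ob_type at the C level) until a type is its own type."""
    chain = []
    current = type(obj)
    while True:
        chain.append(current)
        if type(current) is current:
            break
        current = type(current)
    return chain

class Machine(type):
    pass

class Toy(metaclass=Machine):
    pass

toy_chain = type_chain(Toy())
int_chain = type_chain(42)
print([t.__name__ for t in toy_chain])
print([t.__name__ for t in int_chain])
```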

Object Structure

what is the ob_type?

    # ob_type2
    # 10fd69490 - 10fd69490 - 10fd69490
    print("%x - %x - %x" % (id(42 .__class__), id(233 .__class__), id(int)))

    assert dict().__class__ is dict

    # dynamically create a class named "MagicKlass"
    klass = "MagicKlass"
    klass = type(klass, (object,), {"quack": lambda _: print("quack")})
    duck = klass()
    # quack
    duck.quack()
    assert duck.__class__ is klass

Object Structure

what is the ob_type?

PyObject: ob_refcnt, ob_type

PyTypeObject (what ob_type points to): ob_refcnt, ob_type, tp_print, tp_getattr, *tp_as_number, *tp_as_sequence, *tp_as_mapping

*tp_as_number points to a table of slots: nb_add, nb_subtract

object and protocol:

the protocol

Protocol: duck-typing in typing

AOL

‣ Abstract Object Layer

When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck.

PyTypeObject: ob_refcnt, ob_type, tp_print, tp_getattr, *tp_as_number (nb_add, nb_subtract), *tp_as_sequence, *tp_as_mapping

Object Protocol
    int PyObject_Print(PyObject *o, FILE *fp, int flags)
    int PyObject_HasAttr(PyObject *o, PyObject *attr_name)
    int PyObject_DelAttr(PyObject *o, PyObject *attr_name)
    …

Number Protocol
    PyObject* PyNumber_Add(PyObject *o1, PyObject *o2)
    PyObject* PyNumber_Multiply(PyObject *o1, PyObject *o2)
    PyObject* PyNumber_FloorDivide(PyObject *o1, PyObject *o2)
    …

Sequence Protocol
    PyObject* PySequence_Concat(PyObject *o1, PyObject *o2)
    PyObject* PySequence_Repeat(PyObject *o, Py_ssize_t count)
    PyObject* PySequence_GetItem(PyObject *o, Py_ssize_t i)
    …

Iterator Protocol
    int PyIter_Check(PyObject *o)
    PyObject* PyIter_Next(PyObject *o)

Buffer Protocol
    int PyObject_GetBuffer(PyObject *exporter, Py_buffer *view, int flags)
    void PyBuffer_Release(Py_buffer *view)
    int PyBuffer_IsContiguous(Py_buffer *view, char order)
    …

Mapping Protocol
    int PyMapping_HasKey(PyObject *o, PyObject *key)
    PyObject* PyMapping_GetItemString(PyObject *o, const char *key)
    int PyMapping_SetItemString(PyObject *o, const char *key, PyObject *v)
    …
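Each protocol has a rough Python-level counterpart, which makes the duck-typing idea easy to try out. The sketch below uses the standard operator module plus builtins; the pairing with the C functions in the comments is my reading of how CPython wires them up and should be taken as an approximation, not a specification. All variable names are illustrative.

```python
import operator

# Number protocol: roughly PyNumber_Add
total = operator.add(1, 2)

# Sequence protocol: roughly PySequence_Concat and PySequence_GetItem
joined = operator.concat([1], [2])
first = operator.getitem("abc", 0)

# The same operation is refused when the object does not "quack" like a sequence.
try:
    operator.concat(1, 2)
    concat_rejected = False
except TypeError:
    concat_rejected = True

# Iterator protocol: roughly PyIter_Next
it = iter([10, 20])
nxt = next(it)

# Buffer protocol: memoryview asks the exporter for its buffer and releases it.
view = memoryview(b"abc")
byte0 = view[0]
view.release()

# Mapping protocol: key lookup on a mapping
has_key = "k" in {"k": 1}

print(total, joined, first, concat_rejected, nxt, byte0, has_key)
```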

Example

‣ Number Protocol (PyNumber_Add)

    // v + w?
    PyObject *PyNumber_Add(PyObject *v, PyObject *w)
    {
        // this is just an example!

        // try the slot on v's type
        result = v->ob_type->tp_as_number->nb_add(v, w)

        // if that fails, or if w->ob_type is a subclass of v->ob_type,
        // try the slot on w's type (operands stay in their original order)
        result = w->ob_type->tp_as_number->nb_add(v, w)

        // return result
    }

takeaway: typeobject stores meta information
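The same dispatch can be imitated at the Python level with __add__ and __radd__. This is a simplified sketch, not CPython's actual algorithm: the function binary_add is a hypothetical name, it skips the check that a subclass really overrides the reflected method, and it ignores the fallback to sequence concatenation.

```python
def binary_add(v, w):
    """Simplified model of how v + w picks a method from the operand types."""
    tv, tw = type(v), type(w)
    add = getattr(tv, "__add__", None)
    # The reflected method is only relevant when the types differ.
    radd = getattr(tw, "__radd__", None) if tw is not tv else None

    # A subclass on the right gets the first chance.
    if radd is not None and issubclass(tw, tv):
        result = radd(w, v)
        if result is not NotImplemented:
            return result
        radd = None

    # Try the left operand's type.
    if add is not None:
        result = add(v, w)
        if result is not NotImplemented:
            return result

    # Fall back to the right operand's type.
    if radd is not None:
        result = radd(w, v)
        if result is not NotImplemented:
            return result

    raise TypeError(f"unsupported operand types: {tv.__name__} and {tw.__name__}")

int_sum = binary_add(1, 2)
mixed_sum = binary_add(1, 2.5)   # int declines, float's reflected method answers

try:
    binary_add(1, None)
    failed = False
except TypeError:
    failed = True

print(int_sum, mixed_sum, failed)
```

In the mixed case int.__add__ returns NotImplemented, so the lookup moves on to the method stored on the other operand's type, which is the point of the takeaway: the behaviour lives on the type object.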

More Example

Why can we multiply a list? Is it slow?

    arr = [None] * 3
    # [None, None, None]

Exercise:

    arr = [None] + [None]
    # [None, None]
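One way to explore both questions from Python, as a hedged sketch: int declines to multiply a list, so the list type's own repeat logic handles it, and the repetition copies references rather than elements. The names row, arr, left, and right are illustrative.

```python
# The number side gives up on a list operand...
left = int.__mul__(3, [None])
print(left)

# ...and the list type handles the repetition itself.
right = list.__rmul__([None], 3)
print(right)

# Repetition stores the same reference several times; nothing is deep-copied.
row = []
arr = [row] * 3
same = all(item is row for item in arr)
print(same)
```

Since only pointers are copied, the cost grows with the length of the result, not with the size of the repeated elements.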

Magic Methods

access slots of tp_as_number, and its friends

Note that tp_as_mapping->mp_length and tp_as_sequence->sq_length map to the same slot, __len__.

If your C-based MyType implements both, what are MyType.__len__ and len(MyType())?

    # access magic methods of dict and list
    dict.__getitem__  # tp_as_mapping->mp_subscript
    dict.__len__      # tp_as_mapping->mp_length

    list.__getitem__  # tp_as_sequence->sq_item
    list.__len__      # tp_as_sequence->sq_length
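For built-in types, the magic method you get back is a thin wrapper around the C slot. A small sketch to observe this, assuming CPython (the wrapper type name is an implementation detail); the variable names are made up for the example.

```python
# The attribute found on the type is a wrapper object around the C slot.
len_kind = type(list.__len__).__name__
add_kind = type(int.__add__).__name__
print(len_kind, add_kind)

# Calling the wrapper and calling the builtin reach the same slot.
via_wrapper = list.__len__([1, 2, 3])
via_builtin = len([1, 2, 3])
print(via_wrapper, via_builtin)

dict_len = dict.__len__({"a": 1, "b": 2})
print(dict_len)
```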

Magic Methods

backfill as_number and its friends

    class A():
        def __len__(self): return 42
    class B(): pass
    # 42
    print(len(A()))
    # TypeError: object of type 'B' has no len()
    print(len(B()))

    Py_ssize_t
    PyObject_Size(PyObject *o)
    {
        PySequenceMethods *m;

        if (o == NULL) {
            null_error();
            return -1;
        }

        m = o->ob_type->tp_as_sequence;
        if (m && m->sq_length)
            return m->sq_length(o);

        return PyMapping_Size(o);
    }

Which field does A.__len__ fill?
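The backfill happens on the type, not on the instance, and it also happens when the method is attached after the class was created. The sketch below demonstrates both; length_or_error is a hypothetical helper that returns the error message instead of raising, and the classes B and C exist only for this example.

```python
def length_or_error(obj):
    """Return len(obj), or the TypeError message if the type has no length slot."""
    try:
        return len(obj)
    except TypeError as exc:
        return str(exc)

class B:
    pass

before = length_or_error(B())

# Assigning __len__ on the class fills the length slot of the type.
B.__len__ = lambda self: 42
after = length_or_error(B())

class C:
    pass

# Putting __len__ on an instance does not touch the type, so len() ignores it.
c = C()
c.__len__ = lambda: 7
on_instance = length_or_error(c)

print(before, after, on_instance)
```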

Next: Heterogeneous

Have you ever felt insecure about negative indexing of PyListObject?

The answer: RTFSC

    words = "the quick brown fox jumps over the old lazy dog".split()
    assert words[-1] == "dog"
    words.insert(-100, "hayabusa")
    assert words[-100] == ??
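Short of reading the source, the behaviour can also be observed directly. This sketch shows that insert and indexing treat an out-of-range negative index differently; the names position and raised are illustrative.

```python
words = "the quick brown fox jumps over the old lazy dog".split()

# insert() clamps an out-of-range index instead of failing.
words.insert(-100, "hayabusa")
position = words.index("hayabusa")
print(position, len(words))

# Indexing does no clamping: the same index is simply out of range.
try:
    words[-100]
    raised = False
except IndexError:
    raised = True
print(raised)
```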

Thanks
