a smalltalk on Object and Protocol in CPython
shiyao.ma <[email protected]>
May. 4th
Why this
‣ Python is great for carrying out research experiments. (This lays the foundation for why I discuss Python.)
‣ Life is short. You need Python. (This lays the foundation for why people like Python.)
Life is neither a short nor a long, just a (signed) int, 31 bits at most, I say.
Takeaway
‣ Understand the inter-relation among {.py, .pyc, .c} files.
‣ Understand that everything in Python is an object.
‣ Understand how functions on TypeObject affect InstanceObject.
CPython Overview
‣ First implemented in Dec. 1989 by GvR, the BDFL
‣ Serves as the reference implementation. Alternative implementations:
  ‣ IronPython (clr)
  ‣ Jython (jvm)
  ‣ Brython (v8, spider) [no kidding]
‣ Written in ANSI C.
  ‣ flexible language binding
  ‣ embedding (libpython), e.g., openwrt, etc.
CPython Overview
‣ Code maintained by Mercurial.
‣ source: https://hg.python.org/cpython/
‣ Build toolchain is autoconf (on *nix)
    ./configure --with-pydebug && make -j2
CPython Overview
‣ Structure
    cpython/
        configure.ac
        Doc
        Grammar
        Include
        Lib
        Mac
        Modules
        Objects
        Parser
        Programs
        Python
CPython Overview
‣ execution lifetime
[Diagram: PY → Parser → PY[CO] → VM]
    x = 1 + 2
    LOAD 1
    LOAD 2
    ADD
    LOAD X
    STORE X
STACK: [1] → [1, 2] → [3] → [3, x]
takeaway: py/pyc/c inter-relation
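The pipeline above can be observed from Python itself. The following is a minimal sketch, not the slide's own demo: it uses the standard `dis` module to list the instructions the VM would run. The function `add` and the file name `<demo>` are made up for illustration, and variables are used instead of the literals `1 + 2` because CPython typically folds constant expressions at compile time. Real opcode names differ from the simplified LOAD/ADD/STORE on the slide and vary between CPython versions.

```python
import dis


def add(a, b):
    # Operands are variables so the addition survives to runtime.
    x = a + b
    return x


# Names of the bytecode instructions compiled for add().
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)

# compile() yields the code object that a .pyc file would cache;
# exec() hands that code object to the VM.
code = compile("x = 1 + 2", "<demo>", "exec")
namespace = {}
exec(code, namespace)
print(namespace["x"])
```

Running `dis.dis(add)` prints the same instructions in a human-readable table.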
Object
object: memory of a C structure with a common header
[Diagram: PyObject = ob_refcnt, ob_type; PyVarObject = ob_refcnt, ob_type, ob_size]
Examples: PyListObject, PyDictObject, PyTupleObject, PySyntaxErrorObject, PyImportErrorObject, …
takeaway: everything is an object
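The common header is visible from Python, at least indirectly. In this small sketch (the sample values are arbitrary), `type(obj)` is the Python-level view of `ob_type`, and `sys.getrefcount(obj)` reports `ob_refcnt` plus the temporary reference held by the call itself. Exact counts are implementation details and vary between CPython versions.

```python
import sys

# Very different kinds of values: numbers, containers, an exception,
# a builtin function, and even a type.
samples = [42, "spam", [1, 2], {"k": "v"}, (1,), SyntaxError("oops"), len, int]

header_view = []
for obj in samples:
    # Every one of them answers the same two header questions.
    header_view.append((type(obj).__name__, sys.getrefcount(obj)))

for type_name, refcount in header_view:
    print(f"{type_name:>28}  refcount={refcount}")
```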
Object Structure
Will PyLongObject overflow?
The answer: chunk-chunk
[Diagram: PyLongObject = ob_refcnt, ob_type, ob_size, digit ob_digit[1], followed by digit[2], digit[3], …, digit[n]]
    typedef PY_UINT32_T digit;
    result = PyObject_MALLOC(offsetof(PyLongObject, ob_digit) + size*sizeof(digit));

    n = 2 ** 64  # more bits than a word
    assert type(n) is int and n > 0
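The "chunk-chunk" idea can be imitated in pure Python. This is a sketch of the representation only, not of CPython's actual arithmetic: the helpers `to_digits` and `from_digits` are hypothetical names, and the chunk width is read from `sys.int_info.bits_per_digit` rather than assumed.

```python
import sys

bits = sys.int_info.bits_per_digit  # width of one digit on this build
base = 1 << bits


def to_digits(n):
    """Split a non-negative int into base-2**bits chunks, least significant first."""
    digits = []
    while True:
        n, low = divmod(n, base)
        digits.append(low)
        if n == 0:
            return digits


def from_digits(digits):
    """Recombine the chunks into a single int."""
    total = 0
    for d in reversed(digits):
        total = total * base + d
    return total


n = 2 ** 64
chunks = to_digits(n)
print(bits, chunks, sys.getsizeof(1), sys.getsizeof(n))
```

A bigger value just needs more chunks, which is why `sys.getsizeof` grows with the magnitude of the integer instead of the value overflowing.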
Object Structure
Why won't my multi-dimensional array work?
The answer: indirection, mutability
[Diagram: the outer PyListObject (ob_refcnt, ob_type, ob_size, ob_item, allocated) holds an array of PyObject*; every entry points to the same inner PyListObject, whose items are None and 42 (stored via PyList_SetItem)]
    m, n = 4, 2
    arr = [ [ None ] * n ] * m
    arr[1][1] = 42
    # [[None, 42], [None, 42], [None, 42], [None, 42]]
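The aliasing shown above, and one common way around it, can be sketched as follows. The variable names `shared` and `independent` are illustrative only.

```python
m, n = 4, 2

# Outer repetition copies the *pointer* to one inner list m times.
shared = [[None] * n] * m
shared[1][1] = 42

# A comprehension evaluates [None] * n once per row,
# producing m distinct inner lists.
independent = [[None] * n for _ in range(m)]
independent[1][1] = 42

print(shared)
print(independent)
```

The inner `[None] * n` is harmless because `None` is immutable; only sharing a mutable object between rows causes the surprise.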
Object Structure
what is the ob_type?
The answer: flexible type system
    class Machine(type): pass
    # Toy = Machine(foo, bar, hoge)
    class Toy(metaclass=Machine): pass
    toy = Toy()
    # Toy, Machine, type, type
    print(type(toy), type(Toy), type(Machine), type(type))
[Diagram: each object has ob_refcnt and ob_type; the ob_type pointers form the chain toy → Toy → Machine → Type]
Object Structure
what is the ob_type?
    # ob_type
    # 10fd69490 - 10fd69490 - 10fd69490
    print("%x - %x - %x" % (id(42 .__class__), id(233 .__class__), id(int)))

    assert dict().__class__ is dict

    # dynamically create a class named "MagicKlass"
    klass = "MagicKlass"
    klass = type(klass, (object,), {"quack": lambda _: print("quack")})
    duck = klass()
    # quack
    duck.quack()
    assert duck.__class__ is klass
Object Structure
what is the ob_type?
[Diagram: PyObject (ob_refcnt, ob_type, …); its ob_type points to a PyTypeObject (ob_refcnt, ob_type, …, tp_print, …, tp_getattr, …, *tp_as_number, *tp_as_sequence, *tp_as_mapping, …); *tp_as_number points to a table of slots (nb_add, …, nb_subtract, …)]
AOL
‣ Abstract Object Layer
When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck.
Object Protocol
    int PyObject_Print(PyObject *o, FILE *fp, int flags)
    int PyObject_HasAttr(PyObject *o, PyObject *attr_name)
    int PyObject_DelAttr(PyObject *o, PyObject *attr_name)
    …
Number Protocol
    PyObject* PyNumber_Add(PyObject *o1, PyObject *o2)
    PyObject* PyNumber_Multiply(PyObject *o1, PyObject *o2)
    PyObject* PyNumber_FloorDivide(PyObject *o1, PyObject *o2)
    …
Sequence Protocol
    PyObject* PySequence_Concat(PyObject *o1, PyObject *o2)
    PyObject* PySequence_Repeat(PyObject *o, Py_ssize_t count)
    PyObject* PySequence_GetItem(PyObject *o, Py_ssize_t i)
    …
Iterator Protocol
    int PyIter_Check(PyObject *o)
    PyObject* PyIter_Next(PyObject *o)
Buffer Protocol
    int PyObject_GetBuffer(PyObject *exporter, Py_buffer *view, int flags)
    void PyBuffer_Release(Py_buffer *view)
    int PyBuffer_IsContiguous(Py_buffer *view, char order)
    …
Mapping Protocol
    int PyMapping_HasKey(PyObject *o, PyObject *key)
    PyObject* PyMapping_GetItemString(PyObject *o, const char *key)
    int PyMapping_SetItemString(PyObject *o, const char *key, PyObject *v)
    …
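From the Python side, the same duck-typing idea looks like this. The sketch defines a hypothetical `Deck` class that inherits from nothing special and only provides a few magic methods; generic operations then work on it because they go through the abstract layer rather than checking for a concrete type. The pairing of each call with a C function in the comments is a rough correspondence, not an exact call trace.

```python
import operator


class Deck:
    """A hypothetical container that only implements the dunder methods."""

    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        return len(self._cards)

    def __getitem__(self, index):
        return self._cards[index]

    def __iter__(self):
        return iter(self._cards)

    def __add__(self, other):
        return Deck(self._cards + list(other))


deck = Deck(["A", "K"])

bigger = operator.add(deck, ["Q"])   # roughly PyNumber_Add
first = operator.getitem(deck, 0)    # roughly PyObject_GetItem
size = len(bigger)                   # roughly PyObject_Size
cards = list(iter(bigger))           # roughly PyObject_GetIter + PyIter_Next
has_cards = hasattr(deck, "_cards")  # roughly PyObject_HasAttr

print(first, size, cards, has_cards)
```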
Example
‣ Number Protocol (PyNumber_Add)
    // v + w?
    PyObject *PyNumber_Add(PyObject *v, PyObject *w)
    {
        // this is just an example!
        // try on v
        result = v->ob_type->tp_as_number->nb_add(v, w);
        // if fail or if w->ob_type is a subclass of v->ob_type
        result = w->ob_type->tp_as_number->nb_add(w, v);
        // return result
    }
takeaway: typeobject stores meta information
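The dispatch order outlined for PyNumber_Add can be watched from Python through `__add__` and `__radd__`. This is a sketch with made-up classes (`Left`, `Right`, `Sub`) that record which method ran; it illustrates the observable behaviour, not the C implementation.

```python
calls = []


class Left:
    def __add__(self, other):
        calls.append("Left.__add__")
        return NotImplemented  # "fail": let the right operand try


class Right:
    def __radd__(self, other):
        calls.append("Right.__radd__")
        return "handled by Right"


class Sub(Left):
    # A subclass of the left operand's type that overrides the
    # reflected method gets asked first.
    def __radd__(self, other):
        calls.append("Sub.__radd__")
        return "handled by Sub"


result_unrelated = Left() + Right()
order_unrelated = list(calls)

calls.clear()
result_subclass = Left() + Sub()
order_subclass = list(calls)

print(order_unrelated, result_unrelated)
print(order_subclass, result_subclass)
```

With unrelated types the left operand is tried first and the right operand is the fallback; with a subclass on the right, the subclass takes priority.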
More Example
Why can we multiply a list? Is it slow?
    arr = [None] * 3
    # [None, None, None]
Exercise:
    arr = [None] + [None]
    # [None, None]
Magic Methods
access slots of tp_as_number, and its friends
Note: tp_as_mapping->mp_length and tp_as_sequence->sq_length map to the same slot, __len__.
If your C-based MyType implements both, what are MyType.__len__ and len(MyType())?
    # access magic method of dict and list
    dict.__getitem__  # tp_as_mapping->mp_subscript
    dict.__len__      # tp_as_mapping->mp_length

    list.__getitem__  # tp_as_sequence->sq_item
    list.__len__      # tp_as_sequence->sq_length
Magic Methods
backfill as_number and its friends
    class A():
        def __len__(self):
            return 42
    class B():
        pass
    # 42
    print(len(A()))
    # TypeError: object of type 'B' has no len()
    print(len(B()))

    Py_ssize_t
    PyObject_Size(PyObject *o)
    {
        PySequenceMethods *m;

        if (o == NULL) {
            null_error();
            return -1;
        }

        m = o->ob_type->tp_as_sequence;
        if (m && m->sq_length)
            return m->sq_length(o);

        return PyMapping_Size(o);
    }
Which field does A.__len__ fill?
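Because PyObject_Size reads the slot from `o->ob_type`, only a `__len__` defined on the type matters, never one stored on the instance. A small sketch of that consequence (the lambdas are placeholders for illustration):

```python
class B:
    pass


b = B()

# 1) Put __len__ on the instance: the type's slots are untouched,
#    so len() still fails.
b.__len__ = lambda: 1
try:
    len(b)
    instance_attr_used = True
except TypeError:
    instance_attr_used = False

# 2) Put __len__ on the type: the slot is backfilled and every
#    existing instance picks it up.
B.__len__ = lambda self: 42
size_after_patch = len(b)

print(instance_attr_used, size_after_patch)
```

This is the "functions on TypeObject affect InstanceObject" takeaway in action: `b` was created before `B.__len__` existed, yet `len(b)` works afterwards.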
Next: Heterogeneous
Have you ever felt insecure about negative indexing of PyListObject?
The answer: RTFSC
    words = "the quick brown fox jumps over the old lazy dog".split()
    assert words[-1] == "dog"
    words.insert(-100, "hayabusa")
    assert words[-100] == ??