Writing Fast Code, PyCon JP 2015

Younggun, Kim http://younggun.kim

@scari_net / scari

Badass Alien @ District 9, SMARTSTUDY

http://pinkfong.com

PyCon Korea Organizer (http://pycon.kr). We'll host PyCon APAC 2016 in Seoul.

How I Think My Code Runs

Movie - The Good The Bad The Weird, 2008

How My Code Really Runs

The Killers : All These Things That I’ve Done M/V https://youtu.be/sZTpLvsYYHw

Objective

1. Understanding How a Computer Works

2. How to Use a Profiler

But why?

Say thousands of people use your code every day. If you cut 1 second from each run, you save the human race over 4 days of wasted time per year.
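The arithmetic behind that claim can be checked with a short sketch. The figure of 1,000 users per day is an assumption (the slide only says "thousands"), so treat the result as a lower bound.

```python
SECONDS_PER_DAY = 24 * 60 * 60

users_per_day = 1_000        # assumption: the low end of "thousands"
seconds_saved_per_run = 1    # from the slide

# One run per user per day, every day of the year.
seconds_saved_per_year = users_per_day * seconds_saved_per_run * 365
days_saved_per_year = seconds_saved_per_year / SECONDS_PER_DAY

print(round(days_saved_per_year, 2))  # 4.22
```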

See How a Computer Works, and How Fast the Computer and Its Peripherals Are

I/O >> 4D Wall >> Memory

Link              Bit rate
Morse Code        ≈ 21 bps
Modem (2400)      ≈ 2400 bps
CDMA (2G)         ≈ 153 kbit/s
HSPA (3G, DL)     ≈ 13.98 Mbit/s
LTE*              ≈ 100 Mbit/s
USB 2.0           ≈ 480 Mbit/s
802.11n           ≈ 600 Mbit/s
USB 3.0           ≈ 3 Gbit/s
SATA 3.0          ≈ 6 Gbit/s
Thunderbolt 2     ≈ 20 Gbit/s
DDR2 1066MHz      ≈ 64 Gbit/s
DDR3 1600MHz      ≈ 102.4 Gbit/s

https://en.wikipedia.org/wiki/List_of_device_bit_rates
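To get a feel for these bit rates, here is a rough sketch that computes how long a fixed payload would take on a few of the links. The 1 GB payload size is an arbitrary assumption for illustration, and the calculation ignores protocol overhead and latency.

```python
# Bit rates taken from the slide, in bits per second.
RATES_BITS_PER_SEC = {
    "Modem (2400)": 2400,
    "USB 2.0": 480e6,
    "SATA 3.0": 6e9,
    "DDR3 1600MHz": 102.4e9,
}

PAYLOAD_BITS = 8e9  # assumption: 1 GB (10**9 bytes) as an example payload

# Ideal transfer time: payload size divided by raw bit rate.
seconds = {name: PAYLOAD_BITS / rate for name, rate in RATES_BITS_PER_SEC.items()}

for name, secs in seconds.items():
    print(f"{name:14s} {secs:14.3f} s")
```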

Yes! Memory is blazing fast! (Really?)

Link                      Bandwidth
DDR3 1600MHz              ≈ 12.8 GB/s
FSB 400 (old Xeon)        ≈ 12.8 GB/s
PCI Express 3.0 (x16)     ≈ 16 GB/s
QuickPath Interconnect    ≈ 38.4 GB/s
HyperTransport 3.1        ≈ 51.2 GB/s
L3 Cache (i7-4790X)       ≈ 170 GB/s
L2 Cache (i7-4790X)       ≈ 308 GB/s

Nope!

A Computer Knows Only 0 and 1

00100000001000100000000101011110

Like This

001000 00001 00010 0000000101011110

opcode | addr 1 | addr 2 | value

MIPS32 Add Immediate instruction (ADDI)

addi $r2, $r1, 350

$r2 = $r1 + 350
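As a minimal sketch, the 32-bit word from the slide can be split into its four fields with shifts and masks. The variable names `reg_a` and `reg_b` are made up here for the two 5-bit register fields, and the immediate is read as unsigned, which is fine for this positive value.

```python
word = 0b00100000001000100000000101011110  # the 32-bit word shown on the slide

opcode = (word >> 26) & 0x3F   # top 6 bits
reg_a = (word >> 21) & 0x1F    # next 5 bits: first register field
reg_b = (word >> 16) & 0x1F    # next 5 bits: second register field
value = word & 0xFFFF          # low 16 bits: the immediate value

print(f"{opcode:06b}", reg_a, reg_b, value)  # 001000 1 2 350
```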

A Computer Executes These Instructions on a Per-Clock Basis

Clock (Hz)

1Hz

Operation                  Time at a 1Hz clock
L1 Cache Access            3s
L2 Cache Access            9s
L3 Cache Access            43s
RAM Access                 6m
SSD I/O                    2-6 days
HDD I/O                    1-12 months
Internet: Tokyo to SF      12 years
Run IPython (0.6s)         63 years
Reboot (5m)                32,000 years!!
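The scaled figures can be roughly reproduced by counting how many clock cycles a real duration spans and then treating each cycle as one second. The 3.3 GHz clock speed below is an assumption (the slides do not state one); it was chosen because it lands close to the slide's numbers.

```python
CLOCK_HZ = 3.3e9  # assumption: clock speed is not stated on the slide
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def scaled_years(real_seconds):
    """Years the same work would take if each clock cycle lasted one second."""
    cycles = real_seconds * CLOCK_HZ
    return cycles / SECONDS_PER_YEAR

ipython_years = scaled_years(0.6)     # "Run IPython (0.6s)"
reboot_years = scaled_years(5 * 60)   # "Reboot (5m)"

print(round(ipython_years), round(reboot_years))
```

With this assumed clock, the results come out at about 63 years and a little over 31,000 years, in line with the slide.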

Hey! This is PyCon!

How Do You Know Python Works?

Neon Genesis Evangelion

That's what dis is for

Neon Genesis Evangelion

Disassembles Python Code into CPython Bytecode to Support Analysis

dis module

https://github.com/python/cpython/blob/master/Include/opcode.h

• line # of source
• op addr / instruction
• param
• annotations
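A small sketch of the dis module in use. The function `add_350` is a made-up example, and the exact opcode names in the output vary between CPython versions.

```python
import dis

def add_350(x):
    # Hypothetical example function, echoing the ADDI example earlier.
    return x + 350

# Prints one row per bytecode instruction: source line number,
# offset, opcode name, argument, and a human-readable annotation.
dis.dis(add_350)

# The same information is available programmatically.
opnames = [ins.opname for ins in dis.get_instructions(add_350)]
print(opnames)
```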

An Empty List Creation

[] vs list()
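One way to see the difference is to compare the bytecode of the two expressions. This is a sketch using a hypothetical helper, `opnames`; the exact opcode names depend on the CPython version, but the literal builds the list directly while `list()` has to look up a name and make a call.

```python
import dis

def opnames(source):
    """Return the opcode names CPython emits for a single expression."""
    code = compile(source, "<demo>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]

literal_ops = opnames("[]")
call_ops = opnames("list()")

print("[]    ->", literal_ops)
print("list() ->", call_ops)
```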

Dictionary

{} vs dict()
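The cost difference can also be measured rather than read from bytecode. A rough sketch with timeit follows; absolute timings depend on the machine and Python version, so no expected numbers are given.

```python
import timeit

RUNS = 1_000_000  # arbitrary repeat count for the measurement

literal_time = timeit.timeit("{}", number=RUNS)
call_time = timeit.timeit("dict()", number=RUNS)

print(f"{{}}:     {literal_time:.3f}s")
print(f"dict(): {call_time:.3f}s")
```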

Find an element in a list

using for-loop vs in
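A sketch of the two approaches side by side. The function names and the test data are made up for illustration; the `in` operator runs the scan inside the interpreter's C code, while the for-loop executes bytecode for every element.

```python
import timeit

def find_with_loop(items, target):
    # Explicit scan: every iteration runs several bytecode instructions.
    for item in items:
        if item == target:
            return True
    return False

def find_with_in(items, target):
    # Membership test: the scan happens inside the list implementation.
    return target in items

data = list(range(10_000))
target = 9_999  # worst case: the last element

loop_time = timeit.timeit(lambda: find_with_loop(data, target), number=200)
in_time = timeit.timeit(lambda: find_with_in(data, target), number=200)

print(f"for-loop: {loop_time:.4f}s  in: {in_time:.4f}s")
```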

A tool for dynamic program analysis that measures the space or time complexity of a program.

Profilers

• cProfile (profile)
• hotshot
• line_profiler
• memory_profiler
• yappi
• profiling
• pyinstrument
• plop
• pprofile

cProfile

• Built-in profiling tool
• Hooks into the CPython VM
• Introduces a little overhead

https://docs.python.org/3.5/library/profile.html

cProfile

python -m cProfile python_code.py
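Besides the command line, cProfile can be driven from inside a program. A minimal sketch, where `slow_sum` is a made-up workload and not code from the talk:

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # Hypothetical workload to give the profiler something to measure.
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Collect the report into a string, sorted by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```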

line_profiler

• Profiles on a line-by-line basis
• Uses a decorator (@profile) to mark the chosen function
• Introduces greater overhead

https://github.com/rkern/line_profiler
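A sketch of what a script prepared for line_profiler might look like. The function `sum_of_squares` is a made-up example. When the script is run through `kernprof -l -v`, the `profile` decorator is injected into builtins; the fallback below is an added convenience so the same file still runs under plain `python`.

```python
try:
    profile  # defined in builtins when run via `kernprof -l`
except NameError:
    def profile(func):
        # Fallback no-op decorator for runs without kernprof.
        return func

@profile
def sum_of_squares(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    print(sum_of_squares(1_000))
```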

profiling

• Interactive Python profiler inspired by the Unity3D Profiler
• Keeps the call stack
• Live profiling
• Only supports Linux

https://github.com/what-studio/profiling

https://github.com/sublee/pyconkr2015-profiling-resources/blob/master/continuous.gif

fibonachicken

Use a Profiler with Real Code

Korean fried chicken is served as one whole chicken (not in pieces).

And it's quite complex to determine how many chickens would be enough for N people.

The problem can be solved easily using Fibonacci numbers.

1 1 2 3 5 8 13 21 34 …

If the number of people is the Nth Fibonacci number, then the (N-1)th Fibonacci number of chickens would be perfect.
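The rule for a Fibonacci number of people can be sketched as follows. The helper names are made up, and this iterative `fib` is only an illustration, not the implementation profiled in the talk.

```python
def fib(n):
    """Return the n-th Fibonacci number, with fib(1) == fib(2) == 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def chickens_for_nth_fib_people(n):
    """If fib(n) people show up, order fib(n - 1) chickens."""
    return fib(n - 1)

# 13 people is the 7th Fibonacci number, so order the 6th: 8 chickens.
print(fib(7), chickens_for_nth_fib_people(7))  # 13 8
```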

Awesome idea! But how do you get enough chicken if the number of people is not a Fibonacci number?

Apply Zeckendorf's theorem, which is about representing an integer as a sum of Fibonacci numbers.

https://en.wikipedia.org/wiki/Zeckendorf's_theorem
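A sketch of how Zeckendorf's theorem might be applied here. The greedy decomposition is the standard way to obtain the representation. Summing the preceding Fibonacci number for each term is an assumption about how the talk combines the two ideas, and all function names are made up.

```python
def fib_upto(n):
    """Distinct Fibonacci numbers 1, 2, 3, 5, ... not exceeding max(n, 2)."""
    fibs = [1, 2]
    while fibs[-1] + fibs[-2] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    return fibs

def zeckendorf(n):
    """Greedy decomposition: keep taking the largest Fibonacci number that fits."""
    parts = []
    for f in reversed(fib_upto(n)):
        if f <= n:
            parts.append(f)
            n -= f
    return parts

def chickens(people):
    """Assumed rule: each Fibonacci term contributes the Fibonacci number before it."""
    fibs = fib_upto(people)
    previous = {1: 1}                        # 1 person still needs 1 chicken
    previous.update(zip(fibs[1:], fibs[:-1]))
    return sum(previous[part] for part in zeckendorf(people))

print(zeckendorf(100), chickens(100))  # [89, 8, 3] 62
```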

Center the target, then profile

cProfile

python -m cProfile fibonachicken.py

line_profiler

kernprof -l -v fibonachicken.py

Both fib() and is_fibonacci() are bottlenecks. We should replace them with better implementations.

Hypothesis #1

Improving fib() could result in better performance

Binet’s Formula

https://en.wikipedia.org/wiki/Jacques_Philippe_Marie_Binet
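Binet's formula gives the n-th Fibonacci number in closed form from the golden ratio. A minimal sketch, using the common rounding shortcut; the name `fib_binet` is made up, and because it relies on floating point it is only trustworthy for moderately small n (roughly up to 70 with 64-bit floats).

```python
import math

SQRT5 = math.sqrt(5)
PHI = (1 + SQRT5) / 2  # the golden ratio

def fib_binet(n):
    """Closed-form Fibonacci: round(phi**n / sqrt(5)). Float precision limits n."""
    return round(PHI ** n / SQRT5)

print([fib_binet(n) for n in range(1, 10)])  # [1, 1, 2, 3, 5, 8, 13, 21, 34]
```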

cProfile

Hypothesis #2

Can we rewrite is_fibonacci() so that it does not use fib() at all?

n is a Fibonacci number if and only if 5n² + 4 or 5n² - 4 is a perfect square

Gessel’s Formula

http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/Fibonacci/fibFormula.html
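The test above translates almost directly into code. A sketch, assuming Python 3.8+ for `math.isqrt`; the helper `is_perfect_square` is a made-up name, and this is not necessarily the version shown in the talk.

```python
import math

def is_perfect_square(x):
    """True if x is a non-negative integer square, using exact integer math."""
    if x < 0:
        return False
    root = math.isqrt(x)
    return root * root == x

def is_fibonacci(n):
    """n is a Fibonacci number iff 5*n*n + 4 or 5*n*n - 4 is a perfect square."""
    return is_perfect_square(5 * n * n + 4) or is_perfect_square(5 * n * n - 4)

print([n for n in range(1, 40) if is_fibonacci(n)])  # [1, 2, 3, 5, 8, 13, 21, 34]
```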

cProfile

Thanks!