
Astronomical Data Analysis with Python - Lecture 8

Yogesh Wadadekar

NCRA-TIFR

July-August 2010

Yogesh Wadadekar (NCRA-TIFR) Topical course 1 / 27


Slides available at:

http://www.ncra.tifr.res.in/~yogesh/python_course_2010/


Comments on assignment


SciPy - superset of numpy http://scipy.org

  • constants: physical constants and conversion factors (since version 0.7.0)
  • cluster: hierarchical clustering, vector quantization, K-means
  • fftpack: Discrete Fourier Transform algorithms
  • integrate: numerical integration routines
  • interpolate: interpolation tools
  • io: data input and output
  • lib: Python wrappers to external libraries
  • linalg: linear algebra routines
  • misc: miscellaneous utilities (e.g. image reading/writing)
  • optimize: optimization algorithms including linear programming
  • signal: signal processing tools
  • sparse: sparse matrix and related algorithms
  • spatial: KD-trees, nearest neighbors, distance functions


SciPy

  • special: special functions
  • stats: statistical functions
  • weave: tool for writing C/C++ code as Python multiline strings
  • ndimage: image processing

Via RPy, SciPy can interface with the R statistical package.
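
As a small illustration of two of the subpackages listed above, the sketch below reads a physical constant from scipy.constants and evaluates a definite integral with scipy.integrate. The choice of integrand (a Gaussian, whose integral over the real line is sqrt(pi)) is only an example and is not taken from the lecture.

```python
import math

from scipy import constants, integrate

# scipy.constants: physical constants in SI units
print("speed of light [m/s]:", constants.c)

# scipy.integrate: integrate exp(-x^2) over the whole real line.
# quad returns the value and an estimate of the absolute error.
value, abs_error = integrate.quad(lambda x: math.exp(-x * x),
                                  -math.inf, math.inf)

# The exact answer is sqrt(pi); print both for comparison
print(value, math.sqrt(math.pi))
```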


SciPy Cookbook

http://www.scipy.org/Cookbook

This page hosts worked examples of commonly-done tasks. Some are introductory in nature, while others are quite advanced (these are at the bottom of the page).


power of matplotlib

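    import numpy as np
    import matplotlib.pyplot as plt

    # Draw 100000 points from a 2-D standard normal distribution
    x, y = np.random.randn(2, 100000)
    H, xedges, yedges = np.histogram2d(x, y, bins=50)
    extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
    # histogram2d puts x along the first axis, so transpose for imshow
    plt.imshow(H.T, extent=extent, origin='lower')
    plt.show()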


Three main reasons to extend Python

1. Speed.
2. To leverage existing libraries that are not already wrapped.
3. To allow Python programs to access devices at absolute memory addresses, using C functions as an intermediary.

If none of these reasons applies, you can happily code only in Python.


Questions to answer before trying to extend Python

  • Is the compiled-language code you need small enough to recode in Python? E.g. a Numerical Recipes routine that requires perhaps 3-4 other routines should be easily rewritable in Python.
  • If the code is large, check whether it already has a Python interface, e.g. LAPACK.
  • If the code in question is in Octave, IDL, MATLAB or Mathematica and you have a licensed copy of the software, it will be easier to use the interfaces to those languages.
  • Do you really need to pass arrays, or are disk files a good way of exchanging data?
  • It is more efficient to wrap a set of related subroutines rather than just one (or a few). Can you collaborate to do this? Ask on the astropy mailing list.
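
The disk-file option mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the "external program" is a stand-in (a tiny script run with the current Python interpreter) that squares each number in its input file, and the names WORKER and square_via_files are hypothetical. With a real compiled executable, only the command passed to subprocess would change.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for a compiled executable (hypothetical): reads one number per
# line from the input file and writes its square to the output file.
WORKER = (
    "import sys\n"
    "with open(sys.argv[1]) as fin, open(sys.argv[2], 'w') as fout:\n"
    "    for line in fin:\n"
    "        fout.write('%r\\n' % (float(line) ** 2))\n"
)


def square_via_files(values):
    """Exchange data with an external program through disk files."""
    with tempfile.TemporaryDirectory() as tmp:
        infile = os.path.join(tmp, "input.txt")
        outfile = os.path.join(tmp, "output.txt")
        # Write the input data, one value per line
        with open(infile, "w") as f:
            for v in values:
                f.write("%r\n" % float(v))
        # Replace [sys.executable, "-c", WORKER] with the real program
        subprocess.check_call([sys.executable, "-c", WORKER, infile, outfile])
        # Read the results back
        with open(outfile) as f:
            return [float(line) for line in f]


squares = square_via_files([1.0, 2.5, -3.0])
print(squares)
```

No wrapper code is needed with this approach; the cost is the file I/O on every call, which matters mainly when the external routine is called many times.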


Numpy C extensions

http://www.scipy.org/Cookbook/C_Extensions/NumPy_arrays


SWIG

SWIG is a software development tool that connects programs written in C and C++ with a variety of high-level programming languages. SWIG is used with different types of languages such as Perl, PHP, Python, Tcl, Ruby, C#, Common Lisp (CLISP, Allegro CL, CFFI, UFFI), Java, Lua, Modula-3, OCAML, Octave and R. Also several interpreted and compiled Scheme implementations (Guile, MzScheme, Chicken) are supported. SWIG is most commonly used to create high-level interpreted or compiled programming environments, user interfaces, and as a tool for testing and prototyping C/C++ software.


The example.c file


The example.i interface file

To add these files to your favorite language, you need to write an “interface file”, which is the input to SWIG and describes the C functions we want to wrap.


Building the module

    $ swig -python example.i
    $ gcc -c example.c example_wrap.c -I/usr/include/python2.6
    $ ld -shared example.o example_wrap.o -o _example.so


Dynamic typing gives way to static typing

You can use an assert before the call to the external function to check the argument types.
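
A minimal sketch of this idea is shown below. The function gcd_wrapped is a pure-Python stand-in for a wrapped C function with the signature `int gcd(int x, int y)`; both that function and the wrapper name safe_gcd are hypothetical and only serve to show where the assert goes.

```python
def gcd_wrapped(x, y):
    """Stand-in for a wrapped C function `int gcd(int x, int y)` (hypothetical)."""
    while y:
        x, y = y, x % y
    return x


def safe_gcd(x, y):
    """Check argument types in Python before calling the external function."""
    # The C side is statically typed, so reject anything that is not an int
    assert isinstance(x, int) and isinstance(y, int), "gcd expects two integers"
    return gcd_wrapped(x, y)


result = safe_gcd(42, 105)
print(result)
```

Note that assert statements are skipped when Python runs with the -O option, so an explicit `raise TypeError(...)` may be preferable in production code.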


F2PY - Fortran to Python interface generator

Autogenerates interface files to allow Python to call a Fortran subroutine.


Installing f2py

Recent versions of numpy already include f2py. So, if you installed numpy, there is nothing more to do.


The Hello example

C File hello.f
      subroutine foo (a)
      integer a
      print*, "Hello from Fortran!"
      print*, "a=",a
      end

Of course, there may be multiple subroutines in hello.f, all of which will get wrapped in one go.


Build the module

    $ f2py -c -m hello hello.f

The __doc__ strings are autocreated.
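
Once the build succeeds, the module can be imported like any other. The sketch below assumes the extension module produced by the command above is on the import path; the helper name try_hello is hypothetical, and the function simply returns False when the module has not been built.

```python
import importlib


def try_hello(module_name="hello", a=4):
    """Call foo(a) from an f2py-built module, if it can be imported."""
    try:
        mod = importlib.import_module(module_name)
    except ImportError:
        return False  # module not built, or not on sys.path
    print(mod.foo.__doc__)  # docstring generated by f2py
    mod.foo(a)              # runs the Fortran subroutine foo
    return True
```

In an interactive session the same thing is simply `import hello` followed by `hello.foo(4)`.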


A more complicated example - fib3.f

Calculate the first n Fibonacci numbers.
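
A pure-Python reference for the same task can be handy for checking the output of a wrapped Fortran routine. The sketch below assumes that the sequence starts with 0 and 1 and that the values are stored as floats; both choices, and the function name fib, are assumptions rather than details taken from fib3.f.

```python
def fib(n):
    """Return the first n Fibonacci numbers as a list of floats."""
    a = [0.0] * n
    for i in range(n):
        if i == 0:
            a[i] = 0.0
        elif i == 1:
            a[i] = 1.0
        else:
            # Each later term is the sum of the two preceding terms
            a[i] = a[i - 1] + a[i - 2]
    return a


first_eight = fib(8)
print(first_eight)
```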


If you can’t edit the Fortran code

You can work with signature files, which play the same role as the interface files used for C.


Features of f2py

  • F2PY automatically generates __doc__ strings (and optionally LaTeX documentation) for extension modules.
  • F2PY generated functions accept arbitrary (but sensible) Python objects as arguments. The F2PY interface automatically takes care of type-casting.
  • Most Fortran 90 features also work.

Pyfort, a competitor to f2py, is not as feature-rich.


Attributes and statements of f2py

  • intent([ in | inout | out | hide | in,out | inout,out | c | copy | cache | callback | inplace | aux ])
  • dimension(<dimspec>)
  • common, parameter
  • allocatable
  • optional, required, external
  • depend([<names>])
  • check([<C-booleanexpr>])
  • note(<LaTeX text>)
  • usercode, callstatement, callprotoargument, threadsafe, fortranname
  • pymethoddef
  • entry


Other options for extending Python

  • SIP: lighter version of SWIG. Only works with C/C++. Originally written for PyQt.
  • ctypes: no interface code needed, can directly call compiled functions in a binary library file. Some features work only on Windows.
  • Boost.Python: good for wrapping entire libraries in C or C++.
  • Pyrex: a Python-like language designed for writing Python extension modules. Almost any Python code is valid Pyrex code.
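
To give a flavour of the ctypes approach, the sketch below calls cos() from the system C math library without any interface code. It assumes the library can be located under the name "m", which is platform dependent; the helper name c_cos is hypothetical, and it returns None when the library cannot be found or loaded.

```python
import ctypes
import ctypes.util
import math


def c_cos(x):
    """Call cos() from the C math library through ctypes.

    Returns None if the library cannot be located or loaded.
    """
    path = ctypes.util.find_library("m")  # the name "m" is platform dependent
    if path is None:
        return None
    try:
        libm = ctypes.CDLL(path)
    except OSError:
        return None
    # Declare the C signature: double cos(double)
    libm.cos.argtypes = [ctypes.c_double]
    libm.cos.restype = ctypes.c_double
    return libm.cos(x)


print(c_cos(0.5), math.cos(0.5))
```

Declaring argtypes and restype matters here: without them ctypes assumes int arguments and an int return value, which gives wrong results for a function working with doubles.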


Extending Python to Java and C#

  • Java: Jython is an implementation of Python for the JVM. Jython takes the Python programming language syntax and enables it to run on the Java platform. This allows seamless integration with Java libraries and other Java-based applications. But CPython extensions don’t work.
  • C#: IronPython is an implementation of the Python programming language running under .NET and Silverlight.


My recommendations

  • See if numpy C extensions will do the job for you.
  • If you use Fortran (any flavour), use f2py. If you need to write fresh code (for speedy execution), write in Fortran and wrap with f2py.
  • For C/C++, use SWIG or maybe SIP.
  • For Java, use Jython; for C#, use IronPython.
  • For all other languages, SWIG is the richest option.
  • And please don’t forget regression tests.


Feel free to contact me for any teething problems...

[email protected]
