Advanced Python on Abel

Dmytro Karpenko
Research Infrastructure Services group
Department for Scientific Computing
USIT, UiO


4/4/16

NumPy: support for large, multi-dimensional arrays and matrices, and a large library of high-level mathematical functions to operate on these arrays

SciPy: support for scientific computing: optimization, linear algebra, integration, interpolation, statistics, FFT, signal and image processing, etc. Based heavily on NumPy

matplotlib: plotting library, designed especially for use with NumPy, with a MATLAB-like interface


NumPy, SciPy, matplotlib

● Centrally managed on Abel

● Tight mutual integration

● Powerful set of tools for data analysis and visualization

● Usually available on every scientific resource

● Easy to learn and use

● The 3 pieces used together can replace MATLAB.


Getting started on Abel

● For interactive use:

-bash-4.1$ module load python2
-bash-4.1$ python
Python 2.7.10 (default, Jul 1 2015, 11:02:23)
[GCC Intel(R) C++ gcc 4.4 mode] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> import scipy
>>> import matplotlib

However, you will usually want to import matplotlib's plotting interface as

>>> import matplotlib.pyplot as plt


NumPy
General documentation: http://docs.scipy.org/doc/numpy/
Routines index: https://docs.scipy.org/doc/numpy/reference/routines.html

SciPy
http://docs.scipy.org/doc/scipy/reference/

Matplotlib
http://matplotlib.org/1.5.1/users/index.html


Numpy arrays

>>> import numpy as np
>>> cvalues = [25.3, 24.8, 26.9, 23.9]
>>> C = np.array(cvalues)
>>> print(C)
[ 25.3 24.8 26.9 23.9]
>>> print(C * 9 / 5 + 32)
[ 77.54 76.64 80.42 75.02]
# Indexing and slicing similar to python lists
>>> print(C[0])
25.3
>>> print(C[1:3])
[ 24.8 26.9]

● More straightforward syntax than with lists

● Considerably faster
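To make the "more straightforward syntax" point concrete, here is a minimal sketch that does the same Celsius-to-Fahrenheit conversion with a plain list and with an array. The variable names f_list and f_array are illustrative only.

```python
import numpy as np

cvalues = [25.3, 24.8, 26.9, 23.9]

# Plain Python list: the conversion needs an explicit loop or comprehension.
f_list = [c * 9 / 5 + 32 for c in cvalues]

# NumPy array: the arithmetic is applied element-wise to the whole array.
C = np.array(cvalues)
f_array = C * 9 / 5 + 32

print(f_list)
print(f_array)
```

Both forms give the same numbers; the array version avoids the loop and is the one that scales to large data.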


Numpy arrays: advanced addressing and slicing

>>> A = np.array([ [3.4, 8.7, 9.9], [1.1, -7.8, -0.7], [4.1, 12.3, 4.8] ])
>>> print(A[1, 0])
1.1

>>> A = np.array([
... [11,12,13,14,15],
... [21,22,23,24,25],
... [31,32,33,34,35],
... [41,42,43,44,45],
... [51,52,53,54,55] ] )
>>> print(A[:3,2:])
[[13 14 15]
 [23 24 25]
 [33 34 35]]
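The same [row, column] addressing covers single elements, whole rows, whole columns and rectangular blocks. A short sketch on the 5x5 array from this slide (the names single, block, row and col are illustrative only):

```python
import numpy as np

A = np.array([
    [11, 12, 13, 14, 15],
    [21, 22, 23, 24, 25],
    [31, 32, 33, 34, 35],
    [41, 42, 43, 44, 45],
    [51, 52, 53, 54, 55],
])

single = A[1, 0]    # row 1, column 0
block = A[:3, 2:]   # rows 0..2, columns 2..4
row = A[3, :]       # the whole of row 3
col = A[:, 1]       # the whole of column 1

print(single)
print(block)
print(row)
print(col)
```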


Numpy arrays: advanced slicing (using step)

>>> A = np.array([ [ 0, 1, 2, 3, 4, 5, 6],
... [ 7, 8, 9, 10, 11, 12, 13],
... [14, 15, 16, 17, 18, 19, 20],
... [21, 22, 23, 24, 25, 26, 27] ] )
>>> print(A[::2, ::3])
[[ 0 3 6]
 [14 17 20]]

Slice syntax: [start:stop:step]
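A few more combinations of start, stop and step on the same 4x7 array, as a sketch. The array is built here with np.arange(28).reshape(4, 7), which yields the same values as the literal on the slide; the variable names are illustrative only.

```python
import numpy as np

A = np.arange(28).reshape(4, 7)

every_other_row = A[::2, :]   # rows 0 and 2
every_third_col = A[:, ::3]   # columns 0, 3 and 6
window = A[1:4:2, 1:6:2]      # rows 1 and 3; columns 1, 3 and 5
reversed_rows = A[::-1, :]    # a negative step walks backwards

print(window)
```

Omitted parts take their defaults: start at the beginning, stop at the end, step of 1.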


Numpy arrays: evenly spaced values

>>> a = np.arange(1, 10)
>>> print(a)
[1 2 3 4 5 6 7 8 9]
>>> x = np.arange(0.5, 10.4, 0.8)
>>> print(x)
[ 0.5 1.3 2.1 2.9 3.7 4.5 5.3 6.1 6.9 7.7 8.5 9.3 10.1]
>>> print(np.linspace(1, 10))
[ 1. 1.18367347 1.36734694 1.55102041 1.73469388 1.91836735 2.10204082 2.28571429 2.46938776
 2.65306122 2.83673469 3.02040816 3.20408163 3.3877551 3.57142857 3.75510204 3.93877551 4.12244898
 4.30612245 4.48979592 4.67346939 4.85714286 5.04081633 5.2244898 5.40816327 5.59183673 5.7755102
 5.95918367 6.14285714 6.32653061 6.51020408 6.69387755 6.87755102 7.06122449 7.24489796 7.42857143
 7.6122449 7.79591837 7.97959184 8.16326531 8.34693878 8.53061224 8.71428571 8.89795918 9.08163265
 9.26530612 9.44897959 9.63265306 9.81632653 10. ]
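The two functions differ in what you fix: arange takes a step and excludes the stop value, linspace takes a number of points (50 by default) and includes the stop value. A small sketch of the difference, with illustrative variable names:

```python
import numpy as np

# arange: fix the step; the stop value is excluded.
a = np.arange(0.0, 1.0, 0.25)

# linspace: fix the number of points; the stop value is included by default.
b = np.linspace(0.0, 1.0, num=5)

# endpoint=False drops the stop value, which here reproduces the arange grid.
c = np.linspace(0.0, 1.0, num=4, endpoint=False)

print(a)
print(b)
print(c)
```

With a non-integer step, linspace is usually the safer choice, because the number of points does not depend on floating-point rounding.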


Numpy arrays: reshaping

>>> x = np.array([ [67, 63, 87], [77, 69, 59], [85, 87, 99], [79, 72, 71], [63, 89, 93], [68, 92, 78]])
>>> print(np.shape(x))
(6, 3)
>>> x.shape = (3, 6)
>>> print(x)
[[67 63 87 77 69 59]
 [85 87 99 79 72 71]
 [63 89 93 68 92 78]]

>>> X = np.arange(28).reshape(4,7)
>>> print(X)
[[ 0 1 2 3 4 5 6]
 [ 7 8 9 10 11 12 13]
 [14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27]]
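Two details of reshape that are easy to miss, shown as a sketch: one dimension may be given as -1 and NumPy infers it, and for a contiguous array like this one the result is a view that shares data with the original. The variable names are illustrative only.

```python
import numpy as np

x = np.arange(18)

y = x.reshape(3, 6)    # explicit shape
z = x.reshape(-1, 3)   # -1 lets NumPy infer that dimension (6 here)

# y is a view on x here, so writing through y also changes x.
y[0, 0] = 100

print(z.shape)
print(x[0])
```

The total number of elements must stay the same; reshaping 18 values to (4, 5) raises a ValueError.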


Scipy statistics

>>> from scipy import stats
# Probability density function
>>> stats.norm.pdf(0.5)
0.35206532676429952
# Cumulative distribution function
>>> stats.norm.cdf(0.5)
0.69146246127401312
# Typical statistics functions
>>> stats.norm.mean()
0.0
>>> stats.norm.median()
0.0
>>> stats.norm.std()
1.0
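The calls above use the standard normal distribution (mean 0, standard deviation 1). For other parameters, one option is to "freeze" the distribution with loc and scale, as in this sketch. The values 100 and 15 and the variable names are illustrative only.

```python
from scipy import stats

# Freeze a normal distribution with chosen parameters
# (loc is the mean, scale is the standard deviation).
iq = stats.norm(loc=100, scale=15)

p_below_115 = iq.cdf(115)      # probability of a value <= 115
x_at_p = iq.ppf(p_below_115)   # ppf is the inverse of cdf
summary = (iq.mean(), iq.median(), iq.std())

print(p_below_115)
print(x_at_p)
print(summary)
```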


Scipy statistics

>>> a = np.array([1,2,3,4])
>>> b = np.array([10,9,8,7])
# Pearson correlation coefficient
>>> stats.pearsonr(a, b)
(-1.0, 0.0)
# Measure Kolmogorov-Smirnov distance between two samples
>>> stats.ks_2samp(a,b)
Ks_2sampResult(statistic=1.0, pvalue=0.011065637015803861)
# Maximum Likelihood Estimation
>>> a=np.array([0.1, 0.2, 0.3, 0.4, 0.5])
>>> stats.norm.fit(a)
(0.29999999999999999, 0.1414213562373095)
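As a cross-check on the fit result: for a normal distribution the maximum likelihood estimates are the sample mean and the population standard deviation (ddof=0), which NumPy computes directly. A minimal sketch, with illustrative variable names:

```python
import numpy as np
from scipy import stats

a = np.array([0.1, 0.2, 0.3, 0.4, 0.5])

# Maximum likelihood fit of a normal distribution: returns (loc, scale).
loc, scale = stats.norm.fit(a)

# The same two numbers from NumPy; np.std uses ddof=0 by default.
mean, std = a.mean(), a.std()

# pearsonr returns the coefficient first and the p-value second.
r = stats.pearsonr([1, 2, 3, 4], [10, 9, 8, 7])[0]

print(loc, scale)
print(mean, std)
print(r)
```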


matplotlib

import matplotlib.pyplot as plt
plt.plot([1,2,3,4])
plt.ylabel('some numbers')
plt.show()


matplotlib

import matplotlib.pyplot as plt
plt.plot([1,2,3,4], [1,4,9,16])
plt.show()


matplotlib

import matplotlib.pyplot as plt
plt.plot([1,2,3,4], [1,4,9,16], 'ro')
plt.axis([0, 6, 0, 20])
plt.show()


matplotlib

import numpy as np
import matplotlib.pyplot as plt
t = np.arange(0., 5., 0.2)
plt.plot(t, t, 'r--', t, t**2, 'bs', t, t**3, 'g^')
plt.show()


matplotlib

import numpy as np
import matplotlib.pyplot as plt
mu, sigma = 100, 15
x = mu + sigma * np.random.randn(10000)
# normed=1 is the matplotlib 1.5 spelling; newer releases use density=True instead
n, bins, patches = plt.hist(x, 50, normed=1, facecolor='g', alpha=0.75)
plt.xlabel('Smarts')
plt.ylabel('Probability')
plt.title('Histogram of IQ')
plt.text(60, .025, r'$\mu=100,\ \sigma=15$')
plt.axis([40, 160, 0, 0.03])
plt.grid(True)
plt.show()
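What the normalization in the histogram call does can be checked without plotting. This sketch assumes that normed=1 corresponds to NumPy's density=True, which scales the bin heights so that the histogram integrates to 1. The fixed seed and the variable names are illustrative only.

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed, for reproducibility only
x = 100 + 15 * rng.randn(10000)

# density=True scales the bin heights so that the histogram integrates to 1.
heights, edges = np.histogram(x, bins=50, density=True)
area = np.sum(heights * np.diff(edges))

print(len(heights), len(edges))
print(area)
```

The heights are therefore probability densities, not counts, which is why the y axis in the plot only goes up to about 0.03.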


matplotlib

import matplotlib.pyplot as plt
from scipy import stats
# alldata: a sequence of (x, y) pairs, defined elsewhere
plt.scatter([x[0] for x in alldata],[x[1] for x in alldata],s=5,marker='+',c='b')
plt.text(200000,6000,"Pearson coef = %.3f" % stats.pearsonr([x[0] for x in alldata],[x[1] for x in alldata])[0])
plt.show()
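The repeated list comprehensions can be avoided by turning the pairs into a two-column array once. A sketch under the assumption that alldata is a list of (x, y) pairs; the data below is a made-up stand-in, not the data behind the slide.

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in for the slide's alldata: a list of (x, y) pairs.
alldata = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

data = np.asarray(alldata)        # shape (n, 2)
xs, ys = data[:, 0], data[:, 1]   # column slices replace the list comprehensions

r = stats.pearsonr(xs, ys)[0]
label = "Pearson coef = %.3f" % r

print(label)
```

xs and ys can then be passed directly to plt.scatter, and label to plt.text.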


matplotlib

● For non-interactive use:

import matplotlib as mpl
mpl.use('Agg')
import matplotlib.pyplot as plt

Now use as usual, but don't forget to save the picture instead of showing it:

import numpy as np
x = np.arange(0, 3 * np.pi, 0.1)
y = np.sin(x)
plt.plot(x, y)
plt.savefig("plt_test.png")
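A slightly fuller sketch of the same non-interactive pattern, using an explicit figure object and closing it after saving. The temporary output directory, the axis labels and the dpi value are illustrative choices, not part of the original example.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')            # select the backend before importing pyplot
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 3 * np.pi, 0.1)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlabel('x')
ax.set_ylabel('sin(x)')

# Hypothetical output location; a temporary directory keeps the sketch tidy.
out_path = os.path.join(tempfile.mkdtemp(), 'plt_test.png')
fig.savefig(out_path, dpi=100)
plt.close(fig)                   # release the figure's memory

print(out_path)
```

The Agg backend renders to a file without needing a display, which is what a batch job on a compute node requires.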


Example

import numpy as np
from scipy.stats import norm
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

a = np.array([1, 2, 3]) # Create a rank 1 array
print type(a) # Prints "<type 'numpy.ndarray'>"
print a.shape # Prints "(3,)"
print a[0], a[1], a[2] # Prints "1 2 3"
a[0] = 5 # Change an element of the array
print a
print "-------------"
print norm.cdf(0)
print "-----------------"

x = np.arange(0, 3 * np.pi, 0.1)
y = np.sin(x)
# Plot the points using matplotlib
plt.plot(x, y)
plt.savefig("plt_test.png")
print "\nDone"


Example

#!/bin/bash
#SBATCH --job-name=advanced_python_test
#SBATCH --account=uio
#SBATCH --time=00:03:00
#SBATCH --mem-per-cpu=4G
#SBATCH --mail-type=ALL

## Set up job environment:
source /cluster/bin/jobsetup
module purge # clear any inherited modules
module load python2
set -o errexit # exit on errors

## Copy input files to the work directory:
cp -rf adv_python_test.py $SCRATCH

## Do some work:
cd $SCRATCH
python adv_python_test.py
cp plt_test.png $SUBMITDIR