Advanced Python on Abel
Dmytro Karpenko, Research Infrastructure Services group, Department for Scientific Computing, USIT, UiO
4/4/16 2
NumPy: support for large, multi-dimensional arrays and matrices, and a large library of high-level mathematical functions to operate on these arrays
SciPy: support for scientific computing (optimization, linear algebra, integration, interpolation, statistics, FFT, signal and image processing, etc.), based heavily on NumPy
matplotlib: plotting library, designed especially for use with NumPy, with a MATLAB-like interface
NumPy, SciPy, matplotlib
● Centrally managed on Abel
● Tight mutual integration
● Powerful set of tools for data analysis and visualization
● Usually available on every scientific resource
● Easy to learn and use
● The 3 pieces used together can replace MATLAB.
Getting started on Abel
● For interactive use:
-bash-4.1$ module load python2
-bash-4.1$ python
Python 2.7.10 (default, Jul 1 2015, 11:02:23)
[GCC Intel(R) C++ gcc 4.4 mode] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> import scipy
>>> import matplotlib
However, you will usually want to import matplotlib as:
>>> import matplotlib.pyplot as plt
NumPy
● General documentation: http://docs.scipy.org/doc/numpy/
● Routines index: https://docs.scipy.org/doc/numpy/reference/routines.html
SciPy
● http://docs.scipy.org/doc/scipy/reference/
Matplotlib
● http://matplotlib.org/1.5.1/users/index.html
Numpy arrays
>>> import numpy as np
>>> cvalues = [25.3, 24.8, 26.9, 23.9]
>>> C = np.array(cvalues)
>>> print(C)
[ 25.3 24.8 26.9 23.9]
>>> print(C * 9 / 5 + 32)
[ 77.54 76.64 80.42 75.02]
# Indexing and slicing work as with Python lists
>>> print(C[0])
25.3
>>> print(C[1:3])
[ 24.8 26.9]
● More straightforward syntax than with lists
● Considerably faster
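The two points above can be illustrated with a small sketch that converts the same temperatures once with a plain list and once with an array, then times both on a larger input. The input size (100000 values) and the repeat count are arbitrary choices made for this illustration, and the measured times will vary from machine to machine.

```python
import timeit

import numpy as np

cvalues = [25.3, 24.8, 26.9, 23.9]

# Plain Python list: the conversion needs an explicit loop (list comprehension).
f_list = [c * 9 / 5 + 32 for c in cvalues]

# NumPy array: the same formula is applied to all elements at once.
C = np.array(cvalues)
f_array = C * 9 / 5 + 32

# Rough timing on a larger input (size and repeat count are arbitrary here).
big_list = [float(i) for i in range(100000)]
big_array = np.array(big_list)

t_list = timeit.timeit(lambda: [c * 9 / 5 + 32 for c in big_list], number=20)
t_array = timeit.timeit(lambda: big_array * 9 / 5 + 32, number=20)

print("list:  %.4f s" % t_list)
print("array: %.4f s" % t_array)
```

Both approaches give the same numbers; the array version simply avoids the Python-level loop.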
Numpy arrays: advanced addressing and slicing
>>> A = np.array([ [3.4, 8.7, 9.9], [1.1, -7.8, -0.7], [4.1, 12.3, 4.8] ])
>>> print(A[1, 0])
1.1
>>> A = np.array([
... [11,12,13,14,15],
... [21,22,23,24,25],
... [31,32,33,34,35],
... [41,42,43,44,45],
... [51,52,53,54,55] ] )
>>> print(A[:3,2:])
[[13 14 15]
 [23 24 25]
 [33 34 35]]
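A point that is easy to miss with this kind of slicing is that a basic slice is a view onto the original array, not a copy. The sketch below rebuilds the 5x5 array programmatically (the list comprehension is just a shortcut used here) and shows row, column and block selection, plus the effect of writing through a slice.

```python
import numpy as np

# Same 5x5 array as in the example above, built programmatically.
A = np.array([[10 * r + c for c in range(1, 6)] for r in range(1, 6)])

row = A[1, :]        # second row
col = A[:, 0]        # first column
block = A[:3, 2:]    # first three rows, columns from index 2 onwards

# Basic slices are views: writing to the slice changes A as well.
block[0, 0] = -1

# Use .copy() when an independent array is needed.
safe = A[:3, 2:].copy()
safe[0, 1] = -2      # A is not affected by this assignment

print(row)
print(col)
print(A[0])
```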
Numpy arrays: advanced slicing (using step)
>>> A = np.array([ [ 0,  1,  2,  3,  4,  5,  6],
...                [ 7,  8,  9, 10, 11, 12, 13],
...                [14, 15, 16, 17, 18, 19, 20],
...                [21, 22, 23, 24, 25, 26, 27] ])
>>> print(A[::2, ::3])
[[ 0  3  6]
 [14 17 20]]
● Slice syntax for each axis: [start:stop:step]
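A few more variations of the start:stop:step pattern, as a minimal sketch. The array is created with np.arange(28).reshape(4, 7) as a shortcut for typing it out, and the variable names are made up for this illustration.

```python
import numpy as np

A = np.arange(28).reshape(4, 7)

every_other = A[::2, ::3]        # rows 0, 2 and columns 0, 3, 6
middle = A[1:4:2, 1:6:2]         # rows 1, 3 and columns 1, 3, 5
reversed_rows = A[::-1, :]       # a negative step walks backwards

# The colon syntax is shorthand for the built-in slice(start, stop, step).
same = A[slice(None, None, 2), slice(None, None, 3)]

print(every_other)
print(middle)
print(reversed_rows[0])
```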
Numpy arrays: evenly spaced values
>>> a = np.arange(1, 10)
>>> print(a)
[1 2 3 4 5 6 7 8 9]
>>> x = np.arange(0.5, 10.4, 0.8)
>>> print(x)
[ 0.5 1.3 2.1 2.9 3.7 4.5 5.3 6.1 6.9 7.7 8.5 9.3 10.1]
>>> print(np.linspace(1, 10))
[ 1. 1.18367347 1.36734694 1.55102041 1.73469388 1.91836735 2.10204082 2.28571429 2.46938776
2.65306122 2.83673469 3.02040816 3.20408163 3.3877551 3.57142857 3.75510204 3.93877551 4.12244898
4.30612245 4.48979592 4.67346939 4.85714286 5.04081633 5.2244898 5.40816327 5.59183673 5.7755102
5.95918367 6.14285714 6.32653061 6.51020408 6.69387755 6.87755102 7.06122449 7.24489796 7.42857143
7.6122449 7.79591837 7.97959184 8.16326531 8.34693878 8.53061224 8.71428571 8.89795918 9.08163265
9.26530612 9.44897959 9.63265306 9.81632653 10. ]
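The difference between the two functions is that arange takes a step and excludes the stop value, while linspace takes a number of points (50 by default) and includes the stop value. A short sketch of the common options follows; the specific point counts are arbitrary choices for illustration.

```python
import numpy as np

# arange: fixed step, stop value excluded.
a = np.arange(0.5, 10.4, 0.8)

# linspace: fixed number of points (50 by default), stop value included.
b = np.linspace(1, 10)                      # 50 points
c = np.linspace(1, 10, 7)                   # 7 points, spacing (10 - 1) / 6 = 1.5
d = np.linspace(1, 10, 9, endpoint=False)   # 9 points, 10 excluded, spacing 1.0

# retstep=True also returns the spacing that linspace chose.
e, step = np.linspace(0, 1, 11, retstep=True)

print(len(a))
print(len(b))
print(c)
print(step)
```

With non-integer steps, linspace is usually the safer choice because the number of points does not depend on floating-point rounding.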
Numpy arrays: reshaping
>>> x = np.array([ [67, 63, 87], [77, 69, 59], [85, 87, 99], [79, 72, 71], [63, 89, 93], [68, 92, 78]])
>>> print(np.shape(x))
(6, 3)
>>> x.shape = (3, 6)
>>> print(x)
[[67 63 87 77 69 59]
 [85 87 99 79 72 71]
 [63 89 93 68 92 78]]
>>> X = np.arange(28).reshape(4,7)
>>> print(X)
[[ 0  1  2  3  4  5  6]
 [ 7  8  9 10 11 12 13]
 [14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27]]
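Reshaping only rearranges how the same elements are indexed, so the total number of elements has to stay the same. The sketch below, using an 18-element array chosen for illustration, shows reshape with an inferred dimension (-1), flattening with ravel, and the error raised on a size mismatch.

```python
import numpy as np

x = np.arange(18)

m = x.reshape(3, 6)       # 3 rows, 6 columns
n = x.reshape(6, -1)      # -1 lets NumPy infer the missing dimension (3 here)
flat = m.ravel()          # back to one dimension

# The total number of elements must stay the same: 18 != 4 * 5.
try:
    x.reshape(4, 5)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(m.shape)
print(n.shape)
print(mismatch_raised)
```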
Scipy statistics
>>> from scipy import stats
# Probability density function
>>> stats.norm.pdf(0.5)
0.35206532676429952
# Cumulative distribution function
>>> stats.norm.cdf(0.5)
0.69146246127401312
# Typical statistics functions
>>> stats.norm.mean()
0.0
>>> stats.norm.median()
0.0
>>> stats.norm.std()
1.0
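The calls above use the standard normal distribution (mean 0, standard deviation 1). A minimal sketch of working with other parameters follows: scipy.stats distributions accept loc and scale, and calling the distribution with them returns a "frozen" object. The values 100 and 15 and the variable names are chosen only for illustration.

```python
from scipy import stats

# loc and scale shift and stretch the distribution: here mean 100, std 15.
iq = stats.norm(loc=100, scale=15)

p_below_115 = iq.cdf(115)   # one standard deviation above the mean
median = iq.ppf(0.5)        # ppf (percent point function) is the inverse of cdf
round_trip = iq.ppf(iq.cdf(90))

print(p_below_115)
print(median)
print(iq.std())
```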
Scipy statistics
>>> a = np.array([1,2,3,4])
>>> b = np.array([10,9,8,7])
# Pearson correlation coefficient
>>> stats.pearsonr(a, b)
(-1.0, 0.0)
# Kolmogorov-Smirnov distance between two samples
>>> stats.ks_2samp(a,b)
Ks_2sampResult(statistic=1.0, pvalue=0.011065637015803861)
# Maximum likelihood estimate of the normal distribution parameters (mean, standard deviation)
>>> a = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
>>> stats.norm.fit(a)
(0.29999999999999999, 0.1414213562373095)
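These results can be cross-checked with plain NumPy, which may help when interpreting them. The sketch below compares pearsonr with np.corrcoef, and the fitted normal parameters with the sample mean and the population standard deviation (NumPy's default, ddof=0). The variable names are made up for this illustration.

```python
import numpy as np
from scipy import stats

a = np.array([1, 2, 3, 4])
b = np.array([10, 9, 8, 7])

# First element of the pearsonr result is the correlation coefficient.
r = stats.pearsonr(a, b)[0]
r_numpy = np.corrcoef(a, b)[0, 1]

sample = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
mu, sigma = stats.norm.fit(sample)

print(r)
print(r_numpy)
print(mu)
print(sigma)
```

The fitted sigma matches sample.std() rather than sample.std(ddof=1), because the maximum likelihood estimate divides by n, not n - 1.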
matplotlib
import matplotlib.pyplot as plt
plt.plot([1,2,3,4])
plt.ylabel('some numbers')
plt.show()
matplotlib
import matplotlib.pyplot as plt
plt.plot([1,2,3,4], [1,4,9,16])
plt.show()
matplotlib
import matplotlib.pyplot as plt
plt.plot([1,2,3,4], [1,4,9,16], 'ro')
plt.axis([0, 6, 0, 20])
plt.show()
matplotlib
import numpy as np
import matplotlib.pyplot as plt
t = np.arange(0., 5., 0.2)
plt.plot(t, t, 'r--', t, t**2, 'bs', t, t**3, 'g^')
plt.show()
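matplotlib
import numpy as np
import matplotlib.pyplot as plt
mu, sigma = 100, 15
x = mu + sigma * np.random.randn(10000)
# normed=1 normalizes the histogram to a probability density
# (newer matplotlib releases use density=True instead)
n, bins, patches = plt.hist(x, 50, normed=1, facecolor='g', alpha=0.75)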
plt.xlabel('Smarts')
plt.ylabel('Probability')
plt.title('Histogram of IQ')
plt.text(60, .025, r'$\mu=100,\ \sigma=15$')
plt.axis([40, 160, 0, 0.03])
plt.grid(True)
plt.show()
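To see what the normalized histogram contains without drawing anything, the same binning can be reproduced with np.histogram. This is a sketch under assumptions: the fixed seed is added here only to make the run reproducible, and density=True is NumPy's option for normalizing the counts.

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed, chosen only for reproducibility
mu, sigma = 100, 15
x = mu + sigma * rng.randn(10000)

# density=True rescales the counts so that the bars integrate to 1.
heights, edges = np.histogram(x, bins=50, density=True)
widths = np.diff(edges)
area = np.sum(heights * widths)

print(len(heights))
print(len(edges))
print(area)
```

The bar heights are therefore probability densities, not probabilities: each height has to be multiplied by its bin width to get the fraction of samples in that bin.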
matplotlib
# alldata is assumed to be a sequence of (x, y) pairs defined elsewhere; stats is scipy.stats
xs = [p[0] for p in alldata]
ys = [p[1] for p in alldata]
plt.scatter(xs, ys, s=5, marker='+', c='b')
plt.text(200000, 6000, "Pearson coef = %.3f" % stats.pearsonr(xs, ys)[0])
plt.show()
matplotlib
● For non-interactive use:
import matplotlib as mpl
mpl.use('Agg')
import matplotlib.pyplot as plt
Now use pyplot as usual, but remember to save the figure to a file instead of showing it:
import numpy as np
x = np.arange(0, 3 * np.pi, 0.1)
y = np.sin(x)
plt.plot(x, y)
plt.savefig("plt_test.png")
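Put together as one self-contained script, the non-interactive pattern might look like the sketch below. Writing into a temporary directory, the dpi value and the axis labels are choices made for this illustration; on a cluster the output would normally go to the job's working directory, as in the snippet above.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')            # select the backend before importing pyplot
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 3 * np.pi, 0.1)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlabel('x')
ax.set_ylabel('sin(x)')

# Hypothetical output location: a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), 'plt_test.png')
fig.savefig(out_path, dpi=150)
plt.close(fig)                   # release the figure's memory

print(out_path)
```

The Agg backend renders to a raster image in memory and needs no display, which is why it suits batch jobs on compute nodes.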
Example
import numpy as np
from scipy.stats import norm
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
a = np.array([1, 2, 3]) # Create a rank 1 array
print type(a) # Prints "<type 'numpy.ndarray'>"
print a.shape # Prints "(3,)"
print a[0], a[1], a[2] # Prints "1 2 3"
a[0] = 5 # Change an element of the array
print a
print "-------------"
print norm.cdf(0)
…............................
print "-----------------"
x = np.arange(0, 3 * np.pi, 0.1)
y = np.sin(x)
# Plot the points using matplotlib
plt.plot(x, y)
plt.savefig("plt_test.png")
print "\nDone"
Example
#!/bin/bash
#SBATCH --job-name=advanced_python_test
#SBATCH --account=uio
#SBATCH --time=00:03:00
#SBATCH --mem-per-cpu=4G
#SBATCH --mail-type=ALL
## Set up job environment:
source /cluster/bin/jobsetup
module purge # clear any inherited modules
module load python2
set -o errexit # exit on errors
## Copy input files to the work directory:
cp -rf adv_python_test.py $SCRATCH
…...............................................
## Do some work:
cd $SCRATCH
python adv_python_test.py
cp plt_test.png $SUBMITDIR