Python SciPy: Learn with Example

Scientists and researchers gather enormous amounts of scientific and technical data from exploration, experimentation, and analysis. Processing that much data by hand is slow and error-prone, so we rely on supercomputers and data science tools for faster computation and more accurate results.

A simpler way to handle scientific and technical computing is to use a Python library built solely for this purpose. It is called SciPy (pronounced “sigh pie”).

SciPy is open-source software, so it is free to use, and new scientific computing features are added to it regularly.

This tutorial covers the following topics: installation and setup, SciPy modules, integration, reading images, optimization, interpolation, statistics, sparse matrices, and Fourier transforms.

Installation and setup of scipy

How you install SciPy depends on your operating system. The commands below cover the common cases.

Pip install scipy

Pip is a recursive acronym that stands for “Pip Installs Packages”. It is the standard Python package manager and is available on most operating systems.
Note: To install packages with the pip command, make sure that Python and pip are already installed on your system.

python3 -m pip install --user numpy scipy

This command installs NumPy and SciPy with pip. It is the usual route on Windows (where the interpreter may be named python or py instead of python3) and works on other systems as well. The --user flag installs the packages into the current user's directory rather than the system directories.

sudo port install py35-scipy py35-numpy

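This command installs SciPy on macOS through MacPorts. The py35 prefix selects the build for Python 3.5; substitute the prefix that matches your Python version. sudo is a command that lets one user run programs with the security privileges of another user, by default the superuser.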

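sudo apt-get install python3-scipy python3-numpy

This command installs SciPy on Debian-based Linux distributions such as Ubuntu. apt-get is a command-line tool for working with APT (Advanced Package Tool) software packages. The python3- prefix selects the Python 3 builds; the unprefixed python-scipy and python-numpy names referred to the Python 2 packages on older releases.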
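
Whichever command you use, it is worth confirming that the installation worked. The short check below is a minimal sketch that only assumes NumPy and SciPy were installed into the interpreter you are running; it imports both packages and prints their versions.

```python
import numpy
import scipy

# If either import fails, the package is not installed for this interpreter.
numpy_version = numpy.__version__
scipy_version = scipy.__version__

print("NumPy version:", numpy_version)
print("SciPy version:", scipy_version)
```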

Scipy modules

Scientific computing in Python relies on many dedicated software tools. SciPy is one such library, and it offers many modules that you can use to perform complex operations.

The following list shows some of the modules, or sub-packages, that can be used for computing:

No.  Functionality                             Sub-package
1    Interpolation                             scipy.interpolate
2    Integration                               scipy.integrate
3    Optimization                              scipy.optimize
4    Signal processing                         scipy.signal
5    Statistics                                scipy.stats
6    Fast Fourier transforms                   scipy.fftpack
7    Linear algebra                            scipy.linalg
8    Sparse matrices                           scipy.sparse
9    Input/output                              scipy.io
10   Special functions                         scipy.special
11   Multidimensional image processing         scipy.ndimage
12   Spatial data structures and algorithms    scipy.spatial

Python scipy

import numpy as np
from scipy import signal

This is a basic SciPy import in which the signal sub-package is imported. Any other sub-package can be imported in the same way by replacing signal with its name. NumPy is needed alongside most of the sub-packages because they operate on NumPy arrays.

Integration or scipy integrate

Numerical integration in SciPy is carried out with the scipy.integrate sub-package, which provides several integration techniques. Some of the integration functions are listed below.

No.  Function     Description
1    quad         Single integration
2    dblquad      Double integration
3    tplquad      Triple integration
4    nquad        n-fold multiple integration
5    fixed_quad   Gaussian quadrature of order n
6    quadrature   Gaussian quadrature to a given tolerance
7    romberg      Romberg integration of a function
8    trapz        Trapezoidal rule (named trapezoid in newer SciPy releases)
9    cumtrapz     Trapezoidal rule, cumulative integral (cumulative_trapezoid in newer releases)
10   simps        Simpson’s rule (simpson in newer releases)
11   romb         Romberg integration of sampled data
12   polyint      Analytical polynomial integration (NumPy)
13   poly1d       Helper function for polyint (NumPy)
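
Several functions in this table work on sampled data rather than on a callable. The sketch below illustrates this with the trapezoidal rule and Simpson's rule. It assumes a SciPy release that provides the newer names trapezoid and simpson; the sample grid of 11 points is an arbitrary choice for illustration.

```python
import numpy as np
from scipy.integrate import trapezoid, simpson

# Sample the function 12x at 11 evenly spaced points on [0, 1].
x = np.linspace(0.0, 1.0, 11)
y = 12 * x

# Both rules are exact for a straight line, so each should give 6.0
# up to floating-point rounding.
area_trapezoid = trapezoid(y, x=x)
area_simpson = simpson(y, x=x)

print(area_trapezoid, area_simpson)
```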

 

Single Integrals

quad is the general-purpose integration routine. Use it when a function of a single variable is to be integrated between two limits.

quad(func, a, b[, args, full_output, …])

Here quad integrates func between the lower limit a and the upper limit b.
Let us understand the function with an example.
A researcher has gathered some data and wants to compute the integral of the function that describes it.
[Figure: the single integral of 12x dx from 0 to 1]
In this example, 12x is the integrand and the limits of integration are 0 and 1.
Example for single integration

import scipy.integrate
f = lambda x: 12*x
i = scipy.integrate.quad(f, 0, 1)
print(i)

Output:

(6.0, 6.661338147750939e-14)

A lambda function can take any number of arguments but contains only one expression; here the expression is 12*x. The call scipy.integrate.quad(f, 0, 1) returns a tuple that holds the value of the integral (6.0) and an estimate of the absolute error.
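
The args parameter in the quad signature passes extra arguments to the integrand. The following sketch integrates k*x with the coefficient supplied through args; the names integrand and k are made up for this illustration.

```python
from scipy.integrate import quad

def integrand(x, k):
    # k is a hypothetical coefficient passed in through args
    return k * x

# Integrate 12x from 0 to 1; quad returns (value, estimated absolute error).
value, abs_error = quad(integrand, 0, 1, args=(12,))

print(value, abs_error)
```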

Double Integral

A double integral integrates a function of two variables. dblquad expects the integrand as func(y, x), with y as the first argument and x as the second.

dblquad(func, a, b, gfun, hfun[, args, …])

Here a and b are the limits of the outer variable x, while gfun and hfun are functions of x that return the lower and upper limits of the inner variable y.

The researcher now adds another variable to the previous data and turns it into a double integral.

[Figure: the double integral of 12y dy dx, with y from 0 to 1 and x from 0 to 0.5]
In this example the inner integral runs over y and the outer integral runs over x.

Example for double integrals
import scipy.integrate
f = lambda y, x: 12*y   # integrand as func(y, x)
g = lambda x: 0         # lower limit of y
h = lambda x: 1         # upper limit of y
i = scipy.integrate.dblquad(f, 0, 0.5, g, h)
print(i)

output:

(3.0, 6.661338147750939e-14)

Hence, the code above calls scipy.integrate.dblquad(f, 0, 0.5, g, h), where f is the integrand (12 times its first argument, the inner variable), 0 and 0.5 are the limits of the outer variable, and g and h return the limits of the inner variable.
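
Because the inner limits are passed as functions of the outer variable, dblquad can also handle non-rectangular regions. As a sketch with an integrand chosen only for illustration, the code below integrates x*y over the triangle 0 ≤ y ≤ x ≤ 1, whose exact value is 1/8.

```python
from scipy.integrate import dblquad

# Integrand in the (y, x) argument order that dblquad expects.
def integrand(y, x):
    return x * y

# Outer variable x runs from 0 to 1; inner variable y runs from 0 to x.
value, abs_error = dblquad(integrand, 0, 1, lambda x: 0, lambda x: x)

print(value, abs_error)
```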

Triple integrals

A triple integral integrates a function of three variables.
tplquad expects the integrand as func(z, y, x) and evaluates three nested integrals, over z (innermost), y, and x (outermost).

tplquad(func, a, b, gfun, hfun, qfun, rfun)

Here a and b are the limits of x, gfun and hfun are functions of x that return the limits of y, and qfun and rfun are functions of x and y that return the limits of z.

The researcher extends the analysis to three variables by adding a third integral over z.
[Figure: the triple integral of 12x dz dy dx, with z from 0 to 3, y from 0 to 0.5, and x from 0 to 1]
The example adds the variable z, integrated between 0 and 3.
Example for triple integrals

from scipy import integrate
f = lambda z, y, x: 12*x
i = integrate.tplquad(f, 0, 1, lambda x: 0, lambda x: 0.5, lambda x, y: 0, lambda x, y: 3)
print(i)

Output
(9.0, 3.988124156968869e-13)

The code calls integrate.tplquad(f, 0, 1, lambda x: 0, lambda x: 0.5, lambda x, y: 0, lambda x, y: 3), which evaluates three nested integrals: z between 0 and 3, y between 0 and 0.5, and x between 0 and 1.
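
The same triple integral can be cross-checked with nquad from the integration table. This is a sketch rather than a recommended workflow: nquad takes the ranges as a list ordered like the integrand's arguments, so the first range belongs to z, the innermost variable.

```python
from scipy.integrate import nquad, tplquad

def integrand(z, y, x):
    return 12 * x

# tplquad: x in [0, 1], y in [0, 0.5], z in [0, 3]
value_tpl, err_tpl = tplquad(integrand, 0, 1,
                             lambda x: 0, lambda x: 0.5,
                             lambda x, y: 0, lambda x, y: 3)

# nquad: ranges listed in argument order (z, y, x)
value_n, err_n = nquad(integrand, [[0, 3], [0, 0.5], [0, 1]])

print(value_tpl, value_n)
```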

Scipy imread

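In older SciPy releases, an image can be read from a file into an array with scipy.misc.imread, which is also available as scipy.ndimage.imread. The function works only if the Python Imaging Library (PIL) is installed. Both names were deprecated in SciPy 1.0 and removed in SciPy 1.2; the imread function of the imageio package is the suggested replacement.

scipy.ndimage.imread('mario.png')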

Output

array([[[121, 112, 131],
        [138, 129, 148],
        [153, 144, 165],
        ...,
        [119, 126,  74],
        [131, 136,  82],
        [139, 144,  90]],

       [[ 89,  82, 100],
        [110, 103, 121],
        [130, 122, 143],
        ...,
        [118, 125,  71],
        [134, 141,  87],
        [146, 153,  99]],

       [[ 73,  66,  84],
        [ 94,  87, 105],
        [115, 108, 126],
        ...,
        [117, 126,  71],
        [133, 142,  87],
        [144, 153,  98]],

       ...,

       [[ 87, 106,  76],
        [ 94, 110,  81],
        [107, 124,  92],
        ...,
        [120, 158,  97],
        [119, 157,  96],
        [119, 158,  95]],

       [[ 85, 101,  72],
        [ 95, 111,  82],
        [112, 127,  96],
        ...,
        [121, 157,  96],
        [120, 156,  94],
        [120, 156,  94]],

       [[ 85, 101,  74],
        [ 97, 113,  84],
        [111, 126,  97],
        ...,
        [120, 156,  95],
        [119, 155,  93],
        [118, 154,  92]]], dtype=uint8)

In the example above, the image mario.png is read from its saved file and converted into an array of pixel values (rows × columns × colour channels, stored as uint8) so that it can later be used for image processing.
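
Once an image is available as an array, the functions in scipy.ndimage can process it. The sketch below uses a small synthetic array in place of a real image file, which is an assumption made so the example runs without any file on disk, and blurs it with a Gaussian filter.

```python
import numpy as np
from scipy import ndimage

# Synthetic 8x8 RGB image standing in for a file loaded from disk (hypothetical data).
image = np.zeros((8, 8, 3), dtype=np.uint8)
image[3:5, 3:5, :] = 255  # white square in the centre

# Average the colour channels to get a grayscale image, then blur it.
gray = image.mean(axis=2)
blurred = ndimage.gaussian_filter(gray, sigma=1.0)

print(gray.shape, gray.max(), blurred.max())
```

Blurring spreads the bright square over its neighbours, so the peak value of the blurred image is lower than the original 255.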

Optimize minimize in scipy

Optimization is the process of finding the best solution to a given problem, typically the input values that minimize (or maximize) an objective function. The optimization routines are imported as follows:

import numpy as np
from scipy.optimize import minimize

The minimize function provides a common interface to constrained and unconstrained minimization algorithms for multivariate scalar functions in the scipy.optimize sub-package.

scipy.optimize contains several groups of functions:

  • Constrained and unconstrained minimization of multivariate scalar functions (minimize()) using a variety of algorithms (e.g. Nelder-Mead simplex)
  • Least-squares minimization (leastsq()) and curve fitting (curve_fit()) algorithms
  • Multivariate equation system solvers (root()) using a variety of algorithms (e.g. hybrid Powell)
  • Scalar univariate function minimizers (minimize_scalar()) and root finders (newton())
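
The scalar routines in the last bullet are the simplest to try. The sketch below minimizes a one-variable parabola with minimize_scalar and finds the square root of 2 with newton; both functions are chosen purely for illustration.

```python
from scipy.optimize import minimize_scalar, newton

# Minimize (x - 2)^2 + 1, whose minimum is at x = 2.
result = minimize_scalar(lambda x: (x - 2) ** 2 + 1)
x_min = result.x

# Find a root of x^2 - 2 starting from x0 = 1.0. With no derivative supplied,
# newton falls back to the secant method.
sqrt_two = newton(lambda x: x ** 2 - 2, x0=1.0)

print(x_min, sqrt_two)
```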

Use of Nelder–Mead Simplex Algorithm

The Nelder-Mead simplex algorithm is a direct search method applied to nonlinear optimization problems for which the derivatives are not known. We call the minimize() routine with method='nelder-mead' to minimize the Rosenbrock function.

import numpy as np
from scipy.optimize import minimize

def rosen(x):
    # Rosenbrock function; its minimum value 0 is reached when every element of x is 1
    return sum(100.0 * (x[1:] - x[:-1]**2.0)**2.0 + (1 - x[:-1])**2.0)

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
res = minimize(rosen, x0, method='nelder-mead')
print(res.x)

Least Squares

scipy.optimize can also solve least-squares problems, optionally with bounds on the variables.
The following example uses the Rosenbrock function to set up a least-squares problem.

import numpy as np
from scipy.optimize import least_squares

def fun_rosenbrock(x):
    # residuals whose sum of squares is the Rosenbrock function
    return np.array([10 * (x[1] - x[0]**2), (1 - x[0])])

x0 = np.array([2, 2])   # starting point
res = least_squares(fun_rosenbrock, x0)
print(res)

output:

active_mask: array([0., 0.])
cost: 9.866924291084687e-30
fun: array([4.44089210e-15, 1.11022302e-16])
grad: array([-8.89288649e-14, 4.44089210e-14])
jac: array([[-20.00000015,  10.        ],
       [ -1.        ,   0.        ]])
message: '`gtol` termination condition is satisfied.'
nfev: 3
njev: 3
optimality: 8.892886493421953e-14
status: 1
success: True
x: array([1., 1.])
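
The text mentions bounds on the variables, which the example does not use. As a sketch of how they can be supplied, the code below repeats the problem with the constraint x[1] ≥ 1.5 passed through the bounds parameter of least_squares; the bound value is an arbitrary choice for illustration.

```python
import numpy as np
from scipy.optimize import least_squares

def fun_rosenbrock(x):
    return np.array([10 * (x[1] - x[0]**2), (1 - x[0])])

x0 = np.array([2.0, 2.0])

# Lower bounds are (-inf, 1.5); the scalar upper bound np.inf applies to both variables.
res_bounded = least_squares(fun_rosenbrock, x0, bounds=([-np.inf, 1.5], np.inf))

print(res_bounded.x)
```

The unconstrained minimum at (1, 1) is now outside the feasible region, so the solver stops with x[1] on the bound.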

Root finding

Root finding, which scipy.optimize also provides, locates the values of x at which a function equals zero. In the following example we import the root function from the scipy.optimize sub-package and use it to solve 3x + 3cos(x) = 0, starting from the initial guess 0.4.

import numpy as np
from scipy.optimize import root

def func(x):
    return x*3 + 3 * np.cos(x)

sol = root(func, 0.4)
print(sol)

output

fjac: array([[-1.]])
fun: array([0.])
message: 'The solution converged.'
nfev: 10
qtf: array([-1.26965105e-10])
r: array([-5.02083661])
status: 1
success: True
x: array([-0.73908513])
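
root also solves systems of equations when the function takes and returns a vector. The sketch below solves a two-equation system picked only for illustration and checks the solution by substituting it back into the equations.

```python
import numpy as np
from scipy.optimize import root

def system(v):
    # Two equations in two unknowns: x*cos(y) = 4 and x*y - y = 5
    x, y = v
    return [x * np.cos(y) - 4, x * y - y - 5]

sol = root(system, [1.0, 1.0])
residual = system(sol.x)

print(sol.x, sol.success)
```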

Curve fit

Curve fitting uses non-linear least squares to fit a function to data.
The following code generates noisy sample data and plots it.

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)
x_data = np.linspace(-7, 7, num=30)
y_data = 2.9 * np.sin(1.5 * x_data) + np.random.normal(size=30)
plt.figure(figsize=(6, 4))
plt.scatter(x_data, y_data)

Output:

[Figure: scatter plot of the noisy data]

from scipy import optimize

def test_func(x, a, b):
    return a * np.sin(b * x)

params, params_covariance = optimize.curve_fit(test_func, x_data, y_data, p0=[2, 2])
print(params)
plt.figure(figsize=(6, 4))
plt.scatter(x_data, y_data, label='Data')
plt.plot(x_data, test_func(x_data, params[0], params[1]), label='Fitted function')
plt.legend(loc='best')
plt.show()

[Figure: the data with the fitted sine curve]
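
curve_fit returns a covariance matrix alongside the fitted parameters, and the square roots of its diagonal give one-standard-deviation uncertainties. The sketch below shows this without plotting. The noise level (0.1) and the starting guess are assumptions chosen so the fit converges reliably; they differ from the values used above.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a * np.sin(b * x)

# Synthetic data: true parameters a = 2.9, b = 1.5, with mild Gaussian noise.
rng = np.random.default_rng(0)
x_data = np.linspace(-7, 7, 30)
y_data = 2.9 * np.sin(1.5 * x_data) + 0.1 * rng.normal(size=30)

params, params_covariance = curve_fit(model, x_data, y_data, p0=[2.5, 1.45])

# One-standard-deviation uncertainty of each fitted parameter.
std_errors = np.sqrt(np.diag(params_covariance))

print(params, std_errors)
```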

Interpolation

Interpolation is the process of estimating values between known data points on a curve or a line. The code below generates the sample points to interpolate between and plots them.

import numpy as np
from scipy import interpolate
import matplotlib.pyplot as plt
x = np.linspace(0, 7, 15)
y = np.tan(x**2/3 + 4)
print(x, y)

Output:

[0.  0.5 1.  1.5 2.  2.5 3.  3.5 4.  4.5 5.  5.5 6.  6.5 7. ]
[  1.15782128   1.37425055   2.51055609 -26.57541423  -1.39794519
  -0.20255593   0.87144798  -4.28339592  -0.09170037   3.98989845
  -0.23734945  18.55780867   0.30063224  -0.96236589  11.46273345]

plt.plot(x, y, 'o')
plt.show()

[Figure: the sample points plotted as circles]
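
To actually estimate values between the sample points, scipy.interpolate offers interp1d and CubicSpline, among others. The sketch below uses a smooth cosine data set, which is an assumption made for illustration and not the data printed above, and evaluates a linear and a cubic interpolant on a finer grid.

```python
import numpy as np
from scipy import interpolate

# Known sample points (hypothetical smooth data).
x = np.linspace(0, 4, 12)
y = np.cos(x**2 / 3 + 4)

# Build two interpolants from the same samples.
linear = interpolate.interp1d(x, y, kind='linear')
cubic = interpolate.CubicSpline(x, y)

# Evaluate both on a finer grid that stays inside the sampled range.
x_new = np.linspace(0, 4, 50)
y_linear = linear(x_new)
y_cubic = cubic(x_new)

print(y_linear[:3], y_cubic[:3])
```

Both interpolants pass exactly through the original samples; they differ only in how they fill the gaps between them.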

Scipy-Stats

This sub-package contains a large number of probability distributions as well as a growing library of statistical functions.
The distributions are built on the following generic classes:

• rv_continuous
• rv_discrete
• rv_histogram

rv_continuous

This is a generic continuous random variable class meant for subclassing.

rv_continuous([momtype, a, b, xtol, …])

A custom continuous distribution is defined by subclassing rv_continuous and overriding a method such as _pdf. The constructor parameters configure it; for example, a and b set the lower and upper bounds of the support.
A simple demonstration of rv_continuous:

import scipy.stats as st

class my_pdf(st.rv_continuous):
    def _pdf(self, x):
        return 3*x**2   # a valid density on the interval [0, 1]

my_cv = my_pdf(a=0, b=1, name='my_pdf')
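
An instance of such a subclass behaves like any other distribution in scipy.stats. The sketch below restricts the support to [0, 1], where 3x² is a valid density (this support is an assumption needed for the density to integrate to 1), and queries the cumulative distribution function and the mean.

```python
import scipy.stats as st

class my_pdf(st.rv_continuous):
    def _pdf(self, x):
        return 3 * x**2

# a and b set the support to [0, 1].
my_cv = my_pdf(a=0, b=1, name='my_pdf')

cdf_half = my_cv.cdf(0.5)   # integral of 3x^2 from 0 to 0.5, i.e. 0.125
mean = my_cv.mean()         # integral of x * 3x^2 from 0 to 1, i.e. 0.75

print(cdf_half, mean)
```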

rv_discrete

This is a generic discrete random variable class meant for subclassing.

rv_discrete([a, b, name, badvalue, …])

A discrete random variable is represented by rv_discrete. Besides subclassing, a distribution can be built directly by passing the possible values and their probabilities through the values parameter.
The following example demonstrates rv_discrete from scipy.stats:

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

x = np.arange(7)
y = (0.2, 0.3, 0.1, 0.1, 0.1, 0.0, 0.2)
custm = stats.rv_discrete(name='custm', values=(x, y))
fig, ax = plt.subplots(1, 1)
ax.plot(x, custm.pmf(x), 'ro', ms=12, mec='y')
ax.vlines(x, 0, custm.pmf(x), colors='b', lw=4)
plt.show()

[Figure: probability mass function of the custom discrete distribution]

The graph plots the 7 values of x along the horizontal axis, with the probability of each value shown as its height along the y axis.
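
The same distribution object can be queried numerically without plotting. This sketch computes the mean and the cumulative probability and draws a few random samples; the seed passed to random_state is arbitrary.

```python
import numpy as np
from scipy import stats

x = np.arange(7)
y = (0.2, 0.3, 0.1, 0.1, 0.1, 0.0, 0.2)
custm = stats.rv_discrete(name='custm', values=(x, y))

mean = custm.mean()          # sum of value * probability = 2.4
cdf_at_2 = custm.cdf(2)      # P(X <= 2) = 0.2 + 0.3 + 0.1 = 0.6
samples = custm.rvs(size=5, random_state=0)

print(mean, cdf_at_2, samples)
```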

rv_histogram

rv_histogram generates a distribution from a histogram.

rv_histogram(histogram, *args, **kwargs)

The histogram argument is the tuple returned by numpy.histogram, that is, the bin counts and the bin edges.
The following example demonstrates rv_histogram.

import scipy.stats
import numpy as np
import matplotlib.pyplot as plt
data = scipy.stats.norm.rvs(size=1000, loc=0, scale=1.0, random_state=123)
hist = np.histogram(data, bins=100)
hist_dist = scipy.stats.rv_histogram(hist)
X = np.linspace(-5.0, 5.0, 100)
plt.hist(data, density=True, bins=100)
plt.plot(X, hist_dist.pdf(X), label='PDF')
plt.plot(X, hist_dist.cdf(X), label='CDF')
plt.legend()
plt.show()

[Figure: histogram of the data with the PDF and CDF curves]

The graph shows the histogram of the sampled data together with the probability density function (PDF), which follows the histogram, and the cumulative distribution function (CDF), which rises gradually from 0 and then levels off at 1.
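
The histogram-based distribution supports the usual distribution methods as well. As a sketch, the code below rebuilds the same distribution and reads off a few numbers; because the data are drawn from a standard normal distribution, the values should land close to 0.5, 0 and 0 respectively, though not exactly.

```python
import numpy as np
import scipy.stats

data = scipy.stats.norm.rvs(size=1000, loc=0, scale=1.0, random_state=123)
hist = np.histogram(data, bins=100)
hist_dist = scipy.stats.rv_histogram(hist)

cdf_at_zero = hist_dist.cdf(0.0)   # fraction of probability mass below 0
median = hist_dist.ppf(0.5)        # 50th percentile
mean = hist_dist.mean()

print(cdf_at_zero, median, mean)
```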

Sparse matrix

Sparse matrices support arithmetic operations such as addition, subtraction, multiplication, division, and matrix power. SciPy implements several sparse matrix formats, including these four:

  • Compressed Sparse Row [CSR]
  • Compressed Sparse Column [CSC]
  • Coordinate format [COO]
  • Dictionary of keys [DOK]

scipy.sparse.csr_matrix

This format enables efficient row slicing. Let us see a simple program where we generate an empty 3×3 CSR matrix using scipy.sparse.

import numpy as np
from scipy.sparse import csr_matrix
csr_matrix((3, 3), dtype=np.int8).toarray()

Output:

array([[0, 0, 0],
[0, 0, 0],
[0, 0, 0]], dtype=int8)

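The next example builds a 3×3 CSR matrix from explicit row indices, column indices, and data values. Entries that share the same position are summed, which is why position (1, 0) holds 2 + 5 = 7.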

row = np.array([0, 1, 0, 2, 1, 1])
col = np.array([1, 0, 2, 0, 0, 2])
data = np.array([1, 2, 3, 4, 5, 6])
csr_matrix((data, (row, col)), shape=(3, 3)).toarray()

Output:

array([[0, 1, 3],
[7, 0, 6],
[4, 0, 0]], dtype=int32)

scipy.sparse.csc_matrix

This format enables efficient column slicing. Let us see a simple program where we generate an empty 3×3 CSC matrix using scipy.sparse.

import numpy as np
from scipy.sparse import csc_matrix
csc_matrix((3, 3), dtype=np.int8).toarray()

Output:

array([[0, 0, 0],
[0, 0, 0],
[0, 0, 0]], dtype=int8)

The next example builds a 3×3 CSC matrix from explicit row indices, column indices, and data values. Entries that share the same position are summed.

row = np.array([0, 1, 1, 2, 1, 2])
col = np.array([1, 1, 1, 2, 0, 2])
data = np.array([1, 2, 3, 4, 5, 6])
csc_matrix((data, (row, col)), shape=(3, 3)).toarray()

Output:

array([[ 0, 1, 0],
[ 5, 5, 0],
[ 0, 0, 10]], dtype=int32)

A Compressed Sparse Column matrix is stored column by column, as arrays of the row indices and values found in each column, which makes column operations and access to column vectors efficient.

A Compressed Sparse Row matrix is the opposite: it is stored row by row, as arrays of the column indices and values found in each row, which makes row operations and access to row vectors efficient.

CSR and CSC matrices are difficult to construct from scratch, whereas COO and DOK matrices are easier to construct.

scipy.sparse.coo_matrix

The coordinate format is convenient for constructing sparse matrices quickly. Let us see a simple program where we generate an empty 3×3 COO matrix using scipy.sparse.

import numpy as np
from scipy.sparse import coo_matrix
coo_matrix((3, 3), dtype=np.int8).toarray()

Output:

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]], dtype=int8)

The next example builds a 3×3 COO matrix from explicit row indices, column indices, and data values.

row = np.array([1, 1, 1, 2, 1, 2])
col = np.array([0, 1, 1, 2, 0, 2])
data = np.array([0, 2, 3, 4, 5, 6])
coo_matrix((data, (row, col)), shape=(3, 3)).toarray()

Output:

array([[ 0,  0,  0],
       [ 5,  5,  0],
       [ 0,  0, 10]])
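
Since COO is easy to build and CSR is efficient for row access, a common pattern is to construct a matrix in COO and convert it before doing arithmetic. The sketch below illustrates that pattern with a small matrix made up for this example.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Build the matrix [[0, 1, 3], [7, 0, 0], [4, 0, 0]] in coordinate format.
row = np.array([0, 0, 1, 2])
col = np.array([1, 2, 0, 0])
data = np.array([1, 3, 7, 4])
coo = coo_matrix((data, (row, col)), shape=(3, 3))

# Convert to CSR for row slicing and matrix-vector products.
csr = coo.tocsr()
second_row = csr[1, :].toarray()
product = csr @ np.array([1, 1, 1])   # row sums

print(second_row, product)
```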

scipy.sparse.dok_matrix

The dok_matrix class in the scipy.sparse sub-package lets us construct a sparse matrix incrementally in an efficient manner.

import numpy as np
from scipy.sparse import dok_matrix

The dictionary-of-keys format allows individual values within the matrix to be accessed and set directly.
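
A minimal sketch of incremental construction follows. The fill pattern (every off-diagonal entry set to i + j) is arbitrary and chosen only to show element-wise assignment and lookup.

```python
import numpy as np
from scipy.sparse import dok_matrix

# Start from an empty 3x3 matrix and fill it one element at a time.
S = dok_matrix((3, 3), dtype=np.float32)
for i in range(3):
    for j in range(3):
        if i != j:
            S[i, j] = i + j

single_value = S[0, 1]   # read back an individual element
stored = S.nnz           # number of stored (non-zero) entries
dense = S.toarray()

print(single_value, stored)
print(dense)
```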

Fast Fourier Transforms

Fourier analysis is a method for expressing a function as a sum of periodic components and for recovering the signal from those components. When both the function and its Fourier transform are replaced with discrete counterparts, the result is called the discrete Fourier transform (DFT). The fast Fourier transform (FFT) is an efficient algorithm for computing it.

We make use of the Fourier transform sub-package scipy.fftpack.

import numpy as np
from scipy.fftpack import fft, ifft
x = np.array([1.0, 2.0, 1.0, -1.0, 1.5])
y = fft(x)
print(y)

Output:

[ 4.5       +0.j          2.08155948-1.65109876j -1.83155948+1.60822041j
 -1.83155948-1.60822041j  2.08155948+1.65109876j]

The fast Fourier transform is performed on the given array of values, producing the output above.
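
The ifft function imported alongside fft recovers the signal from its components. As a quick sketch, the round trip below transforms the same array and inverts it again.

```python
import numpy as np
from scipy.fftpack import fft, ifft

x = np.array([1.0, 2.0, 1.0, -1.0, 1.5])

y = fft(x)            # forward transform: complex spectrum
x_back = ifft(y)      # inverse transform: complex array with ~0 imaginary part

print(np.real(x_back))
```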

One-dimensional discrete Fourier transform

We can compute one-dimensional Fourier transforms with the following functions:

fft(x[, n, axis, overwrite_x])
ifft(x[, n, axis, overwrite_x])

We use fft for the one-dimensional discrete Fourier transform and ifft for the one-dimensional inverse discrete Fourier transform.
The following example computes the one-dimensional discrete Fourier transform of the sum of two cosines.

import numpy as np
import matplotlib.pyplot as plt
from scipy.fftpack import fft

N = 600            # number of sample points
T = 1.0 / 800.0    # sample spacing
x = np.linspace(0.0, N*T, N)
y = np.cos(70.0 * 2.0*np.pi*x) + 0.5*np.cos(90.0 * 2.0*np.pi*x)
yf = fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N//2)
plt.plot(xf, 2.0/N * np.abs(yf[0:N//2]))
plt.show()

[Figure: magnitude spectrum of the signal]

The signal is the sum of two cosines, and its spectrum shows two peaks along the x axis, at frequencies of 70 Hz and 90 Hz (taking x in seconds). The second peak is about half as high as the first because the second cosine has half the amplitude.
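
The peak frequencies can also be read off numerically with fftfreq, which returns the frequency of each FFT bin. In this sketch N is set to 800 instead of 600, an assumption made so that the bins are spaced exactly 1 Hz apart and both cosines fall on a bin.

```python
import numpy as np
from scipy.fftpack import fft, fftfreq

N = 800            # number of samples (assumed here so bins are 1 Hz apart)
T = 1.0 / 800.0    # sample spacing in seconds
t = np.arange(N) * T
y = np.cos(70.0 * 2.0 * np.pi * t) + 0.5 * np.cos(90.0 * 2.0 * np.pi * t)

yf = fft(y)
freqs = fftfreq(N, d=T)

# Keep the non-negative frequencies and scale to the cosine amplitudes.
half = N // 2
amplitude = 2.0 / N * np.abs(yf[:half])

# The two largest bins mark the two cosine frequencies.
peak_bins = np.argsort(amplitude)[-2:]
peak_freqs = np.sort(freqs[:half][peak_bins])

print(peak_freqs)
```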

Two-dimensional discrete Fourier transform and N-dimensional discrete Fourier transform

We can compute two-dimensional Fourier transforms with the following functions:

fft2(x[, shape, axes, overwrite_x])
ifft2(x[, shape, axes, overwrite_x])

We use fft2 for the two-dimensional discrete Fourier transform and ifft2 for the two-dimensional inverse discrete Fourier transform.

fftn(x[, shape, axes, overwrite_x])
ifftn(x[, shape, axes, overwrite_x])

We use fftn for the N-dimensional discrete Fourier transform and ifftn for the N-dimensional inverse discrete Fourier transform.
Let us consider a simple example that produces time-domain signals from two-dimensional spectra with the inverse FFT.

import numpy as np
from scipy.fftpack import ifftn
import matplotlib.pyplot as plt
import matplotlib.cm as cm
N = 30
f, ((ax1, ax2, ax3), (ax4, ax5, ax6)) = plt.subplots(2, 3, sharex='col', sharey='row')
xf = np.zeros((N, N))
xf[0, 5] = 10
xf[0, N-5] = 10
Z = ifftn(xf)
ax1.imshow(xf, cmap=cm.Reds)
ax4.imshow(np.real(Z), cmap=cm.gray)
xf = np.zeros((N, N))
xf[5, 0] = 10
xf[N-5, 0] = 10   # mirrors the entry at [5, 0] so that Z is real
Z = ifftn(xf)
ax2.imshow(xf, cmap=cm.Reds)
ax5.imshow(np.real(Z), cmap=cm.gray)
xf = np.zeros((N, N))
xf[5, 10] = 10
xf[N-5, N-10] = 10
Z = ifftn(xf)
ax3.imshow(xf, cmap=cm.Reds)
ax6.imshow(np.real(Z), cmap=cm.gray)
plt.show()

The figure produced by this code shows three frequency-domain inputs (top row) and the time-domain signals obtained from them with the inverse fast Fourier transform (bottom row).
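
The spectra in this example place each pair of non-zero entries at mirrored positions, which is what makes the inverse transform real-valued. The sketch below checks that property for the first spectrum using ifft2, and confirms that fft2 recovers the original array.

```python
import numpy as np
from scipy.fftpack import fft2, ifft2

N = 30
xf = np.zeros((N, N))
xf[0, 5] = 10
xf[0, N - 5] = 10      # mirror of the entry at [0, 5]

Z = ifft2(xf)

# For a mirrored (Hermitian-symmetric) spectrum the imaginary part vanishes.
max_imag = np.abs(Z.imag).max()

# The forward transform of Z gives the spectrum back.
roundtrip = fft2(Z)

print(max_imag)
```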

Conclusion

Here we come to the end of the tutorial. I hope it has provided sufficient information about SciPy, Python's library for scientific and technical computing. Many calculations are time-consuming and error-prone when done by hand. Carrying them out with a scientific library in Python makes them fast and easy, which is why SciPy plays a vital role in data science.
I suggest that you practice the examples illustrated with each concept and try to implement your own in order to understand the concepts better.
