24-5. Introduction to SciPy

What is SciPy?

SciPy is a robust library designed for scientific and technical computing, built on top of NumPy. It extends NumPy's functionality with advanced tools for tasks like optimization, integration, interpolation, eigenvalue problems, and signal and image processing.

  • SciPy stands for Scientific Python.
  • It is open-source, making it freely accessible and modifiable by the community.
  • Created by Travis Oliphant, the same developer behind NumPy, SciPy has become a cornerstone of the Python scientific computing ecosystem.

Key Features of SciPy

SciPy offers a wide range of functionality, including:

  • Optimization: Tools for minimizing or maximizing functions and solving systems of equations.
  • Integration: Numerical integration of functions and solving ordinary differential equations (ODEs).
  • Linear Algebra: Decomposition and matrix operations beyond what's available in NumPy.
  • Statistics: Advanced statistical analysis tools, including probability distributions and hypothesis testing.
  • Signal and Image Processing: Functions for filtering, Fourier transforms, and working with multi-dimensional image data.
  • Sparse Matrices: Efficient storage and operations for matrices with a large number of zero entries.
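As a rough illustration of two of the features listed above, the following sketch integrates a function numerically with scipy.integrate.quad and solves a small linear system with scipy.linalg.solve. The integrand and the 2x2 system are illustrative choices, not examples from this chapter.

```python
import numpy as np
from scipy import integrate, linalg

# Integration: approximate the integral of x**2 from 0 to 1 (exact value is 1/3).
# quad returns the estimate and an estimate of the absolute error.
area, abs_error = integrate.quad(lambda x: x**2, 0, 1)
print(area)

# Linear algebra: solve A @ x = b for a small, illustrative system:
#   3x + y = 9
#    x + 2y = 8
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
solution = linalg.solve(A, b)
print(solution)
```

The other feature areas follow the same pattern: import the relevant submodule (for example scipy.stats or scipy.signal) and call its functions on NumPy arrays.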

Why Use SciPy?

Although SciPy is built on NumPy, it adds optimized and specialized functionality tailored to common scientific and data analysis workflows. Key benefits include:

  • Efficiency: Many SciPy routines wrap mature, compiled numerical libraries, so they are typically much faster than equivalent code written in pure Python.
  • Breadth: SciPy provides tools for advanced tasks not covered by NumPy, reducing the need for additional libraries.
  • Ease of Use: SciPy's APIs are designed to be intuitive, making complex mathematical operations accessible to users with basic Python knowledge.

Applications of SciPy

SciPy is widely used in fields such as:

  • Physics, engineering, and mathematics for computational modeling.
  • Data science and machine learning for statistical analysis and preprocessing.
  • Signal and image processing for feature extraction and transformation.
  • Economics and finance for optimization and integration tasks.

Language and Codebase

  • Primary Language: SciPy is primarily written in Python for usability, with performance-critical components implemented in compiled languages such as C, C++, and Fortran for speed.
  • Source Code: The complete SciPy codebase is hosted on GitHub, allowing developers to contribute or explore its implementation. You can find it at https://github.com/scipy/scipy.

Why SciPy Matters

SciPy is more than just a library; it’s a tool that bridges the gap between theoretical computation and practical implementation. Its seamless integration with other Python libraries like NumPy, Matplotlib, and Pandas makes it a cornerstone of the Python scientific stack. Whether you're a researcher, engineer, or data scientist, SciPy empowers you to tackle complex computational problems with ease and precision.

Installing and Using SciPy

Installing SciPy

If Python and pip are already installed on your system, installing SciPy is straightforward.

Use the following command to install SciPy:

pip install scipy

If the installation fails, consider using a scientific Python distribution such as Anaconda, which comes with SciPy pre-installed and is well suited to scientific computing. The Spyder IDE is commonly bundled with such distributions.

Importing SciPy 

Once SciPy is installed, you can import specific modules as needed using the from scipy import module syntax.

Example: Importing and Using the constants Module

The constants module in SciPy provides a wide range of mathematical and physical constants.

from scipy import constants

print(constants.liter)

In this example, the constants.liter constant returns the equivalent of 1 liter in cubic meters. SciPy's constants module is a convenient resource for precise mathematical and physical values.
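Because constants.liter is simply the number of cubic meters in one liter, converting a quantity is a single multiplication. The sketch below assumes a made-up volume of 2.5 liters purely for illustration.

```python
from scipy import constants

# Hypothetical quantity to convert; any number of liters works the same way.
volume_liters = 2.5

# Multiply by the constant to express the volume in cubic meters.
volume_m3 = volume_liters * constants.liter
print(volume_m3)
```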

Checking the Installed SciPy Version

You can verify the installed version of SciPy using the __version__ attribute.

Example:

import scipy

# Print the SciPy version
print(scipy.__version__)

This command outputs the current version of SciPy, ensuring compatibility with your project or system requirements.
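If a project needs a minimum SciPy release, the version string can also be checked in code. The following is a minimal sketch that assumes the usual "major.minor.patch" layout of the string. The minimum of 1.5 is a hypothetical requirement chosen for illustration.

```python
import scipy

# Take the first two dot-separated fields, e.g. "1.11.4" -> (1, 11).
major, minor = (int(part) for part in scipy.__version__.split(".")[:2])

# Hypothetical minimum version required by a project.
required = (1, 5)
meets_minimum = (major, minor) >= required
print(meets_minimum)
```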

Constants in SciPy

Because SciPy focuses on scientific applications, it provides many built-in scientific constants.

These constants can be helpful when you are working on data science problems.

from scipy import constants

print(constants.pi)

Output:

3.141592653589793

Constant Units

A list of all units under the constants module can be seen using the dir() function.

from scipy import constants

print(dir(constants))

The output is a long list of names, and its exact contents depend on the installed SciPy version.
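The full dir() listing also contains private names, functions, and submodules. One way to narrow it down, sketched here as an assumption about what is useful rather than an official recipe, is to keep only public names whose value is a plain number.

```python
from scipy import constants

# Keep public names (no leading underscore) whose value is an int or float.
numeric_names = [
    name
    for name in dir(constants)
    if not name.startswith("_")
    and isinstance(getattr(constants, name), (int, float))
]

print(len(numeric_names))   # how many numeric constants were found
print(numeric_names[:10])   # a short preview of the list
```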

Unit Categories

The units are placed under these categories:

  • Metric
  • Binary
  • Mass
  • Angle
  • Time
  • Length
  • Pressure
  • Volume
  • Speed
  • Temperature
  • Energy
  • Power
  • Force

Metric (SI) Prefixes:

Each prefix returns its multiplier as a plain number (e.g. centi returns 0.01)

from scipy import constants

print(constants.yotta)
print(constants.zetta)
print(constants.exa)
print(constants.peta)
print(constants.tera)
print(constants.giga)
print(constants.mega)
print(constants.kilo)
print(constants.hecto)
print(constants.deka)
print(constants.deci)
print(constants.centi)
print(constants.milli)
print(constants.micro)
print(constants.nano)
print(constants.pico)
print(constants.femto)
print(constants.atto)
print(constants.zepto)

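Output:

1e+24
1e+21
1e+18
1000000000000000.0
1000000000000.0
1000000000.0
1000000.0
1000.0
100.0
10.0
0.1
0.01
0.001
1e-06
1e-09
1e-12
1e-15
1e-18
1e-21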
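Since each prefix is just a multiplier, it can be combined with any quantity. The sketch below uses made-up values (5 kilometers, 3 megabytes) to show scaling up with a prefix and dividing by a prefix to change units.

```python
from scipy import constants

# A hypothetical distance of 5 kilometers, expressed in meters.
distance_m = 5 * constants.kilo

# A hypothetical size of 3 megabytes (decimal prefix), expressed in bytes.
size_bytes = 3 * constants.mega

# Dividing by a prefix converts the other way: meters to centimeters.
distance_cm = distance_m / constants.centi

print(distance_m, size_bytes, distance_cm)
```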

Length:

The final example is of length. Return the specified unit in meters (e.g. nautical_mile returns 1852.0)

from scipy import constants

print(constants.inch)
print(constants.foot)
print(constants.yard)
print(constants.mile)
print(constants.mil)
print(constants.pt)
print(constants.point)
print(constants.survey_foot)
print(constants.survey_mile)
print(constants.nautical_mile)
print(constants.fermi)
print(constants.angstrom)
print(constants.micron)
print(constants.au)
print(constants.astronomical_unit)
print(constants.light_year)
print(constants.parsec)

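Each line of the output is the length of the corresponding unit in meters. For example, constants.inch prints 0.0254.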
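Because every length constant is expressed in meters, converting between any two units amounts to multiplying by one constant and dividing by another. The helper convert_length below is hypothetical (it is not part of SciPy) and assumes both names refer to length constants in scipy.constants.

```python
from scipy import constants


def convert_length(value, from_unit, to_unit):
    """Convert a length between two units named in scipy.constants.

    Hypothetical helper: both names must be length constants,
    which scipy.constants expresses in meters.
    """
    meters = value * getattr(constants, from_unit)
    return meters / getattr(constants, to_unit)


# Illustrative use: a marathon of roughly 26.2 miles in nautical miles.
marathon_miles = 26.2
print(convert_length(marathon_miles, "mile", "nautical_mile"))
```

Looking the constants up by name keeps the helper short, at the cost of raising AttributeError if a name is misspelled.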

SciPy Optimizers

What are Optimizers in SciPy?

Optimizers in SciPy are a collection of tools designed to either:

  • Minimize a function: Find the input values that result in the lowest output for a given function.
  • Solve equations: Locate the roots of equations, where the function evaluates to zero.

These tools are essential in scientific computing and machine learning, where optimization tasks and equation solving are fundamental.

Optimizing Functions

In machine learning and data analysis, algorithms often involve complex equations that need to be minimized or optimized based on the provided data. SciPy's optimization functions streamline this process.
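As a minimal sketch of function minimization, the example below uses scipy.optimize.minimize with the BFGS method. The objective x² + x + 2, the starting point, and the choice of method are all illustrative assumptions. Its true minimum is at x = -0.5, where the value is 1.75.

```python
from scipy.optimize import minimize


def objective(x):
    # minimize passes x as a 1-D array; this function uses its single element.
    return x[0] ** 2 + x[0] + 2


# Start the search from an arbitrary initial guess of 0.
result = minimize(objective, x0=[0.0], method="BFGS")

print(result.success)  # whether the optimizer reports convergence
print(result.x)        # location of the minimum found
print(result.fun)      # objective value at that location
```

The returned object also carries diagnostic fields such as the number of iterations, which help when a search fails to converge.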

Finding Roots of an Equation

While NumPy can handle roots for polynomials and linear equations, it lacks functionality for solving non-linear equations. SciPy addresses this limitation with the optimize.root function.

Using optimize.root

The optimize.root function locates the root of a non-linear equation. It requires:

  • fun: A callable function representing the equation.
  • x0: An initial guess for the root.

The function returns an object containing information about the solution. The actual root can be accessed using the .x attribute of the returned object.

Example

from scipy.optimize import root
import numpy as np

def eqn(x):
  # root passes x as an array, so use NumPy's cos rather than math.cos
  return x + np.cos(x)

myroot = root(eqn, 0)

print(myroot.x)

Output:

[-0.73908513]
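The object returned by root holds more than the solution itself. The sketch below checks whether the solver reports success and substitutes the root back into the equation to confirm the residual is close to zero. It uses numpy.cos so that the function accepts the array that root passes in, which is a choice made for this sketch.

```python
import numpy as np
from scipy.optimize import root


def eqn(x):
    return x + np.cos(x)


sol = root(eqn, 0)

print(sol.success)  # True when the solver reports convergence
print(sol.message)  # human-readable status from the solver

# Substitute the root back in; the result should be very close to zero.
residual = eqn(sol.x[0])
print(residual)
```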

SciPy Sparse Data

What is Sparse Data

Sparse data is data in which most elements are unused (elements that don't carry any information).

It can be an array like this one:

[1, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 0]

  • Sparse Data: a data set where most of the item values are zero.
  • Dense Array: the opposite of a sparse array, where most of the values are not zero.

In scientific computing, when we are dealing with partial derivatives in linear algebra we will come across sparse data.
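One simple way to quantify how sparse an array is, sketched here with plain NumPy, is to compute the fraction of entries that are zero. The array is the example shown above. The name sparsity is just a label used in this sketch.

```python
import numpy as np

arr = np.array([1, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 0])

# Count the entries that actually carry information.
nonzero = np.count_nonzero(arr)

# Fraction of entries that are zero: 9 of 12 here.
sparsity = 1 - nonzero / arr.size

print(nonzero)   # 3
print(sparsity)  # 0.75
```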

How to Work With Sparse Data

SciPy has a module, scipy.sparse, that provides functions to deal with sparse data.

There are primarily two types of sparse matrices that we use:

  • CSC - Compressed Sparse Column. Suited to efficient arithmetic and fast column slicing.
  • CSR - Compressed Sparse Row. Suited to fast row slicing and faster matrix-vector products.

We will use the CSR matrix in this tutorial.
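To make the difference between the two formats concrete, the sketch below builds the same small matrix in both and performs the operation each is suited to. The 3x3 matrix and the vector of ones are illustrative values. csr_matrix and csc_matrix are the constructors from scipy.sparse.

```python
import numpy as np
from scipy.sparse import csc_matrix, csr_matrix

# A small, illustrative matrix with three non-zero entries.
dense = np.array([[0, 0, 1],
                  [0, 2, 0],
                  [3, 0, 0]])

row_major = csr_matrix(dense)  # compressed by rows
col_major = csc_matrix(dense)  # compressed by columns

# Row slicing is the natural operation for CSR.
first_row = row_major[0, :].toarray()

# Column slicing is the natural operation for CSC.
first_col = col_major[:, 0].toarray()

# Matrix-vector product using the CSR form.
product = row_major @ np.array([1, 1, 1])

print(first_row)
print(first_col)
print(product)
```

Both formats hold the same values, so tocsr() and tocsc() can convert between them when a different access pattern is needed.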

CSR Matrix

We can create a CSR matrix by passing an array into the function scipy.sparse.csr_matrix().

import numpy as np
from scipy.sparse import csr_matrix

arr = np.array([0, 0, 0, 0, 0, 1, 1, 0, 2])

print(csr_matrix(arr))

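The exact layout of the output depends on the SciPy version, but it lists each stored entry as a (row, column) position followed by its value: 1 at (0, 5), 1 at (0, 6), and 2 at (0, 8).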
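Beyond printing, a CSR matrix exposes its stored values and layout through attributes and methods. The sketch below inspects the same array. It relies on the shape, data, and indices attributes and the count_nonzero() and toarray() methods of SciPy's CSR matrices.

```python
import numpy as np
from scipy.sparse import csr_matrix

arr = np.array([0, 0, 0, 0, 0, 1, 1, 0, 2])
sparse = csr_matrix(arr)

print(sparse.shape)            # a 1-D input becomes a single-row matrix
print(sparse.data)             # the stored non-zero values
print(sparse.indices)          # column index of each stored value
print(sparse.count_nonzero())  # number of non-zero entries
print(sparse.toarray())        # convert back to a dense 2-D array
```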