NumPy, short for Numerical Python, is an open-source Python library that provides support for large, multi-dimensional arrays and matrices. It supplies tools for numerical computation and underpins or interoperates with many other data science libraries, such as pandas, scikit-learn, and TensorFlow, which makes it essential for anyone working in data science. Beyond storing arrays, NumPy offers functions for manipulating them, performing mathematical operations on them, and processing data efficiently.
The library is particularly useful for:
- Handling large datasets: NumPy arrays store elements of a single data type in a compact memory block, which keeps storage small and computation fast on large datasets.
- Vectorized computations: Operations apply to entire arrays at once, which removes the need for most explicit Python loops.
- Integration with other libraries: Many Python data science libraries use NumPy arrays as their base data structure.
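As a minimal sketch of the vectorized style described above, the following compares an explicit Python loop with the equivalent single NumPy expression. The names `prices` and `tax_rate` and their values are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
prices = [10.0, 20.0, 30.0]
tax_rate = 0.5

# Plain Python: visit every element in an explicit loop.
with_tax_loop = [p * (1 + tax_rate) for p in prices]

# NumPy: one expression is applied to the whole array at once.
prices_arr = np.array(prices)
with_tax_vec = prices_arr * (1 + tax_rate)

print(with_tax_loop)  # [15.0, 30.0, 45.0]
print(with_tax_vec)   # [15. 30. 45.]
```

Both versions compute the same values; the NumPy version pushes the loop into compiled code, which is usually where the speed advantage on large arrays comes from.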
Key Features of NumPy
NumPy offers a range of features that make it a preferred choice over Python lists for numerical computations. Some of its most notable features include:
- Powerful N-dimensional array object: Enables efficient storage and manipulation of multi-dimensional data.
- Sophisticated broadcasting functions: Simplify mathematical operations on arrays of different shapes.
- Integration with C/C++ and Fortran: Provides tools for seamless interaction with low-level code for enhanced performance.
- Advanced mathematical capabilities: Includes functions for linear algebra, Fourier transforms, and random number generation.
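To make the broadcasting feature in the list above concrete, here is a small sketch that adds arrays of different shapes. The array names and values are illustrative assumptions, not taken from the original article.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])     # shape (2, 3)
row = np.array([10, 20, 30])       # shape (3,)
col = np.array([[100], [200]])     # shape (2, 1)

# The row is stretched across both rows of the matrix.
print(matrix + row)
# [[11 22 33]
#  [14 25 36]]

# The column is stretched across all three columns of the matrix.
print(matrix + col)
# [[101 102 103]
#  [204 205 206]]
```

No copies of `row` or `col` are written out by hand; NumPy aligns the shapes from the trailing dimension and repeats the smaller array where a dimension has length 1 or is missing.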
In addition to its scientific applications, NumPy serves as an efficient multi-dimensional container for generic data. It supports the definition of arbitrary data types, allowing it to integrate seamlessly and efficiently with diverse databases.
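One way to read "arbitrary data types" is NumPy's structured dtypes, which describe record-like elements. The sketch below assumes a hypothetical record layout (`name`, `age`) purely for illustration.

```python
import numpy as np

# Hypothetical record layout: a name of up to 10 characters and a 32-bit integer age.
person_dtype = np.dtype([("name", "U10"), ("age", "i4")])

people = np.array([("Ada", 36), ("Linus", 28)], dtype=person_dtype)

# Fields are accessed by name and behave like ordinary arrays.
print(people["name"])        # ['Ada' 'Linus']
print(people["age"].mean())  # 32.0
```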
Install Python NumPy
NumPy can be installed on Windows, macOS, and Linux with the following pip command:
pip install numpy
On Windows, pip is included with the standard Python installer, so the same command works there and no separate NumPy installer is needed.
Once NumPy is installed, import it in your applications by adding the import keyword:
import numpy
Now NumPy is imported and ready to use.
import numpy
print(numpy.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
Output:
[ 1  2  3  4  5  6  7  8  9 10]
Arrays in NumPy
The core structure in NumPy is the homogeneous multidimensional array, which forms the foundation of its functionality.
- Definition: A NumPy array is a table of elements (typically numbers), all of the same data type, indexed by a tuple of non-negative integers.
- Axes and Rank: In NumPy, the dimensions of an array are called axes, and the number of axes is called the rank (available as the ndim attribute).
- Array Class: NumPy arrays are instances of the ndarray class, also known by the alias array; they are usually created with the numpy.array() function.
Example:
In this example, a two-dimensional array is created; it has a rank of 2 because it has 2 axes.
The first axis (dimension) has length 2, which is the number of rows, and the second axis (dimension) has length 3, which is the number of columns. The overall shape of the array is (2, 3).
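The array described above can be sketched as follows. The element values are an assumption made for illustration; only the shape (2, 3) comes from the text.

```python
import numpy as np

# Two rows and three columns; the values themselves are illustrative.
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

print(type(arr))  # <class 'numpy.ndarray'>
print(arr.ndim)   # 2      (number of axes, i.e. the rank)
print(arr.shape)  # (2, 3) (length of each axis)
print(arr.size)   # 6      (total number of elements)
```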
NumPy Array Creation
There are several ways to create a NumPy array in Python:
- Create NumPy Array with List and Tuple
You can create an array from a regular Python list or tuple using the array() function. The type of the resulting array is deduced from the type of the elements in the sequences.
import numpy as np

# Creating array from list with type float
a = np.array([[1, 2, 4], [5, 8, 7]], dtype='float')
print("Array created using passed list:\n", a)

# Creating array from tuple
b = np.array((1, 3, 2))
print("\nArray created using passed tuple:\n", b)
Output:
Array created using passed list:
 [[1. 2. 4.]
 [5. 8. 7.]]

Array created using passed tuple:
 [1 3 2]
In many scenarios, the elements of an array are initially unknown, but the size is predetermined. NumPy provides several functions to create arrays with placeholder content, which helps avoid the computational expense of dynamically growing arrays.
Examples of such functions include:
- np.zeros: Creates an array filled with zeros.
- np.ones: Creates an array filled with ones.
- np.full: Creates an array filled with a specified value.
- np.empty: Creates an array without initializing its values (use with caution).
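The four placeholder functions listed above can be tried out with a short sketch like this one. The shapes and the fill value 7 are arbitrary choices for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))            # 2x3 array of 0.0 (float by default)
ones = np.ones((2, 3), dtype=int)   # 2x3 array of integer 1s
sevens = np.full((2, 2), 7)         # 2x2 array filled with the value 7

# np.empty only allocates memory; its contents are arbitrary until assigned.
scratch = np.empty((2, 2))
scratch[:] = 0.0                    # assign before reading any values

print(zeros)
# [[0. 0. 0.]
#  [0. 0. 0.]]
print(ones)
# [[1 1 1]
#  [1 1 1]]
print(sevens)
# [[7 7]
#  [7 7]]
```

The "use with caution" note for np.empty refers to the uninitialized contents: reading them before assignment gives whatever happened to be in memory.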
- Creating Sequences
For generating sequences of numbers, NumPy offers functions similar to Python's range, but these return arrays and also accept non-integer values.
- Create Using arange() Function
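The arange() function returns evenly spaced values within a given interval, with the step size specified by the caller.
import numpy as np
print(np.arange(0, 30, 5))  # a sequence of integers; start 0, stop 30, and step 5 are illustrative values
Output:
[ 0  5 10 15 20 25]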
- Create Using linspace() Function
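The linspace() function returns evenly spaced values within a given interval, with the number of values specified instead of the step size.
import numpy as np
print(np.linspace(0, 5, 10))  # a sequence of 10 values in range 0 to 5
Output:
[0.         0.55555556 1.11111111 1.66666667 2.22222222 2.77777778
 3.33333333 3.88888889 4.44444444 5.        ]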
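The difference between arange() and linspace() is easiest to see side by side. The following sketch uses the interval 0 to 1 as an illustrative assumption: arange() is given a step and excludes the stop value, while linspace() is given a count and includes the stop value by default.

```python
import numpy as np

# arange: the caller fixes the step; the stop value is excluded.
stepped = np.arange(0, 1, 0.25)
print(stepped)  # [0.   0.25 0.5  0.75]

# linspace: the caller fixes the number of values; the stop value is included.
counted = np.linspace(0, 1, 5)
print(counted)  # [0.   0.25 0.5  0.75 1.  ]

# endpoint=False makes linspace exclude the stop value as well.
open_ended = np.linspace(0, 1, 4, endpoint=False)
print(open_ended)  # [0.   0.25 0.5  0.75]
```

When a specific number of points is needed, linspace() is usually the safer choice, since the length of an arange() result with a floating-point step can be affected by rounding.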
- Reshaping an Array Using the reshape Method
The reshape method returns an array with a new shape while keeping the total number of elements unchanged.
For example, consider an array with the shape $(a_1, a_2, a_3, \dots, a_N)$. This array can be reshaped into another array with the shape $(b_1, b_2, b_3, \dots, b_M)$, provided the condition $a_1 \times a_2 \times a_3 \times \dots \times a_N = b_1 \times b_2 \times b_3 \times \dots \times b_M$ is satisfied. In other words, the total number of elements in the array must remain the same after reshaping.
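The element-count condition above can be checked with a short sketch. The 12-element array is an illustrative assumption: 12 = 3 × 4 = 2 × 3 × 2, so both target shapes are valid, while (5, 3) is not.

```python
import numpy as np

flat = np.arange(12)          # shape (12,), values 0 through 11

grid = flat.reshape(3, 4)     # 3 * 4 = 12 elements
cube = flat.reshape(2, 3, 2)  # 2 * 3 * 2 = 12 elements
auto = flat.reshape(4, -1)    # -1 asks NumPy to infer the missing length (3)

print(grid.shape)  # (3, 4)
print(cube.shape)  # (2, 3, 2)
print(auto.shape)  # (4, 3)

# A shape with a different element count is rejected.
try:
    flat.reshape(5, 3)        # 5 * 3 = 15, which does not equal 12
except ValueError as exc:
    print("reshape failed:", exc)
```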
NumPy Array Indexing
Understanding the fundamentals of NumPy array indexing is crucial for efficiently analyzing and manipulating array objects. NumPy provides several methods for array indexing:
- Slicing: Similar to Python lists, NumPy arrays support slicing. Since arrays can be multidimensional, a slice can be given for each dimension; dimensions without one are taken in full.
- Integer Array Indexing: This method uses lists of indices to select elements for each dimension. A one-to-one mapping is performed, allowing the creation of a new arbitrary array from the selected elements.
- Boolean Array Indexing: This technique is useful for selecting elements that meet specific conditions. A Boolean mask is applied to the array, returning only the elements that satisfy the given criteria.
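The three indexing methods listed above can be sketched on one small array. The array values and the selected positions are assumptions chosen for illustration.

```python
import numpy as np

arr = np.array([[-1, 2, 0, 4],
                [4, -5, 6, 0],
                [3, -7, 4, 2]])

# Slicing: first two rows, every other column.
sliced = arr[:2, ::2]
print(sliced)
# [[-1  0]
#  [ 4  6]]

# Integer array indexing: picks elements at (0, 3), (1, 2), and (2, 1).
picked = arr[[0, 1, 2], [3, 2, 1]]
print(picked)  # [ 4  6 -7]

# Boolean array indexing: keeps only the elements greater than 0.
positives = arr[arr > 0]
print(positives)  # [2 4 4 6 3 4 2]
```

Note that the boolean result is always one-dimensional, because the selected elements generally do not form a rectangular block.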