Numpy¶

Numpy Arrays¶

Numpy arrays are similar to lists, but have a few key differences.

Features of numpy arrays:

  • Can be multi-dimensional
  • Hold a single data type for all elements, typically numeric
  • Ideal for doing math, plotting, and statistics
  • Have a fixed size once created, but values are modifiable
In [1]:
# Import the numpy package!
import numpy as np

A Python list can be converted to a numpy array.

In [2]:
# A basic python list
fib_list = [1, 1, 2, 3, 5, 8]
# Convert list to numpy array
fib_array = np.array(fib_list)
In [25]:
print(f"This is a list: {fib_list}")
print(f"This is an array: {fib_array}")
This is a list: [1, 1, 2, 3, 5, 8]
This is an array: [1 1 2 3 5 8]
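To illustrate the single data type and fixed size mentioned above, here is a minimal sketch. The name mixed_array is only for illustration; the sketch assumes nothing beyond np.array().

```python
import numpy as np

# Mixing integers and a float: numpy stores every value as one common type
mixed_array = np.array([1, 2.5, 3])
print(mixed_array.dtype)  # float64

# Values can be changed in place, but the array keeps its length
fib_array = np.array([1, 1, 2, 3, 5, 8])
fib_array[0] = 100
print(fib_array[0])    # 100
print(len(fib_array))  # 6
```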

Math in Numpy¶

Various math functions are available for numpy arrays.

In [21]:
# Calculating mean of numpy array
mean = np.mean(fib_array)
print(f"Mean: {mean}")
Mean: 3.3333333333333335
In [22]:
# Calculating maximum value of numpy array
max_value = np.max(fib_array)
print(f"Max: {max_value}")
Max: 8
In [23]:
# Calculating minimum value of numpy array
min_value = np.min(fib_array)
print(f"Min: {min_value}")
Min: 1
In [24]:
# Calculating the sum of a numpy array
total = np.sum(fib_array)
print(f"Sum: {total}")
Sum: 20

More Numpy mathematical functions can be found here: https://numpy.org/doc/stable/reference/routines.math.html
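As a small sketch of two more functions from that list, np.median() summarizes a whole array while np.sqrt() is applied to each element. The array squares is made up for this example.

```python
import numpy as np

fib_array = np.array([1, 1, 2, 3, 5, 8])

# The median of an even number of values is the average of the middle two
median = np.median(fib_array)
print(f"Median: {median}")  # Median: 2.5

# np.sqrt() works on every element and returns a new array
squares = np.array([1, 4, 9])
roots = np.sqrt(squares)
print(roots.tolist())  # [1.0, 2.0, 3.0]
```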

You can also apply a math expression to every element of an array at once, which isn't possible with lists. With a list, fib_list + 2 raises a TypeError.

In [ ]:
fib_array + 2
Out[ ]:
array([ 3,  3,  4,  5,  7, 10])
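The difference from lists is easiest to see side by side. This is a short sketch using the same Fibonacci values; the variable names are only for illustration.

```python
import numpy as np

fib_list = [1, 1, 2, 3, 5, 8]
fib_array = np.array(fib_list)

# Multiplying a list repeats it; multiplying an array scales each element
doubled_list = fib_list * 2
doubled_array = fib_array * 2
print(doubled_list)            # [1, 1, 2, 3, 5, 8, 1, 1, 2, 3, 5, 8]
print(doubled_array.tolist())  # [2, 2, 4, 6, 10, 16]

# Two arrays of the same shape are combined element by element
squared = fib_array * fib_array
print(squared.tolist())        # [1, 1, 4, 9, 25, 64]
```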

Multi-dimensional Arrays¶

A numpy array can contain nested arrays, which makes it multi-dimensional. The array below has two dimensions even though it holds only one row.

In [12]:
multid_array = np.array([[1,2,3,4,5]])
multid_array
Out[12]:
array([[1, 2, 3, 4, 5]])

A common type of numpy array is a two-dimensional array.

In [13]:
twod_array = np.array([[0,1,2,3,4], [5,6,7,8,9]])
twod_array
Out[13]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

You can determine the dimensions, or shape, of an array with np.shape(). The output of the function looks something like this: (2, 3). The first value, 2, stands for the number of rows, while the second value, 3, stands for the number of columns, if you were to visualize the array as a table.

In [41]:
# This array has two rows and five columns
np.shape(twod_array)
Out[41]:
(2, 5)
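The same information is also available as attributes on the array itself. A brief sketch, using the array from above:

```python
import numpy as np

twod_array = np.array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])

print(twod_array.shape)  # (2, 5)  rows, then columns
print(twod_array.ndim)   # 2       number of dimensions
print(twod_array.size)   # 10      total number of elements

# The shape can be unpacked into separate variables
n_rows, n_columns = twod_array.shape
```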

Every row needs the same number of columns. An array like np.array([[0,1,2,3], [4,5,6,7,8]]) has mismatched row lengths, so recent versions of numpy raise a ValueError when asked to create it.
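One way to catch a mismatch before building the array is to compare the row lengths yourself. This is only a sketch; the names rows and is_rectangular are made up for this example.

```python
import numpy as np

rows = [[0, 1, 2, 3], [4, 5, 6, 7, 8]]

# Collect the length of each row and check that they all agree
row_lengths = [len(row) for row in rows]
is_rectangular = len(set(row_lengths)) == 1

if is_rectangular:
    safe_array = np.array(rows)
else:
    print(f"Rows have different lengths: {row_lengths}")
```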

Array Slicing¶

Each dimension of an array can be sliced independently. For a two-dimensional array, the form is array[rows, columns], with a separate slice for each.

In [42]:
twod_array[0:2, 1:3]
Out[42]:
array([[1, 2],
       [6, 7]])

The first slice selects rows and the second selects columns. The same column slice is applied to every selected row, so the result is always rectangular.

For example, using : for the rows selects every row, and 1:3 then keeps the columns at index 1 and 2 from each of them.

In [43]:
# All rows, columns at index 1 and 2
twod_array[:, 1:3]
Out[43]:
array([[1, 2],
       [6, 7]])
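Using a single index instead of a slice picks out one element, one row, or one column. A short sketch with the same array:

```python
import numpy as np

twod_array = np.array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])

# One element: row index 1, column index 3
element = twod_array[1, 3]
print(element)              # 8

# One whole row
first_row = twod_array[0]
print(first_row.tolist())   # [0, 1, 2, 3, 4]

# One whole column (every row, column index 2)
column = twod_array[:, 2]
print(column.tolist())      # [2, 7]
```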

You can also slice every other value by adding a step, as with normal lists.

In [44]:
# Slicing every other value
twod_array[:, 0::2]
Out[44]:
array([[0, 2, 4],
       [5, 7, 9]])
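The start and step can be varied in the same way as with lists. This sketch starts at index 1 instead of 0, and then uses a negative step to reverse the columns.

```python
import numpy as np

twod_array = np.array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])

# Every other column, starting from index 1
odd_columns = twod_array[:, 1::2]
print(odd_columns.tolist())  # [[1, 3], [6, 8]]

# A step of -1 walks through the columns backwards
reversed_columns = twod_array[:, ::-1]
print(reversed_columns.tolist())  # [[4, 3, 2, 1, 0], [9, 8, 7, 6, 5]]
```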

Different Ways to Create an Array¶

There are various other ways to create an array and assign certain values to it.

In [20]:
# Makes an array filled with 0s
zeros = np.zeros(5)
print(f"Zeros array: {zeros}")
# Makes an array filled with 1s
ones = np.ones(5)
print(f"Ones array: {ones}")
# Makes an uninitialized array; its values are arbitrary until you assign them
empty = np.empty(5)
Zeros array: [0. 0. 0. 0. 0.]
Ones array: [1. 1. 1. 1. 1.]
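Two related options, shown here as a sketch: np.full() fills an array with any value you choose, and the dtype argument asks for integers instead of the default floats.

```python
import numpy as np

# An array filled with a chosen value
sevens = np.full(5, 7)
print(sevens.tolist())     # [7, 7, 7, 7, 7]

# Zeros stored as integers rather than floats
int_zeros = np.zeros(5, dtype=int)
print(int_zeros.tolist())  # [0, 0, 0, 0, 0]
```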

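You can be more specific with the array shape you want. Using (2,4,6) would make a three-dimensional array made of two blocks, each with four rows and six columns, while (1,5) would make a two-dimensional array with one row and five columns. For a one-dimensional array with five entries, pass 5 or (5,) instead.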

In [ ]:
# A three-dimensional array: two blocks, each with four rows and six columns
np.zeros((2,4,6))
Out[ ]:
array([[[0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.]]])
In [48]:
# A two-dimensional array with one row and five columns
np.ones((1,5))
Out[48]:
array([[1., 1., 1., 1., 1.]])
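The number of entries in the shape tuple is the number of dimensions, which you can confirm with the ndim attribute. A quick sketch:

```python
import numpy as np

# Three entries in the shape means three dimensions
block_array = np.zeros((2, 4, 6))
print(block_array.ndim)  # 3

# Two entries means two dimensions, even with a single row
row_array = np.ones((1, 5))
print(row_array.ndim)    # 2

# A single number gives a one-dimensional array
flat_array = np.ones(5)
print(flat_array.ndim)   # 1
print(flat_array.shape)  # (5,)
```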

Specifying Axis in Arrays¶

When an array is multi-dimensional, you can specify the axis for many math functions. Without an axis, the function uses every element in the array. The axis determines if you do math across rows or across columns.

  • axis = 0: Math is done across the rows, resulting in one value for each column
  • axis = 1: Math is done across the columns, resulting in one value for each row
In [19]:
# Example array
axis_array = np.array([[5,5,5,5,5], [3,3,3,3,3]])
# axis = 0: average across the rows, giving one value per column
per_column_average = np.mean(axis_array, axis = 0)
print(f"Average of each column: {per_column_average}")
# axis = 1: average across the columns, giving one value per row
per_row_average = np.mean(axis_array, axis = 1)
print(f"Average of each row: {per_row_average}")
Average of each column: [4. 4. 4. 4. 4.]
Average of each row: [5. 3.]
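The effect of the axis is easier to follow when every value is different. This sketch uses np.sum() on a small made-up array called grid.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

# axis = 0 adds down each column: 1+4, 2+5, 3+6
column_sums = np.sum(grid, axis=0)
print(column_sums.tolist())  # [5, 7, 9]

# axis = 1 adds along each row: 1+2+3, 4+5+6
row_sums = np.sum(grid, axis=1)
print(row_sums.tolist())     # [6, 15]

# With no axis, every element is added together
total = np.sum(grid)
print(total)                 # 21
```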

Numpy Ranges¶

In Python, the range() function can be used to create a sequence of values. Numpy has a similar function, np.arange(), that returns an array of evenly spaced values.

There are a few ways to create a range with numpy:

  • np.arange(number): Creates a range of whole numbers from 0 up to number (non-inclusive), so it contains number values
  • np.arange(start, stop): Creates a range with values from the start number up to the stop number (non-inclusive)
  • np.arange(start, stop, interval): Creates a range with values from the start number up to the stop number (non-inclusive), spaced by a certain interval

np.arange() is zero-based, so ranges without a specified start number start with 0.

Below are some examples:

In [6]:
# A range with a specified number of values
range1 = np.arange(5)
range1
Out[6]:
array([0, 1, 2, 3, 4])
In [ ]:
# A range with values from a start number to a stop number
range2 = np.arange(3, 9)
range2
Out[ ]:
array([3, 4, 5, 6, 7, 8])
In [12]:
# A range with values from a start number to a stop number, over a certain interval
range3 = np.arange(2, 7, 2)
range3
Out[12]:
array([2, 4, 6])
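Unlike range(), np.arange() also accepts decimal intervals, and like range() it can count downwards with a negative interval. A brief sketch:

```python
import numpy as np

# Counting down: the stop number is still non-inclusive
countdown = np.arange(10, 0, -2)
print(countdown.tolist())  # [10, 8, 6, 4, 2]

# A decimal interval, which range() does not allow
quarters = np.arange(0, 1, 0.25)
print(quarters.tolist())   # [0.0, 0.25, 0.5, 0.75]

# With whole numbers, the values match those from range()
same_values = np.arange(5).tolist() == list(range(5))
print(same_values)         # True
```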

np.arange() can be used in for loops, which are covered in the for loops course notes.
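As a brief preview, here is a minimal sketch of a for loop over np.arange(). The list squares is made up for this example.

```python
import numpy as np

squares = []
# Loop over the values 1, 2, 3 (the stop number 4 is not included)
for value in np.arange(1, 4):
    squares.append(int(value) ** 2)

print(squares)  # [1, 4, 9]
```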

Subsetting Arrays with Conditionals¶

Conditional statements can be applied across an array, which produces an array of the same shape filled with Boolean values (True or False). For example, take a numpy array:

sub_array=np.array([4, 6, 7, 2, 1])

When we apply the conditional statement "is greater than three" to this array with the code:

sub_array > 3

The result is an array of Boolean values:

[True True True False False]

We can therefore selectively subset an array with a conditional statement:

sub_array[sub_array>3]

This produces the following output:

[4 6 7]

In [13]:
# The above example
sub_array = np.array([4,6,7,2,1])
sub_index = sub_array > 3
print(sub_array[sub_index])
[4 6 7]
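Conditions can also be combined, counted, or used to change values. This is a sketch; note that each condition needs its own parentheses when joined with & (and) or | (or).

```python
import numpy as np

sub_array = np.array([4, 6, 7, 2, 1])

# Keep values greater than 1 AND less than 7
in_between = sub_array[(sub_array > 1) & (sub_array < 7)]
print(in_between.tolist())  # [4, 6, 2]

# True counts as 1, so summing a condition counts the matches
n_matches = np.sum(sub_array > 3)
print(n_matches)            # 3

# A condition can also pick which values to overwrite (on a copy here)
capped = sub_array.copy()
capped[capped > 3] = 0
print(capped.tolist())      # [0, 0, 0, 2, 1]
```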

Assorted Other Numpy Functions¶

In [29]:
# A random array for the following examples
ran_array = np.array([2, 4, -1, 5, 10, 7])

np.diff() returns the differences between consecutive pairs of values in an array. For each index i, the value at [i + 1] minus the value at [i] is calculated and added to an array of differences, which has one fewer element than the original array.

In [32]:
# An example of np.diff()
print(f"Our array: {ran_array}")
print(f"Differences array: {np.diff(ran_array)}")
Our array: [ 2  4 -1  5 10  7]
Differences array: [ 2 -5  6  5 -3]
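The same result can be built by hand with two slices, which may make the calculation clearer. A sketch, with manual_diffs as a made-up name:

```python
import numpy as np

ran_array = np.array([2, 4, -1, 5, 10, 7])

# Everything except the first value, minus everything except the last value
manual_diffs = ran_array[1:] - ran_array[:-1]
print(manual_diffs.tolist())  # [2, -5, 6, 5, -3]

# This matches np.diff() exactly
matches = np.array_equal(manual_diffs, np.diff(ran_array))
print(matches)                # True
```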

np.argmin() and np.argmax() get the index of the minimum and maximum value of an array, respectively.

In [31]:
# An example of np.argmin() and np.argmax()
print(f"Our array: {ran_array}")
print(f"Min index: {np.argmin(ran_array)}")
print(f"Max index: {np.argmax(ran_array)}")
Our array: [ 2  4 -1  5 10  7]
Min index: 2
Max index: 4
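The returned index can be used to look up the value itself. For a multi-dimensional array, np.argmax() with no axis counts positions as if the array were flattened; np.unravel_index() converts that position back to a row and column. A short sketch:

```python
import numpy as np

ran_array = np.array([2, 4, -1, 5, 10, 7])

# Use the index to fetch the matching value
largest = ran_array[np.argmax(ran_array)]
smallest = ran_array[np.argmin(ran_array)]
print(largest, smallest)  # 10 -1

# In a two-dimensional array the index counts across the flattened values
twod_array = np.array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])
flat_index = np.argmax(twod_array)
print(flat_index)         # 9

# Convert the flattened index to (row, column)
row, column = np.unravel_index(flat_index, twod_array.shape)
print(row, column)        # 1 4
```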

np.cumsum() returns the cumulative sum of an array. Each value in the result is the sum of all values up to and including that position.

In [33]:
# An example of np.cumsum()
print(f"Our array: {ran_array}")
print(f"Cumulative sum array: {np.cumsum(ran_array)}")
Our array: [ 2  4 -1  5 10  7]
Cumulative sum array: [ 2  6  5 10 20 27]
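Since each entry is a running total, the last entry of the cumulative sum equals the sum of the whole array. A quick sketch to check this:

```python
import numpy as np

ran_array = np.array([2, 4, -1, 5, 10, 7])

running_total = np.cumsum(ran_array)
print(running_total.tolist())  # [2, 6, 5, 10, 20, 27]

# The final running total is the same as np.sum()
final_total = running_total[-1]
print(final_total == np.sum(ran_array))  # True
```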

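np.where() allows you to check a condition in an array, then returns the indices where the condition is true. The result is a tuple with one array of indices per dimension, which is why the output below is wrapped in parentheses.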

In [37]:
print(f"Our array: {ran_array}")
print(f"Where array is greater than 4: {np.where(ran_array > 4)}")
Our array: [ 2  4 -1  5 10  7]
Where array is greater than 4: (array([3, 4, 5]),)
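To work with the indices directly, take the first item of the returned tuple. np.where() also has a three-argument form that chooses between two values for each element. A sketch of both:

```python
import numpy as np

ran_array = np.array([2, 4, -1, 5, 10, 7])

# Take the index array out of the tuple
indices = np.where(ran_array > 4)[0]
print(indices.tolist())             # [3, 4, 5]

# The indices can be used to fetch the matching values
print(ran_array[indices].tolist())  # [5, 10, 7]

# Three arguments: keep the value where the condition is true, else use 0
kept_or_zero = np.where(ran_array > 4, ran_array, 0)
print(kept_or_zero.tolist())        # [0, 0, 0, 5, 10, 7]
```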