Creating and Indexing with Boolean NDArrays

Boolean Indexing

In [2]:
#
# Import numpy as `np`, and set the display precision to two decimal places
#
import numpy as np
np.set_printoptions(precision=2)
In [3]:
#
# We generate three ndarrays: the sales figures of three products across four seasons, plus the product and season labels
#
np.random.seed(0)
sales = np.random.uniform(-20, 20, size=(4,3))
products = np.array(['apple', 'orange', 'banana'])
seasons = np.array(['spring', 'summer', 'fall', 'winter'])
In [4]:
#
# The products
#
products
Out[4]:
array(['apple', 'orange', 'banana'], dtype='<U6')
In [5]:
#
# The seasons
#
seasons
Out[5]:
array(['spring', 'summer', 'fall', 'winter'], dtype='<U6')
In [6]:
#
# The sales as a (4,3) ndarray
sales
Out[6]:
array([[ 1.95,  8.61,  4.11],
       [ 1.8 , -3.05,  5.84],
       [-2.5 , 15.67, 18.55],
       [-4.66, 11.67,  1.16]])
In [7]:
#
# We can also pack the seasons and products into a single
# (4,3) ndarray of strings.
#
info = np.array(
    ["%s-%s"%(s,p) for s in seasons for p in products]
).reshape(4,-1)
info
Out[7]:
array([['spring-apple', 'spring-orange', 'spring-banana'],
       ['summer-apple', 'summer-orange', 'summer-banana'],
       ['fall-apple', 'fall-orange', 'fall-banana'],
       ['winter-apple', 'winter-orange', 'winter-banana']], dtype='<U13')

Boolean Arrays

Relational operators such as < and >= are applied element-wise, so comparing a numerical ndarray with a value produces a boolean ndarray of the same shape.

In [8]:
#
# sales that are negative
#
sales < 0
Out[8]:
array([[False, False, False],
       [False,  True, False],
       [ True, False, False],
       [ True, False, False]])
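
As a small illustration of what such a comparison returns, the sketch below rebuilds the same sales array (same seed as above) and inspects the resulting boolean ndarray. The variable name negative is only an illustrative choice.

```python
import numpy as np

# Recreate the sales array so this snippet runs on its own.
np.random.seed(0)
sales = np.random.uniform(-20, 20, size=(4, 3))

# An element-wise comparison returns a boolean ndarray of the same shape.
negative = sales < 0
print(negative.dtype)        # bool
print(negative.shape)        # (4, 3)

# True counts as 1 when summed, so sums count the matching entries.
print(negative.sum())        # 3 negative entries in total
print(negative.sum(axis=0))  # [2 1 0] negative entries per product (column)

# any() and all() summarize the whole boolean array.
print(negative.any())        # True: at least one sale is negative
print(negative.all())        # False: not every sale is negative
```

Summing a boolean array is a convenient way to count how many elements satisfy a condition without extracting them first.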

Logical operators can combine multiple boolean ndarrays to express complex conditions.

In [9]:
#
# sales that are at least 0 and below 2.0
#
np.logical_and(sales >= 0, sales < 2.0) 
Out[9]:
array([[ True, False, False],
       [ True, False, False],
       [False, False, False],
       [False, False,  True]])
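
The same kind of condition can also be written with the operators &, | and ~, which NumPy applies element-wise to boolean ndarrays. The sketch below is one possible spelling; the names in_range, out_of_range and extreme, and the threshold 15, are illustrative assumptions.

```python
import numpy as np

# Recreate the sales array so this snippet runs on its own.
np.random.seed(0)
sales = np.random.uniform(-20, 20, size=(4, 3))

# & is the element-wise "and". The parentheses are required because
# & binds more tightly than the comparison operators.
in_range = (sales >= 0) & (sales < 2.0)
print(np.array_equal(in_range, np.logical_and(sales >= 0, sales < 2.0)))  # True

# ~ is the element-wise "not" (same result as np.logical_not).
out_of_range = ~in_range

# | is the element-wise "or" (same result as np.logical_or).
# The threshold 15 is an arbitrary value chosen for illustration.
extreme = (sales < 0) | (sales > 15)
print(extreme.sum())  # 5 entries are either negative or above 15
```

Python's own and, or and not keywords do not work element-wise on ndarrays, so one of the two forms shown here is needed.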

np.where can find the indexes of all the True elements in a boolean ndarray. The result is a tuple with one index array per dimension.

X, Y = np.where(...)

For a 2-D ndarray, X holds the row indexes and Y holds the column indexes of the True elements.

In [10]:
#
# the ndarray indices for negative sales
#
np.where(sales < 0)
Out[10]:
(array([1, 2, 3]), array([1, 0, 0]))
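
To make the shape of this result more concrete, the sketch below unpacks the tuple and shows two related views of the same information. The names rows, cols and pairs are illustrative; np.argwhere is a standard NumPy function that returns the same positions as coordinate pairs.

```python
import numpy as np

# Recreate the sales array so this snippet runs on its own.
np.random.seed(0)
sales = np.random.uniform(-20, 20, size=(4, 3))

# One index array per dimension: row indexes and column indexes.
rows, cols = np.where(sales < 0)
print(rows)  # [1 2 3]
print(cols)  # [1 0 0]

# Indexing with the two arrays together picks out the same elements
# as indexing with the boolean array itself.
print(np.array_equal(sales[rows, cols], sales[sales < 0]))  # True

# np.argwhere returns the same positions as (row, column) pairs,
# one pair per True element.
pairs = np.argwhere(sales < 0)
print(pairs.tolist())  # [[1, 1], [2, 0], [3, 0]]
```

The tuple form from np.where is handy for indexing other arrays along each axis, while the pair form from np.argwhere is easier to read when listing positions.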

Boolean Indexing

A boolean array can be used as an index to extract the elements of an ndarray at the positions where the boolean value is True. The result is a one-dimensional ndarray.

In [11]:
#
# We can retrieve the actual sales figures for the negative sales
#
sales[sales < 0]
Out[11]:
array([-3.05, -2.5 , -4.66])
In [12]:
#
# We can retrieve the info of the negative sales
#
info[sales < 0]
Out[12]:
array(['summer-orange', 'fall-apple', 'winter-apple'], dtype='<U13')
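
Boolean indexing is not limited to reading values. The sketch below, which rebuilds the same data, shows three further uses under stated assumptions: extraction returns a copy, a boolean index can be the target of an assignment, and a 1-D boolean array can select whole rows. The names mask, losses, clipped and summer_sales are illustrative.

```python
import numpy as np

# Recreate the data so this snippet runs on its own.
np.random.seed(0)
sales = np.random.uniform(-20, 20, size=(4, 3))
seasons = np.array(['spring', 'summer', 'fall', 'winter'])

mask = sales < 0

# Extraction returns a new 1-D array (a copy), in row-major order.
losses = sales[mask]
losses[:] = 0.0
print((sales < 0).sum())  # 3: the original array is unchanged

# A boolean index can also be assigned to. Here we work on a copy
# and replace every negative figure with zero.
clipped = sales.copy()
clipped[mask] = 0.0
print(clipped.min() >= 0)  # True

# A 1-D boolean array with one entry per row selects whole rows.
summer_sales = sales[seasons == 'summer']
print(summer_sales.shape)  # (1, 3)
```

Working on a copy keeps the original sales figures available for the cells that follow.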
In [14]:
I,J = np.where(sales < 0)
print(list(zip(I,J)))
[(1, 1), (2, 0), (3, 0)]
In [15]:
seasons[I]
Out[15]:
array(['summer', 'fall', 'winter'], dtype='<U6')
In [16]:
products[J]
Out[16]:
array(['orange', 'apple', 'apple'], dtype='<U6')
In [20]:
#
# Report each loss; the negative sales figure is negated so the amount lost prints as a positive number
#
L = (sales < 0)
I, J = np.where(L)
for (season, product, loss) in zip(seasons[I], products[J], sales[L]):
    print("We lost %.2f selling %s in %s" % (-loss, product, season))
We lost 3.05 selling orange in summer
We lost 2.50 selling apple in fall
We lost 4.66 selling apple in winter