# 2016-11-17 Finding extremal values in a nd-array

Sometimes, you need to pick up the $N$-th extremal values in a mutli-dimensional matrix.

Let's suppose it is represented as a nd-array (here, I further suppose you are using the numpy library from the python language). Finding extremal values is easy with argsort but this function operated on 1d vectors... Juggling around indices is sometimes not such an easy task, but luckily, we have the unravel_index function.

Let's unwrap an easy solution combining these functions:

Let's first initialize the notebook:

In [1]:
import numpy as np


Let's first a dummy 3-D array:

In [2]:
x = np.array([[0, 5, 4], [2, 1, 3]])
x = np.arange(2*3*4).reshape((4, 3, 2))
x = np.random.permutation(np.arange(2*3*4)).reshape((4, 3, 2))
print (x)

[[[20  6]
[23 18]
[ 0  5]]

[[10  3]
[13  1]
[11  8]]

[[21  7]
[22 14]
[19 12]]

[[17  9]
[ 2 15]
[16  4]]]


Which we can represent as a 1-d array (a vector):

In [3]:
print (x.ravel())

[20  6 23 18  0  5 10  3 13  1 11  8 21  7 22 14 19 12 17  9  2 15 16  4]


We may now find the list of indices to sort it:

In [4]:
print (np.argsort(x.ravel()))

[ 4  9 20  7 23  5  1 13 11 19  6 10 17  8 15 21 22 18  3 16  0 12 14  2]


And we verify that the entries are indeed sorted:

In [5]:
print (x.ravel()[np.argsort(x.ravel())])

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]


To go back to the nd-array, we use the unraval_index function:

In [6]:
help(np.unravel_index)

Help on built-in function unravel_index in module numpy.core.multiarray:

unravel_index(...)
unravel_index(indices, dims, order='C')

Converts a flat index or array of flat indices into a tuple
of coordinate arrays.

Parameters
----------
indices : array_like
An integer array whose elements are indices into the flattened
version of an array of dimensions dims. Before version 1.6.0,
this function accepted just one index value.
dims : tuple of ints
The shape of the array to use for unraveling indices.
order : {'C', 'F'}, optional
Determines whether the indices should be viewed as indexing in
row-major (C-style) or column-major (Fortran-style) order.

Returns
-------
unraveled_coords : tuple of ndarray
Each array in the tuple has the same shape as the indices
array.

--------
ravel_multi_index

Examples
--------
>>> np.unravel_index([22, 41, 37], (7,6))
(array([3, 6, 6]), array([4, 5, 1]))
>>> np.unravel_index([31, 41, 13], (7,6), order='F')
(array([3, 6, 6]), array([4, 5, 1]))

>>> np.unravel_index(1621, (6,7,8,9))
(3, 1, 4, 1)


In [7]:
print (np.unravel_index(np.argsort(x.ravel()), x.shape))

(array([0, 1, 3, 1, 3, 0, 0, 2, 1, 3, 1, 1, 2, 1, 2, 3, 3, 3, 0, 2, 0, 2, 2,
0]), array([2, 1, 1, 0, 2, 2, 0, 0, 2, 0, 0, 2, 2, 1, 1, 1, 2, 0, 1, 2, 0, 0, 1,
1]), array([0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0,
0]))


Such that we can now sort the whole array from the lowest to highest index:

In [8]:
print (x[np.unravel_index(np.argsort(x.ravel()), x.shape)])

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]


We can now pick just the datapoints extremal values of interest :

In [9]:
datapoints = 10
print (np.unravel_index(np.argsort(x.ravel())[:datapoints], x.shape))

(array([0, 1, 3, 1, 3, 0, 0, 2, 1, 3]), array([2, 1, 1, 0, 2, 2, 0, 0, 2, 0]), array([0, 1, 0, 1, 1, 1, 1, 1, 1, 1]))


Let's now try with a more generic example :

In [10]:
x = np.random.rand(4, 3, 2)
print (x)

[[[ 0.43933303  0.35481207]
[ 0.04459993  0.65894333]
[ 0.68128816  0.6476884 ]]

[[ 0.45980888  0.79537777]
[ 0.85516965  0.95558241]
[ 0.26402052  0.25199538]]

[[ 0.75092699  0.88245315]
[ 0.39587322  0.69577732]
[ 0.97255091  0.26780385]]

[[ 0.45577996  0.49798838]
[ 0.36976289  0.32772751]
[ 0.28368218  0.85828921]]]


The indices for the datapoints extremal values of interest are :

In [11]:
print (np.unravel_index(np.argsort(x.ravel())[:datapoints], x.shape))

(array([0, 1, 1, 2, 3, 3, 0, 3, 2, 0]), array([1, 2, 2, 2, 2, 1, 0, 1, 1, 0]), array([0, 1, 0, 1, 0, 1, 1, 0, 0, 0]))


... which correspond to the minimal values of interest :

In [12]:
print (x[np.unravel_index(np.argsort(x.ravel())[:datapoints], x.shape)])

[ 0.04459993  0.25199538  0.26402052  0.26780385  0.28368218  0.32772751
0.35481207  0.36976289  0.39587322  0.43933303]


Note that it is also easy to pick the maximal values :

In [13]:
print (np.unravel_index(np.argsort(-x.ravel())[:datapoints], x.shape))

(array([2, 1, 2, 3, 1, 1, 2, 2, 0, 0]), array([2, 1, 0, 2, 1, 0, 0, 1, 2, 1]), array([0, 1, 1, 1, 0, 1, 0, 1, 0, 1]))