2016-11-17 Finding extremal values in a nd-array

Sometimes, you need to pick up the $N$-th extremal values in a mutli-dimensional matrix.

Let's suppose it is represented as a nd-array (here, I further suppose you are using the numpy library from the python language). Finding extremal values is easy with argsort but this function operated on 1d vectors... Juggling around indices is sometimes not such an easy task, but luckily, we have the unravel_index function.

Let's unwrap an easy solution combining these functions:

Let's first initialize the notebook:

In [1]:
import numpy as np

Let's first a dummy 3-D array:

In [2]:
x = np.array([[0, 5, 4], [2, 1, 3]])
x = np.arange(2*3*4).reshape((4, 3, 2))
x = np.random.permutation(np.arange(2*3*4)).reshape((4, 3, 2))
print (x)
[[[20  6]
  [23 18]
  [ 0  5]]

 [[10  3]
  [13  1]
  [11  8]]

 [[21  7]
  [22 14]
  [19 12]]

 [[17  9]
  [ 2 15]
  [16  4]]]

Which we can represent as a 1-d array (a vector):

In [3]:
print (x.ravel())
[20  6 23 18  0  5 10  3 13  1 11  8 21  7 22 14 19 12 17  9  2 15 16  4]

We may now find the list of indices to sort it:

In [4]:
print (np.argsort(x.ravel()))
[ 4  9 20  7 23  5  1 13 11 19  6 10 17  8 15 21 22 18  3 16  0 12 14  2]

And we verify that the entries are indeed sorted:

In [5]:
print (x.ravel()[np.argsort(x.ravel())])
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]

To go back to the nd-array, we use the unraval_index function:

In [6]:
help(np.unravel_index)
Help on built-in function unravel_index in module numpy.core.multiarray:

unravel_index(...)
    unravel_index(indices, dims, order='C')
    
    Converts a flat index or array of flat indices into a tuple
    of coordinate arrays.
    
    Parameters
    ----------
    indices : array_like
        An integer array whose elements are indices into the flattened
        version of an array of dimensions ``dims``. Before version 1.6.0,
        this function accepted just one index value.
    dims : tuple of ints
        The shape of the array to use for unraveling ``indices``.
    order : {'C', 'F'}, optional
        Determines whether the indices should be viewed as indexing in
        row-major (C-style) or column-major (Fortran-style) order.
    
        .. versionadded:: 1.6.0
    
    Returns
    -------
    unraveled_coords : tuple of ndarray
        Each array in the tuple has the same shape as the ``indices``
        array.
    
    See Also
    --------
    ravel_multi_index
    
    Examples
    --------
    >>> np.unravel_index([22, 41, 37], (7,6))
    (array([3, 6, 6]), array([4, 5, 1]))
    >>> np.unravel_index([31, 41, 13], (7,6), order='F')
    (array([3, 6, 6]), array([4, 5, 1]))
    
    >>> np.unravel_index(1621, (6,7,8,9))
    (3, 1, 4, 1)

In [7]:
print (np.unravel_index(np.argsort(x.ravel()), x.shape))
(array([0, 1, 3, 1, 3, 0, 0, 2, 1, 3, 1, 1, 2, 1, 2, 3, 3, 3, 0, 2, 0, 2, 2,
       0]), array([2, 1, 1, 0, 2, 2, 0, 0, 2, 0, 0, 2, 2, 1, 1, 1, 2, 0, 1, 2, 0, 0, 1,
       1]), array([0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0,
       0]))

Such that we can now sort the whole array from the lowest to highest index:

In [8]:
print (x[np.unravel_index(np.argsort(x.ravel()), x.shape)])
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]

We can now pick just the datapoints extremal values of interest :

In [9]:
datapoints = 10
print (np.unravel_index(np.argsort(x.ravel())[:datapoints], x.shape))
(array([0, 1, 3, 1, 3, 0, 0, 2, 1, 3]), array([2, 1, 1, 0, 2, 2, 0, 0, 2, 0]), array([0, 1, 0, 1, 1, 1, 1, 1, 1, 1]))

Let's now try with a more generic example :

In [10]:
x = np.random.rand(4, 3, 2)
print (x)
[[[ 0.43933303  0.35481207]
  [ 0.04459993  0.65894333]
  [ 0.68128816  0.6476884 ]]

 [[ 0.45980888  0.79537777]
  [ 0.85516965  0.95558241]
  [ 0.26402052  0.25199538]]

 [[ 0.75092699  0.88245315]
  [ 0.39587322  0.69577732]
  [ 0.97255091  0.26780385]]

 [[ 0.45577996  0.49798838]
  [ 0.36976289  0.32772751]
  [ 0.28368218  0.85828921]]]

The indices for the datapoints extremal values of interest are :

In [11]:
print (np.unravel_index(np.argsort(x.ravel())[:datapoints], x.shape))
(array([0, 1, 1, 2, 3, 3, 0, 3, 2, 0]), array([1, 2, 2, 2, 2, 1, 0, 1, 1, 0]), array([0, 1, 0, 1, 0, 1, 1, 0, 0, 0]))

... which correspond to the minimal values of interest :

In [12]:
print (x[np.unravel_index(np.argsort(x.ravel())[:datapoints], x.shape)])
[ 0.04459993  0.25199538  0.26402052  0.26780385  0.28368218  0.32772751
  0.35481207  0.36976289  0.39587322  0.43933303]

Note that it is also easy to pick the maximal values :

In [13]:
print (np.unravel_index(np.argsort(-x.ravel())[:datapoints], x.shape))
(array([2, 1, 2, 3, 1, 1, 2, 2, 0, 0]), array([2, 1, 0, 2, 1, 0, 0, 1, 2, 1]), array([0, 1, 1, 1, 0, 1, 0, 1, 0, 1]))

Comments

Comments powered by Disqus