The Matching Pursuit algorithm is popular in signal processing and applies well to digital images.

I have contributed a Python implementation, and we will show here how to use it to extract a sparse set of edges from an image. This approach is described in more detail in the following chapter:



@inbook{Perrinet15bicv,
    title = {Sparse models},
    author = {Perrinet, Laurent U.},
    booktitle = {Biologically-inspired Computer Vision},
    chapter = {13},
    editor = {Keil, Matthias and Crist\'{o}bal, Gabriel and Perrinet, Laurent U.},
    publisher = {Wiley, New-York},
    year = {2015}
}
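Before setting things up, here is the core of the algorithm sketched in plain NumPy, independently of the SparseEdges toolbox used below: a dictionary of unit-norm atoms (random here, purely for illustration), a Matching step that correlates the residual with every atom, and a Pursuit step that subtracts the best match. All names and sizes in this sketch are illustrative assumptions, not part of the toolbox:

```python
import numpy as np

rng = np.random.default_rng(42)

# a dictionary of unit-norm atoms (columns), random here for illustration only
n_dim, n_atoms = 64, 256
D = rng.standard_normal((n_dim, n_atoms))
D /= np.linalg.norm(D, axis=0)

# synthesize a signal as a sparse combination of two atoms
x = 42. * D[:, 3] + 7. * D[:, 100]

# Matching Pursuit: greedily pick the best-matching atom, then subtract it
residual = x.copy()
edges = []
for _ in range(4):
    coeffs = D.T @ residual                    # Matching: correlate with all atoms
    i_star = np.argmax(np.abs(coeffs))         # best match in absolute value
    edges.append((i_star, coeffs[i_star]))
    residual -= coeffs[i_star] * D[:, i_star]  # Pursuit: remove the match

print([i for i, _ in edges])
```

The first two atoms picked should be the two that were planted (the strongest first), and the residual energy should drop sharply after they are removed.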

setting things up

We will take some "baby steps" first to show how it works, and then apply it to a natural image.

First, we define the SparseEdges framework, which sets up the necessary processing steps: from the raw image, to the multi-resolution scheme, to the Matching Pursuit algorithm itself:

In [1]:
from SparseEdges import SparseEdges
mp = SparseEdges('https://raw.githubusercontent.com/bicv/SparseEdges/master/default_param.py')
mp.pe.N = 4
mp.pe.do_mask = False
mp.pe.MP_alpha = 1.
mp.pe.do_whitening = False

This defines the following set of parameters:

In [2]:
print('Default parameters: ', mp.pe)
Default parameters:  {'verbose': 30, 'N_image': None, 'seed': 42, 'N_X': 256, 'N_Y': 256, 'noise': 0.33, 'do_mask': False, 'mask_exponent': 3.0, 'do_whitening': False, 'white_name_database': 'serre07_distractors', 'white_n_learning': 0, 'white_N': 0.07, 'white_N_0': 0.0, 'white_f_0': 0.4, 'white_alpha': 1.4, 'white_steepness': 4.0, 'white_recompute': False, 'base_levels': 1.618, 'n_theta': 24, 'B_sf': 0.4, 'B_theta': 0.17453277777777776, 'N': 4, 'MP_alpha': 1.0, 'MP_rho': None, 'eta_SO': 0.0, 'MP_do_mask': True, 'd_width': 45.0, 'd_min': 0.5, 'd_max': 2.0, 'N_r': 6, 'N_Dtheta': 24, 'N_phi': 12, 'N_scale': 5, 'loglevel_max': 7, 'edge_mask': True, 'do_rank': False, 'scale_invariant': True, 'multiscale': True, 'kappa_phase': 0.0, 'weight_by_distance': True, 'svm_n_jobs': 1, 'svm_test_size': 0.2, 'N_svm_grid': 32, 'N_svm_cv': 50, 'C_range_begin': -5, 'C_range_end': 10.0, 'gamma_range_begin': -14, 'gamma_range_end': 3, 'svm_KL_m': 0.34, 'svm_tol': 0.001, 'svm_max_iter': -1, 'svm_log': False, 'svm_norm': False, 'dip_w': 0.2, 'dip_B_psi': 0.1, 'dip_B_theta': 1.0, 'dip_scale': 1.5, 'dip_epsilon': 0.5, 'figpath': 'results', 'do_edgedir': False, 'edgefigpath': 'results/edges', 'matpath': 'data_cache', 'edgematpath': 'data_cache/edges', 'datapath': 'database', 'figsize': 14.0, 'figsize_hist': 8, 'figsize_cohist': 8, 'formats': ['png', 'pdf', 'svg', 'jpg'], 'dpi': 450, 'scale': 0.8, 'scale_circle': 0.08, 'scale_chevrons': 2.5, 'line_width': 1.0, 'line_width_chevrons': 0.75, 'edge_scale_chevrons': 180.0}

The useful imports for a nice notebook:

In [3]:
import matplotlib
matplotlib.rcParams.update({'text.usetex': False})
%matplotlib inline
%config InlineBackend.figure_format='retina'
import matplotlib.pyplot as plt

import numpy as np
np.set_printoptions(precision=4)#, suppress=True)

# where should we store the data + figures generated by this notebook
import os
mp.pe.figpath, name = '../files/bicv/', 'MPtutorial'
fig_width = 14
In [4]:
print('Range of spatial frequencies: ', mp.sf_0)
Range of spatial frequencies:  [ 0.618   0.382   0.2361  0.1459  0.0902  0.0557  0.0344  0.0213  0.0132
  0.0081  0.005 ]
In [5]:
print('Range of angles (in degrees): ', mp.theta*180./np.pi)
Range of angles (in degrees):  [-82.5 -75.  -67.5 -60.  -52.5 -45.  -37.5 -30.  -22.5 -15.   -7.5   0.
   7.5  15.   22.5  30.   37.5  45.   52.5  60.   67.5  75.   82.5  90. ]
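The printed ranges are consistent with (my reading of the printed values and the `base_levels` parameter, not the actual SparseEdges source) a geometric progression of spatial frequencies with ratio `base_levels` ≈ 1.618 and a uniform tiling of 180° of orientations in steps of 180°/`n_theta`:

```python
import numpy as np

base_levels, n_levels = 1.618, 11   # from mp.pe.base_levels and the printout above
n_theta = 24                        # from mp.pe.n_theta

# spatial frequencies: a geometric series with the golden ratio as base
sf_0 = base_levels ** -(np.arange(n_levels) + 1)
print(np.round(sf_0, 4))

# orientations: n_theta angles tiling 180 degrees in 7.5-degree steps
theta = (np.arange(n_theta) - n_theta/2 + 1) * np.pi / n_theta
print(np.round(theta * 180. / np.pi, 1))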
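```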

testing one step of (Matching + Pursuit)

Here, we will synthesize an image using 2 log-Gabor filters and check that they are correctly retrieved using a few Matching Pursuit steps. An edge is represented by a np.array containing [x position, y position, index of the angle, index of the scale in the multi-resolution scheme], along with a coefficient (a complex number whose modulus gives the strength and whose argument gives the phase of the log-Gabor filter):

In [6]:
# the first edge
edge_in, C_in = [3*mp.pe.N_X/4, mp.pe.N_Y/2, 2, 2], 42
# the second edge
edge_bis, C_bis = [mp.pe.N_X/8, mp.pe.N_Y/4, 8, 4], 4.*np.sqrt(2)*np.exp(1j*np.pi/4.)

Which results in

In [7]:
# filters in Fourier space
FT_lg_in = mp.loggabor(edge_in[0], edge_in[1], sf_0=mp.sf_0[edge_in[3]],
                         B_sf=mp.pe.B_sf, theta= mp.theta[edge_in[2]], B_theta=mp.pe.B_theta)
FT_lg_bis = mp.loggabor(edge_bis[0], edge_bis[1], sf_0=mp.sf_0[edge_bis[3]],
                         B_sf=mp.pe.B_sf, theta= mp.theta[edge_bis[2]], B_theta=mp.pe.B_theta)
# mix both filters and show the sum
FT_lg_ = C_in *  FT_lg_in + C_bis * FT_lg_bis
image = mp.invert(FT_lg_)

_ = mp.show_FT(FT_lg_, axis=True)

We may start by computing the linear coefficients in the multi-resolution scheme:

In [8]:
C = mp.linear_pyramid(image)

These coefficients measure how well the image matches the filters of the multi-resolution scheme and will be used in the Matching step. We then detect the best match, that is, the coefficient with maximum absolute activity:

In [9]:
edge_star = mp.argmax(C)
print('Coordinates of the maximum ', edge_star, ', true value: ', edge_in)
print('Value of the maximum ', C[edge_star], ', true value: ', C_in)
Coordinates of the maximum  (192, 128, 2, 2) , true value:  [192.0, 128.0, 2, 2]
Value of the maximum  (42+1.06674026606e-12j) , true value:  42
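Under the hood, picking the index of the maximal absolute coefficient in a multi-dimensional array can be sketched with `np.unravel_index` (a generic sketch with a toy array, not the actual SparseEdges code):

```python
import numpy as np

# a toy 4-D coefficient array shaped like (N_X, N_Y, n_theta, n_levels)
rng = np.random.default_rng(0)
C = rng.standard_normal((8, 8, 4, 3)) + 1j * rng.standard_normal((8, 8, 4, 3))
C[5, 2, 3, 1] = 42. + 0j   # plant a clear winner

# the Matching step: index of the coefficient with maximum absolute value
edge_star = np.unravel_index(np.argmax(np.abs(C)), C.shape)
print(edge_star)
```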

At this particular orientation and scale, the absolute activity looks like:

In [10]:
fig, a1, a2 = mp.show_spectrum(np.absolute(C[:, :, edge_star[2], edge_star[3]]), axis=True)
_ = a2.plot([edge_star[1]], [edge_star[0]], 'r*')

On our image, this looks like:

In [11]:
fig, a1, a2 = mp.show_spectrum(image, axis=True)
_ = a2.plot([edge_star[1]], [edge_star[0]], 'r*')

Let's now extract and show the "winner" of the Matching Step:

In [12]:
FT_star = mp.loggabor(edge_star[0], edge_star[1], sf_0=mp.sf_0[edge_star[3]],
                         B_sf=mp.pe.B_sf, theta= mp.theta[edge_star[2]], B_theta=mp.pe.B_theta)
im_star = mp.invert(FT_star)
_ = mp.show_FT(FT_star, axis=True)

Great, it looks like we have extracted the first edge. Let's check that the following quantities are all equal to one:

  • the total energy of the extracted component,
  • the maximum of its power spectrum (equivalent to the above, but we double-check),
  • half the mean energy of its spectrum (half? remember the trick used in log-Gabor filters).
In [13]:
print(np.sum(im_star**2), mp.FTfilter(im_star, FT_star).max(), np.mean(np.abs(FT_star)**2)/2)
1.0 1.0 1.0
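The factor of one half comes from the log-Gabor filters being defined on only one half of the Fourier plane; the rest is Parseval's theorem, which in NumPy's unnormalized FFT convention says that the mean power spectrum equals the total signal energy. A generic check, independent of SparseEdges:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal((16, 16))
X = np.fft.fft2(x)

# Parseval in NumPy's convention: mean power in Fourier space
# equals total energy in image space
print(np.mean(np.abs(X)**2), np.sum(x**2))
```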

Let's overlay the extracted edge on the image:

In [14]:
edge_star_in = np.array([edge_star[0], edge_star[1], mp.theta[edge_star[2]], mp.sf_0[edge_star[3]], np.absolute(C[edge_star]), np.angle(C[edge_star])])
mp.pe.figsize_edges = 9
fig, a = mp.show_edges(edge_star_in[:, np.newaxis], image=image)

Now, we may subtract the extracted edge from the image to form the residual: this is the Pursuit step:

In [15]:
image_res = (image - C[edge_star] * im_star).real 
fig, a1, a2 = mp.show_spectrum(image_res, axis=True)
_ = a2.plot([edge_star[1]], [edge_star[0]], 'r*')

It now looks pretty clear that only the second edge remains. We may thus repeat the Matching step on the residual image:

In [16]:
C = mp.linear_pyramid(image_res)
edge_star_bis = mp.argmax(C)
print('Coordinates of the maximum ', edge_star_bis, ', true value: ', edge_bis)
print('Value of the maximum ', C[edge_star_bis], ', true value: ', C_bis)
Coordinates of the maximum  (32, 64, 8, 4) , true value:  [32.0, 64.0, 8, 4]
Value of the maximum  (4+4j) , true value:  (4+4j)

Note: to store the complex value of the coefficient, we use the fact that a complex number converts easily to 2 reals (modulus and angle) and back:

In [17]:
z = np.sqrt(2)*np.exp(1j*np.pi/4.)
z_, z_p = np.absolute(z), np.angle(z)
print(z, z_*np.exp(1j*z_p))
(1+1j) (1+1j)

Note that the linear coefficients corresponding to the first winning filter are canceled:

In [18]:
print('Value of the residual ', C[edge_star], ', initial value: ', C_in)
Value of the residual  (-1.66264965343e-16+1.18403307454e-12j) , initial value:  42
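This cancellation is a direct consequence of the atoms having unit energy: after subtracting c·a from the residual r, the new correlation with a is ⟨r, a⟩ − c⟨a, a⟩ = c − c = 0 (when MP_alpha = 1). A minimal numeric check, with a generic unit-norm vector standing in for the filter:

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal(128)
a /= np.linalg.norm(a)        # a unit-norm atom, as checked above

r = rng.standard_normal(128)  # some residual signal
c = r @ a                     # linear coefficient (the Matching step)
r_new = r - c * a             # the Pursuit step, with MP_alpha = 1

print(r_new @ a)              # correlation with the atom after the update
```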

Let's mark the newly extracted edge on the spectrum of the residual image:

In [19]:
fig, a1, a2 = mp.show_spectrum(image_res, axis=True)
_ = a2.plot([edge_star_bis[1]], [edge_star_bis[0]], 'r*')

Both edges have thus been correctly extracted:

In [20]:
edge_stars = np.vstack((edge_star_in,
                        np.array([edge_star_bis[0], edge_star_bis[1], mp.theta[edge_star_bis[2]], mp.sf_0[edge_star_bis[3]], np.absolute(C[edge_star_bis]), np.angle(C[edge_star_bis])])))
print(edge_stars)
mp.pe.figsize_edges = 9
fig, a = mp.show_edges(edge_stars.T, image=image)
[[  1.9200e+02   1.2800e+02  -1.1781e+00   2.3608e-01   4.2000e+01
    2.5399e-14]
 [  3.2000e+01   6.4000e+01  -3.9270e-01   9.0179e-02   5.6569e+00
    7.8540e-01]]

testing four steps of Matching Pursuit

Let's redo these steps using the run_mp function:

In [21]:
# filters in Fourier space
FT_lg_in = mp.loggabor(edge_in[0], edge_in[1], sf_0=mp.sf_0[edge_in[3]],
                         B_sf=mp.pe.B_sf, theta= mp.theta[edge_in[2]], B_theta=mp.pe.B_theta)
FT_lg_bis = mp.loggabor(edge_bis[0], edge_bis[1], sf_0=mp.sf_0[edge_bis[3]],
                         B_sf=mp.pe.B_sf, theta= mp.theta[edge_bis[2]], B_theta=mp.pe.B_theta)
# mix both filters and show the sum
FT_lg_ = C_in *  FT_lg_in + C_bis * FT_lg_bis
In [22]:
fig = mp.show_FT(FT_lg_)
In [23]:
image = mp.invert(FT_lg_)
edges, C_res = mp.run_mp(image, verbose=True)
Edge  0 / 4  - Max activity  :  42.0  phase=  1.45523131147e-12  deg,  @  (192, 128, 2, 2)
Edge  1 / 4  - Max activity  :  5.65685424949  phase=  45.0  deg,  @  (32, 64, 8, 4)
Edge  2 / 4  - Max activity  :  1.17664859937e-12  phase=  -65.0075813426  deg,  @  (192, 128, 2, 2)
Edge  3 / 4  - Max activity  :  1.28656660001e-14  phase=  -179.725736994  deg,  @  (192, 128, 3, 2)
In [24]:
fig, a = mp.show_edges(edges, image=image)

testing edge detection on a natural image

Let's try it out on a Flickr image from @Doug88888:

In [25]:
!curl https://farm7.staticflickr.com/6058/6370387703_5e718ea681_q_d.jpg -o ../files/bicv/6370387703_5e718ea681_q_d.jpg
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10602  100 10602    0     0  29320      0 --:--:-- --:--:-- --:--:-- 29368
In [26]:
%%writefile ../files/bicv/MPtutorial.py

import numpy as np
from SparseEdges import SparseEdges
mp = SparseEdges('https://raw.githubusercontent.com/bicv/SparseEdges/master/default_param.py')
mp.pe.figpath = '../files/bicv'
mp.pe.N = 2048
# mp.pe.MP_alpha = .8
mp.pe.MP_alpha = 1.

# defining input image (get it @ https://www.flickr.com/photos/doug88888/6370387703)
#image = mp.imread('https://farm7.staticflickr.com/6058/6370387703_5e718ea681_q_d.jpg')
image = mp.imread('6370387703_5e718ea681_q_d.jpg')
#print(image.shape)
mp.init()
image = mp.normalize(image, center=True)

matname = 'MPtutorial.npy'
try:
    edges = np.load(matname)
except:
    edges, C_res = mp.run_mp(image, verbose=True)
    np.save(matname, edges)    

matname_RMSE = 'MPtutorial_RMSE.npy'
try:
    RMSE = np.load(matname_RMSE)
except:
    RMSE = np.ones(mp.pe.N)
    image_ = image.copy()
    image_rec = np.zeros_like(image_)
    if mp.pe.do_whitening: image_ = mp.whitening(image_)
    for i_N in range(mp.pe.N):
        image_rec += mp.reconstruct(edges[:, i_N][:, np.newaxis])
        RMSE[i_N] =  ((image_*mp.mask-image_rec*mp.mask)**2).sum()
    np.save(matname_RMSE, RMSE)        
Overwriting ../files/bicv/MPtutorial.py
In [27]:
%cd -q ../files/bicv/
%run MPtutorial.py
%cd -q ../../posts

Showing the original image with the edges overlaid:

In [28]:
#edges = np.load(matname)

fig_width_pt = 318.670  # Get this from LaTeX using \showthe\columnwidth
inches_per_pt = 1.0/72.27               # Convert pt to inches
fig_width = fig_width_pt*inches_per_pt  # width in inches
mp.pe.figsize_edges = .382 * fig_width  # useful for papers
mp.pe.figsize_edges = 9 # useful in notebooks
mp.pe.line_width = 1.
mp.pe.scale = 1.
fig, a = mp.show_edges(edges, image=image, show_phase=True)

mp.savefig(fig, name + '_rec')