2017-03-29 testing COMPs-Pcum

In this notebook, we study how homeostasis (cooperation) may be an essential ingredient for this algorithm, which otherwise works on a winner-take-all basis (competition). This extension was published as Perrinet, Neural Computation (2010) (see http://invibe.net/LaurentPerrinet/Publications/Perrinet10shl ). In particular, we show how one can build, from the activity of each filter, the non-linear functions which implement homeostasis.

See also the other posts on unsupervised learning.

This is joint work with Victor Boutin.

Conclusion: using the Pcum functions seems to work, but the non-linear functions need to be learned in a smooth manner. This will be done in a future post.

In [1]:
%load_ext autoreload
%autoreload 2

import matplotlib.pyplot as plt
import matplotlib
%matplotlib inline
#%config InlineBackend.figure_format='svg'

import numpy as np
np.set_printoptions(formatter = dict( float = lambda x: "%.3g" % x ), precision=3, suppress=True, threshold=np.inf)

import time

starting with a short learning phase without homeostasis

In [2]:
matname = '2017-05-31_Testing_COMPs'
DEBUG_DOWNSCALE = 1
#matname = '2017-05-31_Testing_COMPs-DEBUG'
#DEBUG_DOWNSCALE = 10

seed = 42
nb_quant = 256
C = 4.
do_sym = False

from shl_scripts.shl_experiments import SHL
shl = SHL(DEBUG_DOWNSCALE=DEBUG_DOWNSCALE, 
           eta=0.05, verbose=2, record_each=50, n_iter=1000, eta_homeo=0., alpha_homeo=1., 
          do_sym=do_sym, nb_quant=nb_quant, C=C)
data = shl.get_data(seed=seed, matname=matname)
loading the data called : /tmp/data_cache/2017-05-31_Testing_COMPs_data
In [3]:
test_size = data.shape[0]//2
data_training = data[:test_size, :]
data_test = data[test_size:, :]
#DEBUG: use a much smaller test set (note that it overlaps with the training set)
test_size = data.shape[0]//20
data_training = data[:(data.shape[0]-test_size), :].copy()
data_test = data[:test_size, :].copy()

We start off with a short learning phase without homeostasis, such that we end up with an unbalanced dictionary:

In [4]:
dico_partial_learning = shl.learn_dico(data=data_training, matname=matname)
loading the dico called : 2017-05-31_Testing_COMPs
In [5]:
dico_partial_learning.P_cum
In [6]:
fig, ax = shl.show_dico(dico_partial_learning, data=data, title=matname)
fig.show()
In [7]:
fig, ax = shl.time_plot(dico_partial_learning, variable='prob_active');
fig.show()

classical Matching Pursuit

In [8]:
#n_samples, n_pixels = data_test.shape
#n_dictionary, n_pixels = dico_partial_learning.dictionary.shape
norm_each_filter = np.sqrt(np.sum(dico_partial_learning.dictionary**2, axis=1))
print('L2 norm of each filter=', norm_each_filter)
dico_partial_learning.dictionary /= norm_each_filter[:,np.newaxis]

sparse_code_mp = shl.code(data_test, dico_partial_learning, matname=matname)
L2 norm of each filter= [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
loading the code called : /tmp/data_cache/2017-05-31_Testing_COMPs_coding.npy
In [9]:
def plot_proba_histogram(coding):
    n_dictionary = coding.shape[1]

    # probability for each atom of being selected, normalized to sum to one
    p = np.count_nonzero(coding, axis=0) / coding.shape[0]
    p /= p.sum()
    # entropy of the selection probability, relative to its maximal (uniform) value
    rel_ent = np.sum(-p * np.log(p)) / np.log(n_dictionary)
    #print('Entropy / Entropy_max=', rel_ent )

    fig = plt.figure(figsize=(16, 4))
    ax = fig.add_subplot(111)
    ax.bar(np.arange(n_dictionary), p*n_dictionary)
    ax.set_title('distribution of the selection probability - entropy= ' + str(rel_ent)  )
    ax.set_ylabel('pdf')
    ax.set_xlim(0)
    ax.axis('tight')
    return fig, ax

fig, ax = plot_proba_histogram(sparse_code_mp)
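As a quick sanity check of this entropy measure (a sketch added here, not part of the original pipeline): a perfectly uniform selection gives a relative entropy of one, while a selection concentrated on a single atom gives a value close to zero. Using hypothetical toy codes:

n_toy_samples, n_toy_atoms = 1000, 10
uniform_code = np.ones((n_toy_samples, n_toy_atoms))       # every atom is active in every sample
degenerate_code = np.zeros((n_toy_samples, n_toy_atoms))
degenerate_code[:, 0] = 1.                                  # one atom dominates...
degenerate_code[0, 1:] = 1.                                 # ...but avoid zero counts (log(0))
for toy_code in (uniform_code, degenerate_code):
    p_toy = np.count_nonzero(toy_code, axis=0) / toy_code.shape[0]
    p_toy /= p_toy.sum()
    print('relative entropy =', np.sum(-p_toy * np.log(p_toy)) / np.log(n_toy_atoms))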
In [10]:
p = np.count_nonzero(sparse_code_mp, axis=0)
print('relative proba of being selected', p/p.mean())
print('most selected filter : {}'.format(np.argmax(p)))
print('less selected filter : {}'.format(np.argmin(p)))
relative proba of being selected [0.9 1.18 1.14 1.02 1.04 0.801 0.892 1.13 0.917 0.791 0.969 1.23 1.02 1.49
 1.34 1.15 0.833 0.773 0.855 0.72 0.794 0.77 1.13 1.09 1.37 0.841 0.734
 1.28 0.891 1.03 1.28 1.09 1.05 0.946 0.945 1.03 1.08 1.65 0.836 1.26 0.91
 1.01 1.04 0.922 1.11 1.16 0.941 0.882 0.946 1.48 1.05 0.792 0.864 0.891
 1.73 0.919 1.42 0.971 1.3 1.42 0.925 1.13 0.921 1.06 1 1.07 0.822 0.88
 0.761 1.05 0.833 1.09 1.01 1.6 0.905 1.02 0.634 0.997 0.939 0.828 0.871
 0.913 1.21 0.939 1.07 0.842 0.639 0.776 1 0.838 0.746 0.843 0.787 1.06
 0.708 0.992 0.788 0.963 1.11 0.717 1.2 0.865 0.996 0.786 0.981 0.689 0.629
 0.756 0.873 0.8 1 0.732 0.948 1.18 0.829 1.14 0.912 0.817 0.756 1.03 0.838
 0.863 1.11 1.02 0.968 0.896 1.09 1.54 0.933 1.3 0.933 0.873 1.03 0.866
 0.672 1.03 0.781 0.86 0.885 0.973 0.868 1.47 0.758 0.928 0.904 0.793 0.883
 1.08 0.976 0.676 0.872 0.824 1.05 0.722 0.762 0.918 0.948 0.819 1.4 0.86
 1.13 0.915 1.11 0.706 0.975 0.919 0.824 1.09 1.51 0.958 0.899 1.12 0.859
 1.18 1.17 1.26 0.98 0.986 0.881 1.03 0.973 0.995 1.03 0.864 1.54 1.02 1.42
 1.29 0.785 0.791 1.44 1.45 0.825 0.829 1.28 0.748 1.21 0.819 0.823 1.02
 1.14 1.1 0.998 1.16 0.954 1.28 1.09 0.928 1.03 1.25 0.846 1.23 0.863 0.984
 1.13 0.976 0.99 0.924 0.907 0.731 1.08 1.28 0.968 1.29 1.33 0.828 0.781
 0.95 0.846 0.998 0.737 1.38 1.2 1.12 1.05 0.903 1.09 1.22 0.723 0.742 0.94
 0.931 0.923 0.943 0.92 1.64 0.738 0.839 0.897 0.79 1.31 1.33 1.4 0.98
 0.882 0.642 0.8 1.38 0.712 0.801 0.695 1.62 0.945 1.14 0.77 1.2 1.16 0.839
 0.958 0.82 1.08 2.4 1.04 0.774 0.876 0.966 0.807 0.818 0.894 0.943 1.31
 0.987 0.853 0.689 1.1 1.02 1.02 0.994 0.856 0.859 0.984 1.03 0.979 0.972
 1.47 1.01 1.03 1.19 0.931 0.944 1.04 0.978 1.16 1.16 0.854 0.761 0.842
 1.26 0.95 1.18 0.837 0.797 1.48 0.857 1.15 1.12 1.12 0.98 0.833 1.23 0.915
 1.09 0.813 0.874]
most selected filter : 271
less selected filter : 106

COMP : learning modulations

defining a prior function for coefficients to avoid overflows:

In [11]:
def prior(code, C=5., do_sym=True):
    if do_sym:
        return 1.-np.exp(-np.abs(code)/C)
    else:
        return (1.-np.exp(-code/C))*(code>0)
    
fig=plt.figure(figsize=(5, 5))
ax = plt.subplot(111)
x=np.linspace(0, 30, 100)                   
ax.plot(x, np.ones_like(x), '--')
ax.plot(x, prior(x, C=C, do_sym=do_sym))
fig.show()
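To make the saturation explicit, here is a quick numerical check (not in the original notebook) of how prior rescales coefficients into [0, 1): with do_sym=False, negative coefficients are mapped to zero, a coefficient equal to C is mapped to about 0.63, and much larger coefficients saturate towards one:

# values are approximate: [-0., 0., 0.63, 0.99, 1.]
print(prior(np.array([-1., 0., C, 5*C, 20*C]), C=C, do_sym=do_sym))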
In [12]:
def get_P_cum(code, nb_quant=100, code_bins=None):
    n_samples, nb_filter = code.shape
    if code_bins is None:
        # bin edges used to quantize the rescaled coefficients
        code_bins = np.linspace(0., 1, nb_quant, endpoint=True)
    P_cum = np.zeros((nb_filter, nb_quant))
    for i in range(nb_filter):
        # histogram of the rescaled coefficients of filter i...
        p, bins = np.histogram(prior(code[:, i], C=C, do_sym=do_sym), bins=code_bins, density=True)
        p /= p.sum()
        # ... turned into a cumulative distribution function
        P_cum[i, :] = np.hstack((0, np.cumsum(p)))
    return P_cum

P_cum = get_P_cum(sparse_code_mp, nb_quant)
def show_Pcum_function(code_bins, Pcum, title=None, fname=None, **kwargs):
    ''' display the cumulative probability '''
    subplotpars = matplotlib.figure.SubplotParams(left=0., right=1., bottom=0., top=1., wspace=0.05, hspace=0.05,)
    fig = plt.figure(figsize=(10, 40), subplotpars=subplotpars)
    n = int(np.sqrt(Pcum.shape[0]))
    for i in range(Pcum.shape[0]):
        ax = fig.add_subplot(n, n, i+1)
        ax.plot(code_bins[:-1], Pcum[i, :])
        #ax.set_ylim([0.9, 1])
        ax.axis('tight')
        ax.set_title(str(i))
    if title is not None:
        fig.suptitle(title, fontsize=12, backgroundcolor='white', color='k')
    if fname is not None:
        fig.savefig(fname, dpi=200)
    return fig, ax

code_bins = np.linspace(0., 1, nb_quant+1, endpoint=True)  # bin edges matching the nb_quant columns of P_cum
fig, ax = show_Pcum_function(code_bins, P_cum)
fig.show()
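Before going further, a quick sanity check (a sketch added here): each row of P_cum should be a cumulative distribution function over the nb_quant quantization bins, i.e. non-decreasing from 0 to 1:

print('shape of P_cum =', P_cum.shape)                      # expected (n_dictionary, nb_quant)
print('starts at 0 :', np.allclose(P_cum[:, 0], 0.))
print('ends at 1 :', np.allclose(P_cum[:, -1], 1.))
print('non-decreasing :', np.all(np.diff(P_cum, axis=1) >= 0))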
In [13]:
from shl_scripts.shl_tools import plot_P_cum
fig, ax = plot_P_cum(P_cum, verbose=False, alpha=.15);
ax.set_ylim(0.85, 1.01)
#ax.set_xmargin(0.);
Out[13]:
(0.85, 1.01)

COMP : using modulations

Computing a z-score using np.interp:

In [14]:
code_bins = np.linspace(0., 1, nb_quant+1, endpoint=True)
def z_score(code_bins, Pcum, c):
    z_res = np.zeros_like(c)
    for i in range(Pcum.shape[0]):
        z_res[i] = np.interp(c[i], code_bins[:-1], Pcum[i, :])
    return z_res


corr = (data_test @ dico_partial_learning.dictionary.T)[0, :]
print('correlation=', corr)
print('transformed correlation=', prior(corr, C=C, do_sym=do_sym))
print('Z-scores=', z_score(code_bins, P_cum, prior(corr, C=C, do_sym=do_sym)))
correlation= [0.473 0.262 -0.179 0.0131 -0.411 -0.156 -0.72 1.04 -0.193 -0.0292 -0.133
 0.508 -0.0219 -0.218 1.32 0.203 0.0923 0.155 0.322 -0.101 -0.128 0.256
 0.478 -0.116 0.0748 0.255 0.739 0.733 0.195 0.0246 0.316 0.517 -0.107
 -0.0646 0.35 0.549 0.243 -0.116 -0.207 0.0741 0.775 0.332 -0.101 0.22
 -0.447 1.31 -1.72 -0.0894 -0.092 0.00393 -0.255 0.504 -0.315 0.3 -0.444
 -0.327 0.0701 0.187 -0.898 -0.443 0.355 -0.698 -0.48 0.386 -0.259 0.138
 0.334 -0.146 0.119 0.951 0.429 -0.379 -0.441 0.253 0.342 -0.169 -0.467
 -0.203 -1.11 0.567 -0.055 -0.601 0.428 0.153 -0.428 0.712 0.528 0.136
 -0.332 -0.603 0.0228 0.0413 -0.956 -0.477 -0.212 0.382 0.486 0.16 -0.271
 -0.278 0.784 0.321 -0.658 0.214 -0.0523 0.311 0.06 -0.148 0.151 0.32 0.491
 0.16 0.831 -0.992 -0.187 0.0966 -0.125 -0.685 -0.625 0.0762 0.00117 0.556
 1.11 -0.737 0.425 0.42 0.118 -0.987 -0.119 1.26 -0.613 0.944 0.269 0.2
 0.107 0.054 0.721 -0.682 0.489 0.488 -0.195 -0.717 0.726 0.152 -0.153
 -0.0952 0.427 -1.06 0.177 -0.347 0.112 0.473 -0.00344 -0.426 0.675 -0.318
 0.218 0.0969 -0.377 -0.891 -0.16 -0.0186 0.0333 -0.299 -0.893 -0.181 1.99
 0.626 -0.292 0.136 0.36 0.471 0.147 0.24 0.533 0.493 0.244 0.189 -0.0662
 -0.605 0.591 -0.123 -0.299 0.295 0.0535 0.179 0.22 -0.288 -0.523 1.05 0.47
 0.238 -0.221 -0.0599 -0.83 0.527 0.301 -0.111 0.0515 0.694 -0.317 -0.014
 0.0442 0.0848 -0.481 0.232 0.697 0.0962 0.25 -0.112 0.178 0.295 -0.383
 0.0509 -0.276 -0.191 0.391 -0.0278 0.645 -0.176 -0.123 0.124 0.0391 0.112
 -0.263 -0.45 0.298 -0.509 -0.452 0.567 -0.489 -1.02 -0.24 -0.235 0.148
 0.426 -0.273 -0.35 -0.143 -0.57 -0.426 -0.157 -0.332 0.717 -0.628 0.0103
 -0.825 0.24 -0.997 0.339 -0.569 -0.649 0.244 -0.365 0.443 -0.882 0.00136
 0.498 -0.482 -0.335 0.756 -0.448 0.469 1.13 -0.164 -0.108 -0.0952 0.743
 -0.149 -0.706 -0.397 -0.0483 -0.678 0.274 -0.658 -0.345 0.355 0.0348
 -0.617 -0.109 1.05 0.574 -0.55 0.415 -0.201 -0.907 -1.17 -1.12 -1.46
 -0.785 0.0877 -1.08 0.223 0.0791 0.0797 -0.28 0.264 0.962 -0.24 -0.484
 -0.0307 0.805 0.467 -0.218 0.0526 0.102 -0.449 -0.252 -0.115 0.575 0.722
 0.147 -0.0878 0.372 -0.315 -0.321 0.0828 0.322 -0.414 -0.547 -0.134 1.04
 0.208 -0.852]
transformed correlation= [0.112 0.0635 -0 0.00326 -0 -0 -0 0.23 -0 -0 -0 0.119 -0 -0 0.282 0.0495
 0.0228 0.0379 0.0772 -0 -0 0.0619 0.113 -0 0.0185 0.0618 0.169 0.167
 0.0477 0.00613 0.0761 0.121 -0 -0 0.0837 0.128 0.0589 -0 -0 0.0183 0.176
 0.0796 -0 0.0534 -0 0.279 -0 -0 -0 0.000982 -0 0.118 -0 0.0722 -0 -0
 0.0174 0.0458 -0 -0 0.0848 -0 -0 0.0921 -0 0.0339 0.0801 -0 0.0292 0.212
 0.102 -0 -0 0.0613 0.082 -0 -0 -0 -0 0.132 -0 -0 0.102 0.0375 -0 0.163
 0.124 0.0334 -0 -0 0.00568 0.0103 -0 -0 -0 0.0911 0.114 0.0393 -0 -0 0.178
 0.0771 -0 0.052 -0 0.0748 0.0149 -0 0.0369 0.0768 0.116 0.0391 0.188 -0 -0
 0.0239 -0 -0 -0 0.0189 0.000292 0.13 0.241 -0 0.101 0.0997 0.029 -0 -0
 0.27 -0 0.21 0.065 0.0487 0.0263 0.0134 0.165 -0 0.115 0.115 -0 -0 0.166
 0.0373 -0 -0 0.101 -0 0.0434 -0 0.0277 0.112 -0 -0 0.155 -0 0.053 0.0239
 -0 -0 -0 -0 0.00829 -0 -0 -0 0.391 0.145 -0 0.0335 0.086 0.111 0.0361
 0.0583 0.125 0.116 0.0592 0.0461 -0 -0 0.137 -0 -0 0.0712 0.0133 0.0438
 0.0534 -0 -0 0.231 0.111 0.0577 -0 -0 -0 0.123 0.0724 -0 0.0128 0.159 -0
 -0 0.011 0.021 -0 0.0564 0.16 0.0238 0.0605 -0 0.0436 0.0711 -0 0.0127 -0
 -0 0.0931 -0 0.149 -0 -0 0.0304 0.00973 0.0275 -0 -0 0.0718 -0 -0 0.132 -0
 -0 -0 -0 0.0364 0.101 -0 -0 -0 -0 -0 -0 -0 0.164 -0 0.00257 -0 0.0583 -0
 0.0812 -0 -0 0.0592 -0 0.105 -0 0.000341 0.117 -0 -0 0.172 -0 0.111 0.245
 -0 -0 -0 0.17 -0 -0 -0 -0 -0 0.0662 -0 -0 0.0849 0.00866 -0 -0 0.23 0.134
 -0 0.0986 -0 -0 -0 -0 -0 -0 0.0217 -0 0.0542 0.0196 0.0197 -0 0.0639 0.214
 -0 -0 -0 0.182 0.11 -0 0.0131 0.0252 -0 -0 -0 0.134 0.165 0.0361 -0 0.0888
 -0 -0 0.0205 0.0774 -0 -0 -0 0.229 0.0506 -0]
Z-scores= [0.966 0.952 0 0.796 0 0 0 0.969 0 0 0 0.963 0 0 0.974 0.959 0.963 0.965
 0.965 0 0 0.968 0.969 0 0.956 0.964 0.973 0.972 0.961 0.953 0.955 0.965 0
 0 0.966 0.968 0.953 0 0 0.95 0.973 0.96 0 0.969 0 0.972 0 0 0 0.234 0
 0.968 0 0.966 0 0 0.947 0.958 0 0 0.961 0 0 0.96 0 0.961 0.967 0 0.966
 0.972 0.976 0 0 0.956 0.963 0 0 0 0 0.974 0 0 0.963 0.967 0 0.969 0.975
 0.966 0 0 0.966 0.961 0 0 0 0.96 0.969 0.959 0 0 0.963 0.969 0 0.966 0
 0.971 0.972 0 0.961 0.967 0.963 0.97 0.969 0 0 0.957 0 0 0 0.958 0.0719
 0.969 0.972 0 0.965 0.964 0.951 0 0 0.967 0 0.969 0.958 0.971 0.969 0.957
 0.973 0 0.965 0.962 0 0 0.971 0.958 0 0 0.964 0 0.959 0 0.961 0.969 0 0
 0.97 0 0.958 0.963 0 0 0 0 0.949 0 0 0 0.983 0.968 0 0.957 0.964 0.957
 0.971 0.968 0.956 0.968 0.957 0.959 0 0 0.969 0 0 0.965 0.933 0.964 0.953
 0 0 0.975 0.958 0.962 0 0 0 0.97 0.961 0 0.962 0.971 0 0 0.954 0.953 0
 0.963 0.965 0.958 0.963 0 0.964 0.949 0 0.955 0 0 0.962 0 0.965 0 0 0.959
 0.955 0.957 0 0 0.968 0 0 0.968 0 0 0 0 0.956 0.962 0 0 0 0 0 0 0 0.965 0
 0.609 0 0.967 0 0.966 0 0 0.95 0 0.966 0 0.0841 0.965 0 0 0.975 0 0.962
 0.966 0 0 0 0.97 0 0 0 0 0 0.967 0 0 0.968 0.962 0 0 0.969 0.963 0 0.972 0
 0 0 0 0 0 0.955 0 0.957 0.959 0.942 0 0.956 0.968 0 0 0 0.968 0.955 0
 0.961 0.966 0 0 0 0.957 0.968 0.964 0 0.963 0 0 0.95 0.962 0 0 0 0.967
 0.967 0]
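To make the role of np.interp more explicit, here is a minimal toy example (hypothetical values: two filters and five quantization levels). The z-score of a coefficient is its rank within the learned cumulative histogram of that filter, so a filter whose histogram saturates early (i.e. which rarely produces large coefficients) gets a higher z-score for the same rescaled coefficient:

toy_bins = np.linspace(0., 1., 5 + 1, endpoint=True)
toy_P_cum = np.vstack((np.array([0., .7, .9, .95, 1.]),   # filter 0: histogram saturating early
                       np.linspace(0., 1., 5)))           # filter 1: flat histogram
toy_c = np.array([0.2, 0.2])                              # same rescaled coefficient for both filters
print(z_score(toy_bins, toy_P_cum, toy_c))                # approximately [0.7 0.25]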

We can then redefine Matching Pursuit to include this non-linear function:

In [15]:
l0_sparseness = shl.l0_sparseness

def comp(data, dico, code_bins, P_cum, l0_sparseness=l0_sparseness, verbose=0):
    if verbose!=0:
        t0 = time.time()
    n_samples, n_dictionary = data.shape[0], dico.shape[0]
    sparse_code = np.zeros((n_samples, n_dictionary))
    corr = (data @ dico.T)      # correlations of each sample with each atom
    Xcorr = (dico @ dico.T)     # Gram matrix of the dictionary
    
    for i_sample in range(n_samples):
        c = corr[i_sample, :].copy()
        ind_list=list()
        for i_l0 in range(int(l0_sparseness)):
            # modulate the linear coefficients by the learned non-linear (cumulative) functions
            zi = z_score(code_bins, P_cum, prior(c, C=C, do_sym=do_sym))
            # ind  = np.argmax(zi * (np.abs(sparse_code[i_sample, :])==0))
            ind  = np.argmax(zi)
            ind_list.append(ind)
            #if ind == 0: 
            #    print(i_l0, c, zi)
            #    break

            # update the sparse code and subtract the atom's contribution from the residual correlations
            c_ind = c[ind] / Xcorr[ind, ind]
            sparse_code[i_sample, ind] += c_ind
            c -= c_ind * Xcorr[ind, :]

        if verbose!=0:
            if i_sample in range(2):
                zi = z_score(code_bins, P_cum, prior(c, C=C, do_sym=do_sym))
                print(ind_list, [zi[i] for i in ind_list], np.median(zi), zi.max(), [c[i] for i in ind_list], c.min(), c.max())
    if verbose!=0:
        duration = time.time()-t0
        print('coding duration : {0}s'.format(duration))

    return sparse_code

sparse_code = comp(data_test, dico_partial_learning.dictionary, code_bins, P_cum, verbose=1)
[166, 86, 26, 79, 106, 70, 260, 283, 133, 86, 189, 161, 26, 76, 161] [0.96214022005395128, 0.97053222656250004, 0.96965568103715272, 0.97092450839417577, 0.0, 0.97245891222756742, 0.97003388851022587, 0.96826171875, 0.96015263243505733, 0.97053222656250004, 0.9641061132931994, 0.0, 0.96965568103715272, 0.0, 0.0] 0.934611659204 0.972458912228 [0.038839715994052643, 0.046539745153281853, 0.34676261041584366, 0.11126160254558828, -0.029471835018308887, 0.12976334587371846, 0.21379280636240519, 0.026048953148890344, 0.040047631856148166, 0.046539745153281853, 0.08015144562627903, 0.0, 0.34676261041584366, -0.096117319991637168, 0.0] -1.05736358862 1.08992144016
[149, 310, 146, 246, 236, 41, 85, 284, 216, 240, 39, 223, 259, 73, 93] [0.0, 0.0, 0.96618253518974007, 0.97561399411137406, 0.96718484910097291, 0.0, 0.75856573340767519, 0.0, 0.9550050772542249, 0.95683519425061481, 0.96494534556429701, 0.0, 0.97898041726429486, 0.94927300576331153, 0.0] 0.0 0.986237820695 [-0.4252931415140212, -0.8044819063347336, 0.59471297880959262, 1.1468743112180486, 0.40399339168218573, -0.076702141193551468, 0.012350001729487536, -0.019502276060465347, 0.06980797031864161, 0.043586537561685609, 0.62801326874320318, -0.31958099297545639, 1.5369779781735433, 0.074258661017260377, 0.0] -3.83348270923 3.17606247677
coding duration : 1432.9442389011383s
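Note that, since the masked version of the argmax is commented out in comp above, the same atom may be selected several times for a given sample (see the repeated indices in ind_list above); its coefficient is then simply accumulated, so the number of distinct active atoms per sample is at most l0_sparseness. A quick check (added here as a sketch):

# distinct active atoms per sample; repeats of the same atom end up in a single coefficient
print('max number of active atoms per sample =', np.count_nonzero(sparse_code, axis=1).max(),
      '<= l0_sparseness =', shl.l0_sparseness)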
In [16]:
sparse_code.shape
Out[16]:
(40960, 324)
In [17]:
np.mean(np.count_nonzero(sparse_code, axis=0)/sparse_code.shape[1])
Out[17]:
5.498799725651577
fig, ax = plot_proba_histogram(sparse_code)
In [18]:
np.count_nonzero(sparse_code, axis=0)
Out[18]:
array([  813,   751,   673,   874,   934,  1119,  2444,   888,  1140,
        2219,   849,   953,   647,  1820,   982,   896,  1934,  3011,
        1207,  4504,  2926,  4745,  1745,   555,   843,  1202,  6505,
        1882,   962,  1860,   505,  1147,  1566,   728,  1419,   832,
         568,   453,  1736,  1046,  2843,   900,  1528,  5315,  7206,
         875,  1097,   961,   561,   654,   582,  1541,   928,  1876,
         673,   763,   625,   635,   858,   718,   657,  1292,  3799,
         705,  1324,   689,  1888,  1164,  3464,   991, 13060,   697,
         747,  1091,  1164,   660, 15713,   831,  1683,  8249,   825,
         699,  1231,  2849,   640,  1228, 11124,  2046,  1088,  1490,
        1927,  1467,  1788,  1267,  5126,   781,  2097,   874,   612,
        5366,   654,  2445,   927,  3592,   778,  5174, 14047,  4089,
         728,  1532,   790,  7755,   974,   534,  1838,  1212,  1115,
        2571,  2775,   822,  1367,  1625,  1151,  1259,  1119,   790,
         597,   702,  1014,   498,   977,   930,   920,  6870,  7010,
         807,  2681,  1459,   897,   754,  1024,   982,  1547,   705,
         650,  1108,   827,   906,   665,  7384,   866,  1526,  1033,
        7764,  1187,  1424,   871,  1869,   891,  1383,   895,  3562,
         556,  4084,   837,  2083,  1473,   807,   810,   595,   816,
         829,  7351,  2507,   771,  1231,   610,   758,  1965,  1164,
        1151,   977,  1086,  1104,  1466,  1288,  1081,   829,  7947,
        4224,   693,  1141,  3761,  1586,  1140,  1919,   805,  3426,
        1120,  1935,   714,  1252,   942,   965,   935,   805,  1020,
         813,   987,   757,   650,   580,  1314,   679,   850,   666,
        1216,  3236,   509,  5484,   882,   544,   769,  1277,   610,
        2063,  2460,   781,   687,  1421,  3779,   575,   997,  1006,
         788,   791,  1643,   627,  2636,  5804,   980,   976,  1679,
         803,  1356,   624,  2146,  2103,   496,  1697,   684,   972,
         710,  1099,  1154,  9155,  4072,   634,  7165,  3237,  5670,
         984,   627,   762,  6847,  1717,   732,  1212,  1179,  1399,
        3226,  1390,   751,  1914,  1373,   741,  1787,  1263,  1458,
         994,   940,   855,  1017,  7097,  1367,   537,   689,   867,
        1471,   979,   608,   787,   698,   973,   593,  1196,   797,
         968,   493,   725,  1853,   928,   626,   824,   716,  2269,
        1611,   686,  1503,   639,   963,  2806,   561,   943,  1006,
        1932,   762,   842,  1043,  1334,   938,   871,  2588,   870])
sparse_code
In [19]:
def plot_scatter_MpVsComp(sparse_vector, my_sparse_code):
    fig = plt.figure(figsize=(16, 16))
    ax = fig.add_subplot(111)
    a_min = np.min((sparse_vector.min(), my_sparse_code.min()))
    a_max = np.max((sparse_vector.max(), my_sparse_code.max()))
    ax.plot(np.array([a_min, a_max]), np.array([a_min, a_max]), 'k--', lw=2)
    
    ax.scatter(sparse_vector.ravel(), my_sparse_code.ravel(), alpha=0.01)
    ax.set_title('MP')
    ax.set_ylabel('COMP')
    #ax.set_xlim(0)
    #ax.set_ylim(0)
    ax.axis('equal')
    return fig, ax

#fig, ax = plot_scatter_MpVsComp(sparse_code_mp, sparse_code)
#fig.show()

testing that COMP with flat Pcum is equivalent to MP

In [20]:
n_samples, nb_filter = sparse_code.shape

P_cum = np.linspace(0, 1, nb_quant, endpoint=True)[np.newaxis, :] * np.ones((nb_filter, 1))

sparse_code = comp(data_test, dico_partial_learning.dictionary, code_bins, P_cum, verbose=1)

fig, ax = plot_proba_histogram(sparse_code_mp)
fig.show()
fig, ax = plot_proba_histogram(sparse_code)
[166, 122, 129, 14, 131, 199, 263, 297, 313, 69, 310, 95, 40, 167, 7] [0.0, 0.0, 0.0, 0.030777204170586811, 0.059818775975851148, 0.048776509334593626, 0.018682428542351817, 0.0, 0.012497854219154137, 0.0090753790620992196, 0.010950827980629435, 0.011103062895501227, 0.0031561004861847173, 0.0048792511898213039, 0.0] 0.00506335481067 0.154708311504 [-0.014144489199123081, -0.29764948677990238, -0.22589313762029484, 0.12454694665524793, 0.24573651026061044, 0.19922381923942906, 0.075139139142544137, -0.041528525401208057, 0.050108691526164269, 0.03632414578129424, 0.043871921218744878, 0.044485218588332848, 0.012594896004748071, 0.019488163017774635, 0.0] -0.978132955591 0.669435349149
[85, 149, 310, 211, 146, 162, 200, 236, 140, 97, 41, 216, 312, 241, 93] [0.0, 0.0, 0.0, 0.0, 0.0, 0.059102726754637201, 0.063444021513770973, 0.12797471467545196, 0.0, 0.0, 0.0, 0.1273510143708485, 0.15853971420261212, 0.0, 0.0] 0.0 0.497968310549 [-0.17361235865695357, -0.92347973998179156, -0.71709176388623186, -0.24471160299630995, -0.76985156720283876, 0.24270388363417839, 0.26112561279138846, 0.54545503284755137, -0.1156117148973459, -0.15247208862826278, -0.025169257602821304, 0.54260792695056437, 0.68752301717583575, -0.02132989665549204, 0.0] -3.37655924572 2.74089955448
coding duration : 1404.2503921985626s
In [ ]:
fig, ax = plot_scatter_MpVsComp(sparse_code_mp, sparse_code)
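Besides the scatter plot, a more quantitative check (a sketch, assuming both codes are still in memory) is to measure how much the supports selected by plain MP and by COMP with a flat P_cum overlap:

# fraction of atoms active in both codes, per sample
both_active = (sparse_code_mp != 0) & (sparse_code != 0)
print('average overlap of active sets =', np.count_nonzero(both_active, axis=1).mean(),
      'out of l0_sparseness =', shl.l0_sparseness)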

gradient descent

In [ ]:
P_cum = np.linspace(0, 1, nb_quant, endpoint=True)[np.newaxis, :] * np.ones((nb_filter, 1))
print('Shape of modulation function', P_cum.shape)

eta_homeo = .05

for i in range(1000):
    # re-estimate the cumulative histograms on the current sparse code...
    P_cum_ = get_P_cum(sparse_code, nb_quant=nb_quant)
    # ...and blend them smoothly into the modulation functions (relaxation with rate eta_homeo)
    P_cum = (1-eta_homeo) * P_cum + eta_homeo * P_cum_
    # re-code the test data with the updated modulation functions
    sparse_code = comp(data_test, dico_partial_learning.dictionary, code_bins, P_cum, verbose=0)
    if i % 100 == 0:
        print('Learning step', i)
        fig, ax = plot_proba_histogram(sparse_code)
Shape of modulation function (324, 256)
Learning step 0
In [ ]:
from shl_scripts.shl_tools import plot_P_cum
fig, ax = plot_P_cum(P_cum, verbose=False);
ax.set_ylim(0.85, 1.01);

One observes that the distribution of selection probabilities gets progressively more uniform, which was our goal. However, this implementation makes the Matching Pursuit algorithm very slow, so we will need to speed things up. This will be done in the next post.
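One possible direction for this speed-up (an assumption on my side, not necessarily the solution of the next post): since all filters share the same regular grid of code_bins, the per-filter Python loop in z_score could be replaced by a single vectorized lookup into P_cum, for instance by quantizing the rescaled coefficients into bin indices:

def z_score_vectorized(P_cum, c, nb_quant=nb_quant):
    # hypothetical vectorized variant: nearest-bin lookup instead of per-filter np.interp
    ind_quant = np.clip((c * nb_quant).astype(np.int64), 0, nb_quant - 1)
    return P_cum[np.arange(P_cum.shape[0]), ind_quant]

This trades the linear interpolation for a simple quantization, but removes the n_dictionary calls to np.interp at every step of the pursuit.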
