ADAPTIVE BLIND SIGNAL AND IMAGE PROCESSING 
Andrzej CICHOCKI & Shun-ichi AMARI
Contents


Preface xxix
1 Introduction to Blind Signal Processing: Problems and Applications 1
1.1 Problem Formulations – An Overview 2
1.1.1 Generalized Blind Signal Processing Problem 2
1.1.2 Instantaneous Blind Source Separation and Independent Component Analysis 5
1.1.3 Independent Component Analysis for Noisy Data 11
1.1.4 Multichannel Blind Deconvolution and Separation 14
1.1.5 Blind Extraction of Signals 18
1.1.6 Generalized Multichannel Blind Deconvolution – State Space Models 19
1.1.7 Nonlinear State Space Models – Semi-Blind Signal Processing 21
1.1.8 Why State Space Demixing Models? 22
1.2 Potential Applications of Blind and Semi-Blind Signal Processing 23
1.2.1 Biomedical Signal Processing 24
1.2.2 Blind Separation of Electrocardiographic Signals of Fetus and Mother 25
1.2.3 Enhancement and Decomposition of EMG Signals 27
1.2.4 EEG and MEG Data Processing 27
1.2.5 Application of ICA/BSS for Noise and Interference Cancellation in Multisensory Biomedical Signals 29
1.2.6 Cocktail Party Problem 34
1.2.7 Digital Communication Systems 35
1.2.7.1 Why Blind? 37
1.2.8 Image Restoration and Understanding 37
2 Solving a System of Algebraic Equations and Related Problems 43
2.1 Formulation of the Problem for Systems of Linear Equations 44
2.2 Least-Squares Problems 45
2.2.1 Basic Features of the Least-Squares Solution 45
2.2.2 Weighted Least-Squares and Best Linear Unbiased Estimation 47
2.2.3 Basic Network Structure – Least-Squares Criteria 49
2.2.4 Iterative Parallel Algorithms for Large and Sparse Systems 49
2.2.5 Iterative Algorithms with Nonnegativity Constraints 51
2.2.6 Robust Circuit Structure by Using the Iteratively Reweighted Least-Squares Criteria 54
2.2.7 Tikhonov Regularization and SVD 57
2.3 Least Absolute Deviation (1-norm) Solution of Systems of Linear Equations 61
2.3.1 Neural Network Architectures Using a Smooth Approximation and Regularization 62
2.3.2 Neural Network Model for LAD Problem Exploiting Inhibition Principles 64
2.4 Total Least-Squares and Data Least-Squares Problems 67
2.4.1 Problem Formulation 67
2.4.1.1 A Historical Overview of the TLS Problem 67
2.4.2 Total Least-Squares Estimation 69
2.4.3 Adaptive Generalized Total Least-Squares 73
2.4.4 Extended TLS for Correlated Noise Statistics 75
2.4.4.1 Choice of RNN in Some Practical Situations 77
2.4.5 Adaptive Extended Total Least-Squares 77
2.4.6 An Illustrative Example – Fitting a Straight Line to a Set of Points 78
2.5 Sparse Signal Representation and Minimum Fuel Consumption Problem 79
2.5.1 Approximate Solution of Minimum Fuel Problem Using Iterative LS Approach 81
2.5.2 FOCUSS Algorithms 83
3 Principal/Minor Component Analysis and Related Problems 87
3.1 Introduction 87
3.2 Basic Properties of PCA 88
3.2.1 Eigenvalue Decomposition 88
3.2.2 Estimation of Sample Covariance Matrices 90
3.2.3 Signal and Noise Subspaces – AIC and MDL Criteria for their Estimation 91
3.2.4 Basic Properties of PCA 93
3.3 Extraction of Principal Components 94
3.4 Basic Cost Functions and Adaptive Algorithms for PCA 98
3.4.1 The Rayleigh Quotient – Basic Properties 98
3.4.2 Basic Cost Functions for Computing Principal and Minor Components 99
3.4.3 Fast PCA Algorithm Based on the Power Method 101
3.4.4 Inverse Power Iteration Method 104
3.5 Robust PCA 104
3.6 Adaptive Learning Algorithms for MCA 107
3.7 Unified Parallel Algorithms for PCA/MCA and PSA/MSA 110
3.7.1 Cost Function for Parallel Processing 111
3.7.2 Gradient of J(W) 112
3.7.3 Stability Analysis 113
3.7.4 Unified Stable Algorithms 116
3.8 SVD in Relation to PCA and Matrix Subspaces 118
3.9 Multistage PCA for BSS 119
Appendix A. Basic Neural Network Algorithms for Real- and Complex-Valued PCA 122
Appendix B. Hierarchical Neural Network for Complex-valued PCA 125
4 Blind Decorrelation and SOS for Robust Blind Identification 129
4.1 Spatial Decorrelation – Whitening Transforms 130
4.1.1 Batch Approach 130
4.1.2 Optimization Criteria for Adaptive Blind Spatial Decorrelation 132
4.1.3 Derivation of Equivariant Adaptive Algorithms for Blind Spatial Decorrelation 133
4.1.4 Simple Local Learning Rule 136
4.1.5 Gram-Schmidt Orthogonalization 138
4.1.6 Blind Separation of Decorrelated Sources Versus Spatial Decorrelation 139
4.1.7 Bias Removal for Noisy Data 139
4.1.8 Robust Prewhitening – Batch Algorithm 140
4.2 SOS Blind Identification Based on EVD 141
4.2.1 Mixing Model 141
4.2.2 Basic Principles: SD and EVD 143
4.3 Improved Blind Identification Algorithms Based on EVD/SVD 148
4.3.1 Robust Orthogonalization of Mixing Matrices for Colored Sources 148
4.3.2 Improved Algorithm Based on GEVD 153
4.3.3 Improved Two-stage Symmetric EVD/SVD Algorithm 155
4.3.4 BSS and Identification Using Bandpass Filters 156
4.4 Joint Diagonalization – Robust SOBI Algorithms 157
4.4.1 Modified SOBI Algorithm for Nonstationary Sources: SONS Algorithm 160
4.4.2 Computer Simulation Experiments 161
4.4.3 Extensions of Joint Approximate Diagonalization Technique 162
4.4.4 Comparison of the JAD and Symmetric EVD 163
4.5 Cancellation of Correlation 164
4.5.1 Standard Estimation of Mixing Matrix and Noise Covariance Matrix 164
4.5.2 Blind Identification of Mixing Matrix Using the Concept of Cancellation of Correlation 165
Appendix A. Stability of Amari's Natural Gradient and the Atick-Redlich Formula 168
Appendix B. Gradient Descent Learning Algorithms with Invariant Frobenius Norm of the Separating Matrix 171
Appendix C. JADE Algorithm 173
5 Sequential Blind Signal Extraction 177
5.1 Introduction and Problem Formulation 178
5.2 Learning Algorithms Based on Kurtosis as Cost Function 180
5.2.1 A Cascade Neural Network for Blind Extraction of Non-Gaussian Sources with Learning Rule Based on Normalized Kurtosis 181
5.2.2 Algorithms Based on Optimization of Generalized Kurtosis 184
5.2.3 KuicNet Learning Algorithm 186
5.2.4 Fixed-point Algorithms 187
5.2.5 Sequential Extraction and Deflation Procedure 191
5.3 On-line Algorithms for Blind Signal Extraction of Temporally Correlated Sources 193
5.3.1 On-line Algorithms for Blind Extraction Using Linear Predictor 195
5.3.2 Neural Network for Multi-unit Blind Extraction 197
5.4 Batch Algorithms for Blind Extraction of Temporally Correlated Sources 199
5.4.1 Blind Extraction Using a First Order Linear Predictor 201
5.4.2 Blind Extraction of Sources Using Bank of Adaptive Bandpass Filters 202
5.4.3 Blind Extraction of Desired Sources Correlated with Reference Signals 205
5.5 Statistical Approach to Sequential Extraction of Independent Sources 206
5.5.1 Log Likelihood and Cost Function 206
5.5.2 Learning Dynamics 208
5.5.3 Equilibrium of Dynamics 209
5.5.4 Stability of Learning Dynamics and Newton's Method 210
5.6 Statistical Approach to Temporally Correlated Sources 212
5.7 On-line Sequential Extraction of Convolved and Mixed Sources 214
5.7.1 Formulation of the Problem 214
5.7.2 Extraction of Single i.i.d. Source Signal 215
5.7.3 Extraction of Multiple i.i.d. Sources 217
5.7.4 Extraction of Colored Sources from Convolutive Mixture 218
5.8 Computer Simulations: Illustrative Examples 219
5.8.1 Extraction of Colored Gaussian Signals 219
5.8.2 Extraction of Natural Speech Signals from Colored Gaussian Signals 221
5.8.3 Extraction of Colored and White Sources 222
5.8.4 Extraction of Natural Image Signal from Interferences 223
5.9 Concluding Remarks 224
Appendix A. Global Convergence of Algorithms for Blind Source Extraction Based on Kurtosis 225
Appendix B. Analysis of Extraction and Deflation Procedure 227
Appendix C. Conditions for Extraction of Sources Using Linear Predictor Approach 228
6 Natural Gradient Approach to Independent Component Analysis 231
6.1 Basic Natural Gradient Algorithms 232
6.1.1 Kullback-Leibler Divergence – Relative Entropy as a Measure of Stochastic Independence 232
6.1.2 Derivation of Natural Gradient Basic Learning Rules 235
6.2 Generalizations of Basic Natural Gradient Algorithm 237
6.2.1 Nonholonomic Learning Rules 237
6.2.2 Natural Riemannian Gradient in Orthogonality Constraint 239
6.2.2.1 Local Stability Analysis 240
6.3 NG Algorithms for Blind Extraction 242
6.3.1 Stiefel Manifolds Approach 242
6.4 Generalized Gaussian Distribution Model 243
6.4.1 The Moments of the Generalized Gaussian Distribution 248
6.4.2 Kurtosis and Gaussian Exponent 249
6.4.3 The Flexible ICA Algorithm 250
6.4.4 Pearson Model 253
6.5 Natural Gradient Algorithms for Nonstationary Sources 254
6.5.1 Model Assumptions 254
6.5.2 Second Order Statistics Cost Function 255
6.5.3 Derivation of NG Learning Algorithms 255
Appendix A. Derivation of Local Stability Conditions for NG ICA Algorithm (6.19) 258
Appendix B. Derivation of the Learning Rule (6.32) and Stability Conditions for ICA 260
Appendix C. Stability of Generalized Adaptive Learning Algorithm 262
Appendix D. Dynamic Properties and Stability of Nonholonomic NG Algorithms 264
Appendix E. Summary of Stability Conditions 267
Appendix F. Natural Gradient for Non-square Separating Matrix 268
Appendix G. Lie Groups and Natural Gradient for General Case 269
G.0.1 Lie Group Gl(n, m) 270
G.0.2 Derivation of Natural Learning Algorithm for m > n 271
7 Locally Adaptive Algorithms for ICA and their Implementations 273
7.1 Modified Jutten-Hérault Algorithms for Blind Separation of Sources 274
7.1.1 Recurrent Neural Network 274
7.1.2 Statistical Independence 274
7.1.3 Self-normalization 277
7.1.4 Feedforward Neural Network and Associated Learning Algorithms 278
7.1.5 Multilayer Neural Networks 282
7.2 Iterative Matrix Inversion Approach to Derivation of Family of Robust ICA Algorithms 285
7.2.1 Derivation of Robust ICA Algorithm Using Generalized Natural Gradient Approach 288
7.2.2 Practical Implementation of the Algorithms 289
7.2.3 Special Forms of the Flexible Robust Algorithm 291
7.2.4 Decorrelation Algorithm 291
7.2.5 Natural Gradient Algorithms 291
7.2.6 Generalized EASI Algorithm 291
7.2.7 Nonlinear PCA Algorithm 292
7.2.8 Flexible ICA Algorithm for Unknown Number of Sources and their Statistics 293
7.3 Computer Simulations 294
Appendix A. Stability Conditions for the Robust ICA Algorithm (7.50) [332] 300
8 Robust Techniques for BSS and ICA with Noisy Data 305
8.1 Introduction 305
8.2 Bias Removal Techniques for Prewhitening and ICA Algorithms 306
8.2.1 Bias Removal for Whitening Algorithms 306
8.2.2 Bias Removal for Adaptive ICA Algorithms 307
8.3 Blind Separation of Signals Buried in Additive Convolutive Reference Noise 310
8.3.1 Learning Algorithms for Noise Cancellation 311
8.4 Cumulants Based Adaptive ICA Algorithms 314
8.4.1 Cumulants Based Cost Functions 314
8.4.2 Family of Equivariant Algorithms Employing the Higher Order Cumulants 315
8.4.3 Possible Extensions 317
8.4.4 Cumulants for Complex Valued Signals 318
8.4.5 Blind Separation with More Sensors than Sources 318
8.5 Robust Extraction of Arbitrary Group of Source Signals 320
8.5.1 Blind Extraction of Sparse Sources with Largest Positive Kurtosis Using Prewhitening and Semi-Orthogonality Constraint 320
8.5.2 Blind Extraction of an Arbitrary Group of Sources without Prewhitening 323
8.6 Recurrent Neural Network Approach for Noise Cancellation 325
8.6.1 Basic Concept and Algorithm Derivation 325
8.6.2 Simultaneous Estimation of a Mixing Matrix and Noise Reduction 328
8.6.2.1 Regularization 329
8.6.3 Robust Prewhitening and Principal Component Analysis (PCA) 331
8.6.4 Computer Simulation Experiments for Amari-Hopfield Network 331
Appendix A. Cumulants in Terms of Moments 333
9 Multichannel Blind Deconvolution: Natural Gradient Approach 335
9.1 SIMO Convolutive Models and Learning Algorithms for Estimation of Source Signal 336
9.1.1 Equalization Criteria for SIMO Systems 338
9.1.2 SIMO Blind Identification and Equalization via Robust ICA/BSS 340
9.1.3 Feedforward Deconvolution Model and Natural Gradient Learning Algorithm 342
9.1.4 Recurrent Neural Network Model and Hebbian Learning Algorithm 343
9.2 Multichannel Blind Deconvolution with Constraints Imposed on FIR Filters 346
9.3 General Models for Multiple-Input Multiple-Output Blind Deconvolution 349
9.3.1 Fundamental Models and Assumptions 349
9.3.2 Separation-Deconvolution Criteria 351
9.4 Relationships Between BSS/ICA and MBD 354
9.4.1 Multichannel Blind Deconvolution in the Frequency Domain 354
9.4.2 Algebraic Equivalence of Various Approaches 355
9.4.3 Convolution as Multiplicative Operator 357
9.4.4 Natural Gradient Learning Rules for Multichannel Blind Deconvolution (MBD) 358
9.4.5 NG Algorithms for Double Infinite Filters 359
9.4.6 Implementation of Algorithms for Minimum Phase Noncausal System 360
9.4.6.1 Batch Update Rules 360
9.4.6.2 On-line Update Rule 360
9.4.6.3 Block On-line Update Rule 360
9.5 Natural Gradient Algorithms with Nonholonomic Constraints 362
9.5.1 Equivariant Learning Algorithm for Causal FIR Filters in the Lie Group Sense 363
9.5.2 Natural Gradient Algorithm for Fully Recurrent Network 367
9.6 MBD of Nonminimum Phase System Using Filter Decomposition Approach 368
9.6.1 Information Backpropagation 370
9.6.2 Batch Natural Gradient Learning Algorithm 371
9.7 Computer Simulation Experiments 373
9.7.1 The Natural Gradient Algorithm vs. the Ordinary Gradient Algorithm 373
9.7.2 Information Backpropagation Example 375
Appendix A. Lie Group and Riemannian Metric on FIR Manifold 376
A.0.1 Lie Group 377
A.0.2 Riemannian Metric and Natural Gradient in the Lie Group Sense 379
Appendix B. Properties and Stability Conditions for the Equivariant Algorithm 381
B.0.1 Proof of Fundamental Properties and Stability Analysis of Equivariant NG Algorithm (9.126) 381
B.0.2 Stability Analysis of the Learning Algorithm 381
10 Estimating Functions and Superefficiency for ICA and Deconvolution 383
10.1 Estimating Functions for Standard ICA 384
10.1.1 What is an Estimating Function? 384
10.1.2 Semiparametric Statistical Model 385
10.1.3 Admissible Class of Estimating Functions 386
10.1.4 Stability of Estimating Functions 389
10.1.5 Standardized Estimating Function and Adaptive Newton Method 392
10.1.6 Analysis of Estimation Error and Superefficiency 393
10.1.7 Adaptive Choice of Function 395
10.2 Estimating Functions in the Noisy Case 396
10.3 Estimating Functions for Temporally Correlated Source Signals 397
10.3.1 Source Model 397
10.3.2 Likelihood and Score Functions 399
10.3.3 Estimating Functions 400
10.3.4 Simultaneous and Joint Diagonalization of Covariance Matrices and Estimating Functions 401
10.3.5 Standardized Estimating Function and Newton Method 404
10.3.6 Asymptotic Errors 407
10.4 Semiparametric Models for Multichannel Blind Deconvolution 407
10.4.1 Notation and Problem Statement 408
10.4.2 Geometrical Structures on FIR Manifold 409
10.4.3 Lie Group 410
10.4.4 Natural Gradient Approach for Multichannel Blind Deconvolution 410
10.4.5 Efficient Score Matrix Function and its Representation 413
10.5 Estimating Functions for MBD 415
10.5.1 Superefficiency of Batch Estimator 418
Appendix A. Representation of Operator K(z) 419
11 Blind Filtering and Separation Using a State-Space Approach 423
11.1 Problem Formulation and Basic Models 424
11.1.1 Invertibility by State Space Model 427
11.1.2 Controller Canonical Form 428
11.2 Derivation of Basic Learning Algorithms 428
11.2.1 Gradient Descent Algorithms for Estimation of Output Matrices W = [C, D] 429
11.2.2 Special Case – Multichannel Blind Deconvolution with Causal FIR Filters 432
11.2.3 Derivation of the Natural Gradient Algorithm for State Space Model 432
11.3 Estimation of Matrices [A, B] by Information Backpropagation 434
11.4 State Estimator – The Kalman Filter 437
11.4.1 Kalman Filter 437
11.5 Two-Stage Separation Algorithm 439
Appendix A. Derivation of the Cost Function 440
12 Nonlinear State Space Models – Semi-Blind Signal Processing 443
12.1 General Formulation of the Problem 443
12.1.1 Invertibility by State Space Model 447
12.1.2 Internal Representation 447
12.2 Supervised-Unsupervised Learning Approach 448
12.2.1 Nonlinear Autoregressive Moving Average Model 448
12.2.2 Hyper Radial Basis Function Neural Network Model 449
12.2.3 Estimation of Parameters of HRBF Networks Using Gradient Approach 451
13 Appendix A: Mathematical Preliminaries 453
13.1 Matrix Analysis 453
13.1.1 Matrix inverse update rules 453
13.1.2 Some properties of the determinant 454
13.1.3 Some properties of the Moore-Penrose pseudo-inverse 454
13.1.4 Matrix Expectations 455
13.1.5 Differentiation of a scalar function with respect to a vector 456
13.1.6 Matrix differentiation 457
13.1.7 Trace 458
13.1.8 Matrix differentiation of trace of matrices 459
13.1.9 Important Inequalities 460
13.2 Distance measures 462
13.2.1 Geometric distance measures 462
13.2.2 Distances between sets 462
13.2.3 Discrimination measures 463
References 465
14 Glossary of Symbols and Abbreviations 547
Index 552
List of Figures
1.1 Block diagrams illustrating blind signal processing or blind identification problem 3
1.2 (a) Conceptual model of system inverse problem. (b) Model-reference adaptive inverse control. For the switch in position 1 the system performs a standard adaptive inverse by minimizing the norm of the error vector e; for the switch in position 2 the system estimates errors blindly
1.3 Block diagram illustrating the basic linear instantaneous blind source separation (BSS) problem: (a) General block diagram represented by vectors and matrices, (b) detailed architecture. In general, the number of sensors can be larger than, equal to, or less than the number of sources. The number of sources is unknown and can change in time [264, 275] 4
1.4 Basic approaches for blind source separation with some a priori knowledge 9
1.5 Illustration of exploiting spectral diversity in BSS. Three unknown sources, their available mixture and the spectrum of the mixed signal. The sources are extracted by passing the mixed signal through three bandpass filters (BPF) with suitable frequency characteristics depicted in the bottom figure 11
1.6 Illustration of exploiting time-frequency diversity in BSS. (a) Original unknown source signals and available mixed signal. (b) Time-frequency representation of the mixed signal. Due to non-overlapping time-frequency signatures of the sources, we can extract the desired sources by masking and synthesis (inverse transform) 12
1.7 Standard model for noise cancellation in a single channel using a nonlinear adaptive filter or neural network 13
1.8 Illustration of noise cancellation and blind separation – deconvolution problem 14
1.9 Diagram illustrating the single channel convolution and inverse deconvolution process 15
1.10 Diagram illustrating standard multichannel blind deconvolution problem (MBD) 15
1.11 Exemplary models of synaptic weights for the feedforward adaptive system (neural network) shown in Fig. 1.3: (a) Basic FIR filter model, (b) Gamma filter model, (c) Laguerre filter model 17
1.12 Block diagram illustrating the sequential blind extraction of sources or independent components. Synaptic weights wij can be time-variable coefficients or adaptive filters (see Fig. 1.11) 18
1.13 Conceptual state-space model illustrating general linear state-space mixing and self-adaptive demixing model for Dynamic ICA (DICA). The objective of learning algorithms is estimation of a set of matrices {A, B, C, D, L} [287, 289, 290, 1359, 1360, 1361] 20
1.14 Block diagram of a simplified nonlinear demixing NARMA model. For the switch in open position we have a feedforward MA model and for the switch closed we have a recurrent ARMA model 22
1.15 Simplified model of RBF neural network applied for nonlinear semi-blind single channel equalization of binary sources: if the switch is in position 1, we have supervised learning, and unsupervised learning if it is in position 2 23
1.16 Exemplary biomedical applications of blind signal processing: (a) A multi-recording monitoring system for blind enhancement of sources, cancellation of noise, elimination of artifacts and detection of evoked potentials, (b) blind separation of the fetal electrocardiogram (FECG) and maternal electrocardiogram (MECG) from skin electrode signals recorded from a pregnant woman, (c) blind enhancement and independent components of multichannel electromyographic (EMG) signals 26
1.17 Noninvasive multielectrode recording of activation of the brain using EEG or MEG 28
1.18 (a) A subset of the 122 MEG channels. (b) Principal and (c) independent components of the data. (d) Field patterns corresponding to the first two independent components. (e) Superposition of the localizations of the dipoles originating IC1 (black circles, corresponding to the auditory cortex activation) and IC2 (white circles, corresponding to the SI cortex activation) onto magnetic resonance images (MRI) of the subject. The bars illustrate the orientation of the source net current. Results were obtained in collaboration with researchers from the Helsinki University of Technology, Finland [264] 30
1.19 Conceptual models for removing undesirable components like noise and artifacts and enhancing multisensory (e.g., EEG/MEG) data: (a) Using expert decision and hard switches, (b) using soft switches (adaptive nonlinearities in time, frequency or time-frequency domain), (c) using nonlinear adaptive filters and hard switches [286, 1254] 32
1.20 Adaptive filter configured for line enhancement (switches in position 1) and for standard noise cancellation (switches in position 2) 34
1.21 Illustration of the "cocktail party" problem and speech enhancement 35
1.22 Wireless communication scenario 36
1.23 Blind extraction of binary image from superposition of several images [761] 37
1.24 Blind separation of text binary images from a single overlapped image [761] 38
1.25 Illustration of image restoration problem: (a) Original image (unknown), (b) distorted (blurred) available image, (c) restored image using blind deconvolution approach, (d) final restored image obtained after smoothing (post-processing) [329, 330] 39
2.1 Architecture of the Amari-Hopfield continuous-time (analog) model of recurrent neural network: (a) block diagram, (b) detailed architecture 56
2.2 Detailed architecture of the Amari-Hopfield continuous-time (analog) model of recurrent neural network with regularization 63
2.3 Illustration of the optimization criteria employed in the total least-squares (TLS), least-squares (LS) and data least-squares (DLS) estimation procedures for the problem of finding a straight line approximation to a set of points. The TLS optimization assumes that the measurements of both the x and y variables are in error, and seeks an estimate such that the sum of the squared perpendicular distances of the points from the straight line approximation is minimized. The LS criterion assumes that only the measurements of the y variable are in error, so the error associated with each point is parallel to the y axis; LS therefore minimizes the sum of the squared values of such errors. The DLS criterion assumes that only the measurements of the x variable are in error 68
2.4 Straight lines fit to the five points marked by 'x' obtained using: (a) LS (L2 norm), (b) TLS, (c) DLS, (d) the L1 norm, (e) the L∞ norm, and (f) combined results 70
2.5 Straight lines fit to the five points marked by 'x' obtained using the LS, TLS and ETLS methods 80
3.1 Sequential extraction of principal components 96
3.2 On-line on-chip implementation of fast RLS learning algorithm for the principal component estimation 97
4.1 Basic model for blind spatial decorrelation of sensor signals 130
4.2 Illustration of basic transformation of two sensor signals with uniform distributions 131
4.3 Block diagram illustrating the implementation of the learning algorithm (4.31) 135
4.4 Implementation of the local learning rule (4.48) for the blind decorrelation 137
4.5 Illustration of processing of signals by using a bank of bandpass filters: (a) Filtering a vector x of sensor signals by a bank of subband filters, (b) typical frequency characteristics of bandpass filters 152
4.6 Comparison of performance of various algorithms as a function of the signal to noise ratio (SNR) [223, 235] 162
4.7 Blind identification and estimation of sparse images: (a) Original sources, (b) mixed available images, (c) reconstructed images using the proposed algorithm (4.166)-(4.167) 168
5.1 Block diagrams illustrating: (a) Sequential blind extraction of sources and independent components, (b) implementation of extraction and deflation principles. LAE and LAD mean learning algorithm for extraction and deflation, respectively 180
5.2 Block diagram illustrating blind LMS algorithm 184
5.3 Implementation of BLMS and KuicNet algorithms 187
5.4 Block diagram illustrating the implementation of the generalized fixed-point learning algorithm developed by Hyvärinen-Oja [595]; ⟨·⟩ denotes the averaging operator. In the special case of optimization of the standard kurtosis, g(y_1) = y_1^3 and g'(y_1) = 3 y_1^2 189
5.5 Block diagram illustrating implementation of learning algorithm for temporally correlated sources 194
5.6 The neural network structure for one-unit extraction using a linear predictor 196
5.7 The cascade neural network structure for multi-unit extraction 198
5.8 The conceptual model of single processing unit for extraction of sources using adaptive bandpass filter 202
5.9 Frequency characteristics of 4th order Butterworth bandpass filter with adjustable center frequency and fixed bandwidth 204
5.10 Exemplary computer simulation results for mixture of three colored Gaussian signals, where sj, x1j, and yj stand for the jth source signals, whitened mixed signals, and extracted signals, respectively. The source signals were extracted by employing the learning algorithm (5.73)-(5.74) with L = 5 [1142] 220
5.11 Exemplary computer simulation results for mixture of natural speech signals and a colored Gaussian noise, where sj and x1j stand for the jth source signal and mixed signal, respectively. The signals yj were extracted by using the neural network shown in Fig. 5.7 and the associated learning algorithm (5.91) with q = 1, 5, 12 221
5.12 Exemplary computer simulation results for mixture of three non-i.i.d. signals and two i.i.d. random sequences, where sj, x1j, and yj stand for the jth source signals, mixed signals, and extracted signals, respectively. The learning algorithm (5.81) with L = 10 was employed [1142] 222
5.13 Exemplary computer simulation results for mixture of three 512 × 512 image signals, where sj and x1j stand for the jth original images and mixed images, respectively, and y1 is the image extracted by the extraction processing unit shown in Fig. 5.6. The learning algorithm (5.91) with q = 1 was employed [68, 1142] 223
6.1 Block diagram illustrating standard independent component analysis (ICA) and blind source separation (BSS) problem 232
6.2 Block diagram of fully connected recurrent network 237
6.3 (a) Plot of the generalized Gaussian pdf for various values of parameter r (with σ² = 1) and (b) corresponding nonlinear activation functions 244
6.4 (a) Plot of generalized Cauchy pdf for various values of parameter r (with σ² = 1) and (b) corresponding nonlinear activation functions 248
6.5 The plot of kurtosis κ4(r) versus Gaussian exponent r: (a) for leptokurtic signal; (b) for platykurtic signal [232] 250
6.6 (a) Architecture of feedforward neural network. (b) Architecture of fully connected recurrent neural network 256
7.1 Block diagrams: (a) Recurrent and (b) feedforward neural network for blind source separation 275
7.2 (a) Neural network model and (b) implementation of the Jutten-Hérault basic continuous-time algorithm for two channels 276
7.3 Block diagram of the continuous-time locally adaptive learning algorithm (7.23) 280
7.4 Detailed analog circuit illustrating implementation of the locally adaptive learning algorithm (7.24) 281
7.5 (a) Block diagram illustrating implementation of continuous-time robust learning algorithm, (b) illustration of implementation of the discrete-time robust learning algorithm 283
7.6 Various configurations of multilayer neural networks for blind source separation: (a) Feedforward model, (b) recurrent model, (c) hybrid model (LA means learning algorithm) 284
7.7 Computer simulation results for Example 1: (a) Waveforms of primary sources s1, s2, s3, (b) sensor signals x1, x2, x3 and (c) estimated sources y1, y2, y3 using the algorithm (7.32) 295
7.8 Exemplary computer simulation results for Example 2 using the algorithm (7.25): (a) Waveforms of primary sources, (b) noisy sensor signals and (c) reconstructed source signals 297
7.9 Blind separation of speech signals using the algorithm (7.80): (a) Primary source signals, (b) sensor signals, (c) recovered source signals 298
7.10 (a) Eight ECG signals separated into: Four maternal signals, two fetal signals and two noise signals. (b) Detailed plots of the extracted fetal ECG signals. The mixed signals were obtained from 8 electrodes located on the abdomen of a pregnant woman. The signals are 2.5 seconds long, sampled at 200 Hz 299
8.1 Ensemble-averaged value of the performance index for uncorrelated measurement noise in the first example: the dotted line represents the original algorithm (8.8) with noise, the dashed line the bias removal algorithm (8.10) with noise, and the solid line the original algorithm (8.8) without noise [404] 309
8.2 Conceptual block diagram of mixing and demixing systems with noise cancellation. It is assumed that reference noise is available 311
8.3 Block diagrams illustrating multistage noise cancellation and blind source separation: (a) Linear model of convolutive noise, (b) more general model of additive noise modelled by nonlinear dynamical systems (NDS) and adaptive neural networks (NN); LA1 and LA2 denote learning algorithms performing the LMS or backpropagation supervised learning rules whereas LA3 denotes a learning algorithm for BSS 313
8.4 Analog Amari-Hopfield neural network architecture for estimating the separating matrix and noise reduction 328
8.5 Architecture of Amari-Hopfield recurrent neural network for simultaneous noise reduction and mixing matrix estimation: Conceptual discrete-time model with optional PCA 329
8.6 Detailed architecture of the discrete-time Amari-Hopfield recurrent neural network with regularization 330
8.7 Exemplary simulation results for the neural network in Fig. 8.4 for signals corrupted by Gaussian noise. The first three signals are the original sources, the next three signals are the noisy sensor signals, and the last three signals are the on-line estimated source signals using the learning rule given in (8.92)-(8.93). The horizontal axis represents time in seconds 332
8.8 Exemplary simulation results for the neural network in Fig. 8.4 for impulsive noise. The first three signals are the mixed sensor signals contaminated by the impulsive (Laplacian) noise, the next three signals are the source signals estimated using the learning rule (8.8) and the last three signals are the on-line estimated source signals using the learning rule (8.92)-(8.93) 333
9.1 Conceptual models of single-input/multiple-output (SIMO) dynamical system: (a) Recording an unknown acoustic signal distorted by reverberation with an array of microphones, (b) array of antennas receiving distorted versions of the transmitted signal, (c) illustration of oversampling principle for two channels 337
9.2 Functional diagrams illustrating SIMO blind equalization models: (a) Feedforward model, (b) recurrent model, (c) detailed structure of the recurrent model 344
9.3 Block diagrams illustrating the multichannel blind deconvolution problem: (a) Recurrent neural network, (b) feedforward neural network (for simplicity, only models for two channels are shown) 347
9.4 Illustration of the multichannel deconvolution models: (a) Functional block diagram of the feedforward model, (b) architecture of feedforward neural network (each synaptic weight Wij(z, k) is an FIR or stable IIR filter), (c) architecture of the fully connected recurrent neural network 350
9.5 Exemplary architectures for two-stage multichannel deconvolution 353
9.6 Illustration of the Lie group's inverse of an FIR filter, where H(z) is an FIR filter of length L = 50, W(z) is the Lie group's inverse of H(z), and G(z) = W(z)H(z) is the composite transfer function 367
9.7 Cascade of two FIR filters (noncausal and causal) for blind deconvolution of nonminimum phase system 369
9.8 Illustration of the information backpropagation learning 371
9.9 Simulation results of two channel blind deconvolution for SIMO system in Example 9.2: (a) Parameters of mixing filters (H1(z), H2(z)) and estimated parameters of adaptive deconvoluting filters (W1(z), W2(z)), (b) coefficients of global subchannels (G1(z) = W1(z)H1(z), G2(z) = W2(z)H2(z)), (c) parameters of global system (G(z) = G1(z) + G2(z)) 374
9.10 Typical performance index MISI of the natural gradient algorithm for multichannel blind deconvolution in comparison with the standard gradient algorithm [1369] 375
9.11 The parameters of G(z) of the causal system in Example 9.3: (a) The initial state, (b) after 3000 iterations [1368, 1374] 376
9.12 Zeros and poles distributions of the mixing ARMA model in Example 9.4 377
9.13 The distribution of parameters of the global transfer function G(z) of the noncausal system in Example 9.4: (a) The initial state, (b) after convergence [1369] 378
11.1 Conceptual block diagram illustrating the general linear state-space mixing and self-adaptive demixing model for blind separation and filtering. The objective of learning algorithms is the estimation of a set of matrices {A, B, C, D, L} [287, 289, 290, 1359, 1360, 1361, 1368] 425
11.2 Kalman filter for noise reduction 438
12.1 Typical nonlinear dynamical models: (a) The Hammerstein system, (b) the Wiener system and (c) the sandwich system 444
12.2 The simple nonlinear dynamical model which leads to the standard linear filtering and separation problem if the nonlinear functions can be estimated and their inverses exist 445
12.3 Nonlinear state-space models for multichannel semi-blind separation and filtering: (a) Generalized nonlinear model, (b) simplified nonlinear model 446
12.4 Block diagram of a simplified nonlinear demixing NARMA model. For the switch open, we have a feedforward nonlinear MA model, and for the switch closed we have a recurrent nonlinear ARMA model 448
12.5 Conceptual block diagram illustrating HRBF neural network model employed for nonlinear semi-blind separation and filtering: (a) Block diagram, (b) detailed neural network model 450
12.6 Simplified model of HRBF neural network for nonlinear semi-blind single channel equalization, assuming binary sources: if the switch is in position 1, we have supervised learning, and unsupervised learning if it is in position 2 451
List of Tables
2.1 Basic robust loss functions ρ(e) and corresponding influence functions Ψ(e) = dρ(e)/de 55
3.1 Basic cost functions whose maximization leads to adaptive PCA algorithms 101
3.2 Basic adaptive learning algorithms for principal component analysis (PCA) 102
3.3 Basic adaptive learning algorithms for minor component analysis (MCA) 109
3.4 Parallel adaptive algorithms for PSA/PCA 114
3.5 Adaptive parallel MSA/MCA algorithms for complex valued data 116
A.1 Fast implementations of PSA algorithms for complex-valued signals and matrices 124
5.1 Cost functions for sequential blind source extraction one by one, y = w^T x. (Some criteria require prewhitening of sensor data, i.e., Rxx = I or AA^T = I) 216
6.1 Typical pdf q(y) and corresponding normalized activation functions f(y) = -d log q(y)/dy 246
8.1 Basic cost functions for ICA/BSS algorithms without prewhitening 319
8.2 Family of equivariant learning algorithms for ICA for complex-valued signals 321
8.3 Typical cost functions for blind signal extraction of a group of e sources (1 < e < n) with prewhitening of sensor signals, i.e., AA^T = I 324
8.4 BSE algorithm based on cumulants without prewhitening [331] 325
9.1 Relationships between instantaneous blind source separation and multichannel blind deconvolution for complex-valued signals and parameters 361
11.1 Family of adaptive learning algorithms for state-space models 435