## ADAPTIVE BLIND SIGNAL AND IMAGE PROCESSING |

Andrzej CICHOCKI & Shun-ichi
AMARI

Blind Signal Processing (BSP) is now one of the hottest and
exciting topics in the fields of neural computation, advanced statistics, and
signal processing with solid theoretical foundations and many potential
applications. In fact, BSP has become a very important topic of research and development in many areas, especially
biomedical engineering, medical imaging, speech enhancement, remote
sensing, communication systems, exploration
seismology, geophysics, econometrics, data mining, neural networks, etc. The blind signal processing techniques
principally do not use any training data and do not assume *a priori* knowledge about parameters of convolutive, filtering and
mixing systems. BSP includes three major areas: Blind Signal Separation and
Extraction (BSS/BSE), Independent Component Analysis (ICA), and Multichannel
Blind Deconvolution (MBD) and Equalization which are the main subjects of the
book. In this chapter are formulated fundamental problems of the BSP, given
important definitions and described basic mathematical and physical models. Moreover,
several potential and promising applications are reviewed.

Keywords: Blind Source Separation (BSS), Blind Source
Extraction (BSE), Independent Component Analysis (ICA), Multichannel Blind
Deconvolution (MBD), Basic definitions and models, Applications.

In modern signal and image processing fields
like biomedical engineering, computer tomography (image reconstruction from
projections), automatic control, robotics, speech and communication, linear parametric
estimation, models such as auto-regressive moving-average (ARMA) and linear
prediction (LP) have been extensively utilized. In fact, such models can be
mathematically described by an overdetermined system of linear algebraic
equations. Such systems of equations are often contaminated by noise or errors,
thus the problem of finding an optimal and robust with respect noise solution
arises if some a* priori* information
about the error is available. On the other hand, wide classes of extrapolation,
reconstruction, estimation, approximation, interpolation and inverse problems
can be converted to minimum norm problems of solving underdetermined systems of
linear equations. Generally speaking, in signal processing applications, the
overdetermined system of linear equations describes filtering, enhancement,
deconvolution and identification problems, while the underdetermined case
describes inverse and extrapolation problems. This chapter provides a tutorial
to the problem of solving large overdetermined and underdetermined systems of
linear equations, especially when there is an uncertainty in parameter values
and/or the systems are contaminated by noise. A special emphasis is placed in
on-line fast adaptive and iterative algorithms for arbitrary noise statistics. This
chapter also gives several illustrative examples that demonstrate the
characteristics of the developed novel algorithms.

Keywords: Least
Squares (LS) problem, Extended Total Least Squares (TLS), Data Least Squares

(DLS), Least Absolute Deviation (LAD), 1-norm solution, Solving
of system of linear equations with non-negativity constraints, Non-negative
Matrix Factorization (NMF), Regularization, Sparse signal representation, Sparse
solutions, Minimum Fuel Problem (MFP), Focuss algorithms, Amari-Hopfield
recurrent neural networks for on-line solutions.

Neural networks with unsupervised learning algorithms organize themselves in such a way that they can detect or extract useful features, regularities, correlations of data or signals or separate or decorrelate some signals with little or no prior knowledge of the desired results. Normalized (constrained) Hebbian and anti-Hebbian learning rules are simple variants of basic unsupervised learning algorithms; in particular, learning algorithms for principal component analysis (PCA), singular value decomposition (SVD) and minor component analysis (MCA) belong to this class of unsupervised rules. Recently, many efficient and powerful adaptive algorithms have been developed for PCA, MCA and SVD and their extensions The main objective of this chapter is a derivation and overview of the most important adaptive algorithms.

Keywords: PCA, MCA,
SVD, Subspace methods, Automatic dimensionality reduction, AIC and MDL
criteria, Power method, Robust PCA, Multistage PCA for blind source separation.

Temporal, spatial and spatio-temporal decorrelations
play important roles in signal processing. These techniques are based only on
second-order statistics (SOS). They are the basis for modern subspace methods
of spectrum analysis and array processing and often used in a preprocessing
stage in order to improve convergence properties of adaptive systems, to
eliminate redundancy or to reduce noise. Spatial decorrelation or prewhitening is
often considered as a necessary (but not sufficient) condition for the stronger
stochastic independence criteria. After prewhitening, the BSS or

Keywords: Robust
whitening, Robust orthogonalization, Gram-Schmidt orthogonalization,, Second
order statistics (SOS) blind identification, Multistage EVD/SVD for BSS,
Simultaneous diagonalization, Joint approximative diagonalization, SOBI and
JADE algorithms, Blind source separation for non-stationary signals, Natural
gradient, Atick-Redlich formula, Gradient descent with Frobenius norm
constraint.

There are three main objectives of this chapter:

(a)
To present simple neural networks
(processing units) and propose unconstrained extraction and deflation
criteria that do not require either *a
priori* knowledge of source signals or the whitening of mixed signals. These
criteria lead to simple, efficient, purely local and biologically plausible
learning rules (e.g., Hebbian/anti-Hebbian type learning algorithms).

(b) To prove that the proposed criteria have no spurious equilibriums. In other words, the most learning rules discussed in this chapter always reach desired solutions, regardless of initial conditions (see appendixes for proof).

(c) To demonstrate with computer simulations the validity and high performance for practical use of the derived learning algorithms.

In this chapter there are used two different models and approaches. The first approach is based on higher order statistics (HOS) which assume that sources are mutually statistically independent and they are non-Gaussian (expect at most one) and as criteria of independence, we will use some measures of non-Gaussianity. The second approach based on the second order statistics (SOS) assumes that source signals have some temporal structure, i.e., the sources are colored with different autocorrelation functions or equivalently different shape spectra. Special emphasis will be given to blind source extraction (BSE) in the case when sensor signals are corrupted by additive noise using the bank of band pass filters.

Keywords: Basic
criteria for blind source extraction, Kurtosis, Gray function, Cascade neural
network, Deflation procedures, KuickNet, Fixed-point algorithms, Blind
extraction with reference signal, Linear predictor and band-pass filters for
BSS, Statistical analysis, Log likelihood, Extraction of sources from
convolutive mixture, Stability, Global convergence.

In this chapter, fundamental signal processing and information theoretic approaches are presented together with learning algorithms for the problem of adaptive blind source separation (BSS) and Independent Component Analysis (ICA). We discuss recent developments of adaptive learning algorithms based on the natural gradient approach in the general linear, orthogonal and Stiefel manifolds. Mutual information, Kullback-Leibler divergence, and several promising schemes are discussed and reviewed in this chapter, especially for signals with various unknown distributions and unknown number of sources. Emphasis is given to an information-theoretical and information-geometrical unifying approach, adaptive filtering models and associated on-line adaptive nonlinear learning algorithms. We discuss the optimal choice of nonlinear activation functions for various distributions, e.g., Gaussian, Laplacian, impulsive and uniformly-distributed signals based on a generalized-Gaussian-distributed model. Furthermore, families of efficient and flexible algorithms that exploit non-stationarity of signals are also derived.

Keywords:
Kullback-Leibler divergence, Natural gradient concept, Derivation and analysis
of

natural gradient
algorithms, Local stability analysis, Nonholonomic constraints, Generalized
Gaussian and Cauchy distributions, Pearson model. Natural gradient algorithms
for non-stationary sources. Extraction of arbitrary group of sources, Semi-orthogonality
constraints, Stiefel manifolds.

The main purpose of this chapter is to describe and overview models and to present a family of practical and efficient associated adaptive or locally adaptive learning algorithms which have special advantages of efficiency and/or simplicity and straightforward electronic implementations. Some of the described algorithms have special advantages in the cases of noisy, badly scaled or ill-conditioned signals. The developed algorithms are extended for the case when the number of sources and their statistics are unknown. Finally, problem of an optimal choice of nonlinear activation function and general local stability conditions are also discussed. In particular, we focus on simple locally adaptive Hebbian/anti-Hebbian learning algorithms and their implementations using multi-layer neural networks are proposed.

Keywords: Modified
Jutten-Herault algorithm, robust local algorithms for ICA/BSS, Multi-layer
network for

In this chapter we focus mainly on approaches to blind separation of sources when the measured signals are contaminated by large additive noise. We extend existing adaptive algorithms with equivariant properties in order to considerably reduce the bias caused by measurement noise for the estimation of mixing and separating matrices. Moreover, we propose dynamical recurrent neural networks for simultaneous estimation of the unknown mixing matrix, source signals and reduction of noise in the extracted output signals. The optimal choice of nonlinear activation functions for various noise distributions assuming a generalized-Gaussian-distributed noise model is also discussed. Computer simulations of selected techniques are provided that confirms their usefulness and good performance. The main objective of this chapter is to present several approaches and derive learning algorithms that are more robust with respect to noise than the techniques described in the previous chapters or that can reduce the noise in the estimated output vector of independent components

Keywords: Bias removal
techniques, Wiener filters with references convolutive noise, Noise
cancellation and reduction, Cumulants based cost functions and equivariant
algorithms, Blind source separation with more sensors than sources, Robust
extraction of arbitrary group of sources, Recurrent neural network for noisy
data, Amari-Hopfield neural network.

The main objective of this chapter is to
review and extend existing adaptive natural gradient algorithms for various
multichannel blind deconvolution models.
Blind separation/deconvolution of source signals has been a subject under
consideration for more than two decades. There are significant potential
applications of blind separation/deconvolution in various fields, for example,
wireless telecommunication systems, sonar and radar systems, audio and
acoustics, image enhancement and biomedical signal processing (EEG/MEG
signals). In these applications, single or multiple unknown but independent temporal
signals propagate through a mixing and filtering medium. The blind source
separation/deconvolution problem is concerned with recovering independent
sources from sensor outputs without assuming any *a priori* knowledge of the original signals, except certain
statistical features. In this chapter,
we present using various models and assumptions, relatively simple and
efficient, adaptive and batch algorithms for blind deconvolution and
equalization for single-input/multiple-output (SIMO) and multiple-input/multiple-output
(MIMO) dynamical minimum phase and non-minimum phase systems. The basic
relationships between standard ICA/BSS (Independent Component Analysis and
Blind Source Separation) and multichannel blind deconvolution are discussed in
detail. They enable us to extend
algorithms derived in the previous chapters. In particular, the natural gradient
approaches for instantaneous mixture to convolutive dynamical models. We also
derive a family of equivariant algorithms and analyze their stability and
convergence properties. Furthermore, a Lie group and Riemannian metric are introduced
on the manifold of FIR filters and using the isometry of the Riemannian metric,
the natural gradient on the FIR manifold is described. Based on the
minimization of mutual information, we present then a natural gradient
algorithm for the causal minimum phase finite impulse response (FIR)
multichannel filter. Using information back-propagation, we also discuss an
efficient implementation of the learning algorithm for the non-causal FIR
filters. Computer simulations are also presented to illustrate the validity and
good learning performance of the described algorithms.

Keywords: Basic models
for blind equalization and multichannel deconvolution, Fractionally Sampled system, SIMO and MIMO
models, Equalization criteria, Separation-deconvolution criteria, Relationships
between BSS/ICA and multichannel blind deconvolution (MBD), Natural gradient
algorithms for MBD, Information Back-propagation.

Chapter 10 introduces the method of estimating
functions to elucidate the common structures in most of the ICA/BSS and MBD
algorithms. We use information geometry for this purpose, and define estimating
functions in semiparametric statistical models which include unknown functions
as parameters. Differences in most existing algorithms are only in the choices
of estimating functions. We then give error analysis and stability analysis in
terms of estimating functions. This
makes it possible to design various adaptive methods for choosing unknown
parameters included in estimating functions, which control accuracy and
stability. The *et al.* We present the geometrical properties of the manifold of the
FIR filters based on the Lie group structure and formulate the multichannel
blind deconvolution problem within the framework of the semiparametric model
deriving a family of estimating functions for blind deconvolution. We then
analyze the efficiency of the batch estimator based on estimating function - obtaining
its convergence rate. Finally, we show that both batch learning and on-line natural
gradient learning are superefficient under given nonsingular conditions.

Keywords: Estimating functions, Semiparametric statistical models,
Superefficiency, Likelihood, Score functions, Batch estimator, Information
geometry, Stability analysis.

The state-space description of dynamical
systems is a powerful and flexible generalized model for blind separation and
deconvolution or more generally for filtering and separation. There are several
reasons why the state-space models are advantageous for blind separation and
filtering. Although transfer function models in the *Z*-domain or the frequency domain are equivalent to the state-space
models in the time domain for any linear, stable time-invariant dynamical
system, using transfer function directly it is difficult to exploit internal
representation of real dynamical systems. The main advantage of the state-space
description is that it not only gives the internal description of a system, but
there are various equivalent canonical types of state-space realizations for a
system, such as balanced realization and observable canonical forms. In
particular, it is possible to parameterize some specific classes of models
which are of interest in applications. In addition, it is relatively easy to
tackle the stability problem of state-space systems using the Kalman filter. Moreover,
the state-space model enables a much more general description than the standard
finite impulse response (FIR) convolutive filtering models discussed in the
Chapter 9. In fact, all the known filtering models, such as the AR, MA, ARMA,
ARMAX and Gamma filtering, could also be considered as special cases of
flexible state-space models. In this chapter, we briefly review adaptive
learning algorithms based on the natural gradient approach and give some perspective
and new insight into multiple-input multiple-output blind separation and
filtering in the state-space framework.

Keywords: Linear basic
state space model, Natural gradient algorithm for state space model, Estimation
of output and state space matrices, Comparison of various algorithms, Kalman
filter, Two stage blind separation/filtering approach.

In this chapter we attempt to extend and
generalize the results discussed in the previous chapters to nonlinear
dynamical models. However, the problem is not only very challenging but
intractable in the general case without
*a priori* knowledge about the
mixing and filtering nonlinear process. Therefore, in this chapter we consider
very briefly only some simplified nonlinear models. In addition, we assume that
some information about the mixing and separating system and source signals is
available. In practice, special
nonlinear dynamical models are considered in order to simplify the problem and
solve it efficiently for specific applications. Specific examples include the
Wiener model, the Hammerstein model and Nonlinear Autoregressive Moving Average
models.

Keywords: Semi-blind
separation and filtering, Wiener and Hammerstein models, Nonlinear
Autoregressive Moving Average (NARMA) model, Hyper radial basis function (HRBF)
neural network.

In this appendix some mathematical background needed for complete understanding of the text are quickly reviewed. Many useful definitions, formulas for matrix algebra and matrix differentiation are given

Appendix B contains the list of basic symbols, notation and abbreviations used in the book.

The list of references contains more than 1350 publications.

Accompanying CD-ROM includes electronic, interactive
version of the book with hyperlinks, full-color figures and text. The black and
white electronic version with hyperlinks is also provided.

In addition MATLAB
user friendly demo package for performing