A. Benveniste et al., Adaptive Algorithms and Stochastic Approximations © Springer-Verlag Berlin Heidelberg 1990

Adaptive Algorithms and Stochastic Approximations || Introduction

Introduction

Why "adaptive algorithms and stochastic approximations"?

The use of adaptive algorithms is now very widespread across such varied applications as system identification, adaptive control, transmission systems, adaptive filtering for signal processing, and several aspects of pattern recognition. Numerous, very different examples of applications are given in the text. The success of adaptive algorithms has inspired an abundance of literature, and more recently a number of significant works such as the books of Ljung and Soderstrom (1983) and of Goodwin and Sin (1984).

In general, these works consider primarily the notion of an adaptive system, which is composed of:

1. The object upon which processing is carried out: control system, modelling system, transmission system, ....

2. The so-called estimation process.

In so doing, they implicitly address the modelling of the system as a whole. This approach has naturally led to the introduction of boundaries between

• System identification from the control point of view.

• Signal modelling.

• Adaptive filtering.

• The myriad of applications to pattern recognition: adaptive quantisation, ....

These boundaries echo the classes of models which conveniently describe each corresponding system. For example, multivariable linear systems certainly have an important role to play in system identification, although they are scarcely ever met in adaptive filtering, and never appear in most pattern recognition applications. On the other hand, the latter applications call for models which have no relevance to the linear systems widely used in automatic control theory. It would therefore be foolish to try to present a general theory of adaptive systems which created a framework sufficiently broad to encompass all models and algorithms simultaneously.

However, in our opinion and experience, these problems have a major common component: namely the use (once all the modelling problems have been resolved) of adaptive algorithms. This topic, which we shall now study more specifically, is the counterpart of the notion of stochastic approximation as found in statistical literature. The juxtaposition of these two expressions in the title is an exact statement of our ambition to produce a reference work, both for engineers who use these algorithms and for probabilists or statisticians who would like to study stochastic approximations in terms of problems arising from real applications.

Adaptive algorithms.

The function of these algorithms is to adjust a parameter vector, which we shall denote generically by $\theta$, with a view to an objective specified by the user: system control, identification, adjustment, .... This vector $\theta$ is the user's only interface with the system and its definition requires an initial modelling phase.

In order to tune this parameter $\theta$, the user must be able to monitor the system. Monitoring is effected via a so-called state vector, which we shall denote by $X_n$, where $n$ refers to the time of observation of the system. This state vector might be:

• The set consisting of the regression vector and an error signal, in the classical case of system identification, as for example presented in (Ljung and Soderstrom 1983) or in numerous adaptive filtering problems.

• The sample signal observed at the instant $n$, in the case of adaptive quantisation, ....

In all these cases, the rule used to update $\theta$ will typically be of the form

$$\theta_n = \theta_{n-1} + \gamma_n H(\theta_{n-1}, X_n)$$

where $\gamma_n$ is a sequence of small gains and $H(\theta, X)$ is a function whose specific determination is one of the main aims of this book.
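As a minimal sketch of this update rule, write θ for the parameter, γ_n for the gain and X_n for the state, and assume the simplest possible choice H(θ, X) = X − θ with a scalar θ and gains γ_n = 1/n. These choices, and the names `adapt` and `mean_field`, are illustrative assumptions and are not taken from the text. Under them the recursion reduces to the running sample mean:

```python
from typing import Callable, Iterable


def adapt(theta0: float,
          observations: Iterable[float],
          gain: Callable[[int], float],
          H: Callable[[float, float], float]) -> float:
    """Iterate theta_n = theta_{n-1} + gamma_n * H(theta_{n-1}, X_n)."""
    theta = theta0
    for n, x in enumerate(observations, start=1):
        theta = theta + gain(n) * H(theta, x)
    return theta


def mean_field(theta: float, x: float) -> float:
    # Hypothetical choice of H: push theta towards the latest observation.
    return x - theta


samples = [2.0, 4.0, 9.0, 1.0]
# With gain 1/n the result agrees with the sample mean up to rounding.
estimate = adapt(0.0, samples, gain=lambda n: 1.0 / n, H=mean_field)
```

Other objectives correspond to other choices of H and of the state X_n; only the outer loop stays the same.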

Aims of the book.

These are twofold:

1. To provide the user of adaptive algorithms with a guide to their analysis and design, which is as clear and as comprehensive as possible.

2. To accompany this guide with a presentation of the fundamental underlying mathematics.

In seeking to reach these objectives, we come up against two contradictory demands. On the one hand, adaptive algorithms must, generally speaking, be easy to use and accessible to a large class of engineers: this requires the guide to use a minimal technical arsenal. On the other hand, an honest assessment of practices currently found in adaptive algorithm applications demands that we obtain fine results using assumptions which, in order to be realistic, are perforce complicated. This remark has led many authors to put forward the case for a similar guide, modestly restricted to the application areas of interest to themselves.

We have preferred to resolve this difficulty in another way, and it is this prejudice which lends originality to the book, which is, accordingly, divided into two parts, each of a very different character.

Part II presents the mathematical foundations of adaptive systems theory from a modern point of view, without shying away from the difficulty of the questions to be resolved: in it we shall make great use of the basic notions of conditioning, Markov chains and martingales. Assumptions will be stated in detail and proofs will be given in full. Part II contains:

1. "Law of large numbers type" convergence results where, so as not to make the proofs too cumbersome, the assumptions include minor constraints on the temporal properties of the state vector $X_n$ and on the regularity of the function $H(\theta, X)$, and quite severe restrictions upon the moments of $X_n$ (Chapter 1).

2. An illustration of the previous results, first with classical examples, then with a typical, reputedly difficult, example (Chapter 2).

3. A refinement of the results of Chapter 1 with weaker assumptions on the moments (Chapter 3).

4. The introduction of diffusion approximations ("central limit theorem type" results) which allow a detailed evaluation of the asymptotic behaviour of adaptive algorithms (Chapter 4).

Many of the results and proofs in Part II are original. They cover the case of algorithms with decreasing gain, as well as that of algorithms with constant gain, the latter being the most widely used in practice.
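To make the difference between the two gain regimes concrete, consider the scalar illustration H(θ, X) = X − θ, with θ the parameter and γ_n the gain. This choice is an assumption made for simplicity and is not an example taken from the text. Unrolling the recursion θ_n = θ_{n-1} + γ_n (X_n − θ_{n-1}) gives:

```latex
\begin{align*}
\gamma_n = \tfrac{1}{n}: \quad
  & \theta_n = \frac{1}{n}\sum_{k=1}^{n} X_k, \\
\gamma_n = \gamma \in (0,1): \quad
  & \theta_n = (1-\gamma)^n\,\theta_0
    + \gamma\sum_{k=1}^{n}(1-\gamma)^{n-k} X_k .
\end{align*}
```

In this toy case the decreasing gain weights all past observations equally, whereas the constant gain forgets old observations geometrically and therefore keeps reacting to recent data.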

Part I concentrates on the presentation of the guide and on its illustration by various examples. Whilst not totally elementary in a mathematical sense, Part I is not encumbered with technical assumptions, and thus it is able to highlight the essential mathematical difficulties which must be faced if one is to make good use of adaptive algorithms. On the other hand, we wanted the guide to provide as full an introduction as possible to good usage of adaptive algorithms. Thus we discuss:

1. The convergence of adaptive algorithms (in the sense of the law of large numbers) and the consequence of this on algorithm analysis and design (Chapters 1 and 2).

2. The asymptotic behaviour of algorithms in the "ideal" case where the phenomenon upon which the user wishes to operate is time invariant (Chapter 3).

3. The behaviour of the algorithms when the true system evolves slowly in time and the consequences of this on algorithm design (Chapter 4).

4. The monitoring of abrupt changes in the true system, or the non-conformity of the true system to the model in use (Chapter 5).

The final two points are central to the study of adaptive algorithms (these algorithms arose because true systems are time-varying), yet, to the best of our knowledge, they have never been systematically discussed in any text on adaptive algorithms.
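The effect of a slowly evolving true system on the choice of gain can be illustrated with a small simulation. The sketch below is only a toy under stated assumptions: a scalar level that drifts linearly, Gaussian observation noise, the update H(θ, X) = X − θ, and hypothetical values for `drift`, the noise level and the constant gain 0.1. None of these come from the book.

```python
import random


def track(signal, gain):
    """Run theta_n = theta_{n-1} + gamma_n * (X_n - theta_{n-1}) over signal."""
    theta = 0.0
    for n, x in enumerate(signal, start=1):
        theta += gain(n) * (x - theta)
    return theta


rng = random.Random(0)           # fixed seed so the run is reproducible
n_steps = 2000
drift = 0.01                     # hypothetical slow drift per time step
true_level = [drift * n for n in range(1, n_steps + 1)]
observed = [m + rng.gauss(0.0, 0.1) for m in true_level]

final_truth = true_level[-1]
# Decreasing gain 1/n averages the whole past and lags far behind the drift.
err_decreasing = abs(track(observed, lambda n: 1.0 / n) - final_truth)
# Constant gain forgets old data and stays close to the current level.
err_constant = abs(track(observed, lambda n: 0.1) - final_truth)
```

In this setting the constant-gain estimate ends much closer to the current level than the decreasing-gain estimate, at the price of a residual fluctuation that does not vanish.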

Whilst the two parts of the book overlap to a certain extent, they take complementary views of the areas of overlap. In each case, we cross-reference the informal results of Part I with the corresponding theorems of Part II, and the examples of Part I with their mathematical treatment in Part II.

How to read this book.

The diagram below shows the organisation of the various chapters of the book and their mutual interaction.

Each chapter of Part I contains a number of exercises which form a useful complement to the material presented in that chapter. The exercises are either direct applications or non-trivial extensions of the chapter. Part I also includes three appendices which describe the rudiments of systems theory and Kalman filtering for mathematicians who wish to read Part I. Part II is technically difficult, although it demands little knowledge of probability: basic concepts, Markov chains, basic martingale concepts; other principles are introduced as required. As for Part I, the first two chapters only require the routine knowledge of probability theory of an engineer working in signal processing or control theory, whilst the final three chapters are of increasing difficulty.

The book may be read in several different ways, for example:

• Engineer's introductory course on adaptive algorithms and their uses: Chapters 1 and 2 of Part I;

• Engineer's technical course on adaptive algorithms and their use: all of Part I, the first two sections of Chapter 4 of Part II;

• Mathematician's technical course on adaptive algorithms and their use: Part II, Chapters 1, 2, 4 and a rapid pass through Part I.

[Diagram: organisation of the chapters and their interactions.
Part I: Chapter 1, adaptive algorithms: general form; Chapter 2, convergence: the ODE method; Chapter 3, rate of convergence; Chapter 4, tracking a non-stationary system; Chapter 5, change detection, monitoring.
Part II: Chapter 1, ODE and convergence a.s.; Chapter 2, examples; Chapter 3, convergence a.s., weak assumptions; Chapter 4, Gaussian approximations.]

Application domains.

As the title indicates, the adaptive algorithms are principally applied to system identification (one of the most important areas of control theory), signal processing and pattern recognition.

As far as system identification is concerned, comparison of the numerous examples of AR and ARMA system identification with (Ljung & Soderstrom 1983) highlights the importance of this area; of course this much is already well known. On the other hand, the two adaptive control exercises will serve to show the attentive reader that the stability of adaptive control schemes is one essential problem which is not resolved by the theoretical tools presented here.
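As a hedged illustration of AR identification written in the generic form (parameter θ, gain γ, state X_n made of a regression vector and an error signal), the sketch below estimates the coefficient of a simulated first-order autoregressive model with a constant-gain gradient (LMS-type) rule. The model y_n = a y_{n-1} + e_n, the values of `a_true` and `gamma`, and all function names are assumptions chosen for illustration; they are not taken from the book's examples.

```python
import random


def simulate_ar1(a, n_steps, rng):
    """Generate y_n = a * y_{n-1} + e_n with unit-variance Gaussian noise."""
    y, out = 0.0, []
    for _ in range(n_steps):
        y = a * y + rng.gauss(0.0, 1.0)
        out.append(y)
    return out


def lms_ar1(y, gamma):
    """Constant-gain update: theta += gamma * phi * (y_n - phi * theta)."""
    theta = 0.0
    for prev, cur in zip(y, y[1:]):
        phi = prev                      # regression vector (scalar here)
        error = cur - phi * theta       # prediction error signal
        theta += gamma * phi * error    # H(theta, X) = phi * error
    return theta


rng = random.Random(1)                  # fixed seed for reproducibility
a_true = 0.5                            # hypothetical true coefficient
y = simulate_ar1(a_true, 20000, rng)
a_hat = lms_ar1(y, gamma=0.005)
```

With a constant gain the estimate does not settle exactly but fluctuates around `a_true`; a smaller `gamma` reduces the fluctuation at the cost of slower adaptation.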

The relevance of adaptive algorithms to signal processing is also well known, as the large number of examples from this area indicates. We would however highlight the exercise concerning the ALOHA protocol for satellite communications as an atypical example in telecommunications.

Applications to pattern recognition are slightly more unusual. Certainly the more obvious areas of pattern recognition, such as speech recognition, use techniques largely based on adaptive signal processing (LPC, Burg and recursive methods ...). The two exercises on adaptive quantisation are more characteristic: in fact they are a typical illustration of the difficulties and the techniques of pattern recognition; such methods, involving a learning phase, are used in speech and image processing. Without wishing to overload our already long list of examples, we note that the recursive estimators of motion in image sequences used in numerical encoding of television images are also adaptive algorithms.