Transcript
  • 7/30/2019 pattern recognition Ch01 Introduction

    1/7

    Chapter 1

    Introduction

    Pattern Recognition:

    What is this?

    Introduction to Pattern Recognition 2

    Course Introduction

    n Introduction

    n Course organization

    n Grading policy

    n Outline

    n What is Pattern Recognition?

    n Definitions from the literature

    n Related fields and applications

    n Components of a Pattern Recognition system

    n Pattern Recognition problems

    n Features and Patterns

    n The Pattern Recognition design cycle

    n Pattern Recognition approaches

    n Statistical

    n Neural

    n Structural

    Course Organization

    n Description

    n Survey of the theory and practice of statistical and neural pattern recognition.

    n The course covers fundamental problems including density estimation, dimensionality reduction, classification, clustering and validation, as well as advanced topics such as Support Vector Machines, Hidden Markov Models and ensemble learning

    n Textbook

    n R. O. Duda, P. E. Hart and D. G. Stork, Pattern Classification, 2nd Ed., Wiley, 2001

    n Grading

    n Homework

    n assignments, every 3 weeks

    n Tests

    n 1 midterm, 1 final (comprehensive)

    n Term project

    n Open-ended

    n Public presentation

    Course Outline

    n Introduction (1 week)

    n Pattern Recognition

    n Probability and Statistics

    n Linear Algebra

    n Statistical Pattern Recognition (3 weeks)

    n Bayesian Decision Theory

    n Quadratic Classifiers

    n Parameter and Density Estimation

    n Nearest Neighbors

    n Linear Discriminants

    n Validation

    n Dimensionality Reduction (2 weeks)

    n Principal Components Analysis

    n Fisher's Discriminant Analysis

    n Feature Subset Selection

    n Clustering (1 week)

    n Mixture models

    n Hierarchical Clustering

    n On-line Clustering

    n Self-Organizing Maps

    n Neural network approaches (3 weeks)

    n Multilayer Perceptrons

    n Radial Basis Functions

    n Associative Memories

    n Advanced topics (3 weeks)

    n Support Vector Machines

    n Hidden Markov Models

    n Ensemble Learning

    What Is Pattern Recognition?

    n Definitions from the literature

    n "The assignment of a physical object or event to one of several pre-specified categories" (Duda and Hart)

    n "A problem of estimating density functions in a high-dimensional space and dividing the space into the regions of categories or classes" (Fukunaga)

    n "Given some examples of complex signals and the correct decisions for them, make decisions automatically for a stream of future examples" (Ripley)

    n "The science that concerns the description or classification (recognition) of measurements" (Schalkoff)

    n "The process of giving names ω to observations x" (Schürmann)

    n "Pattern recognition is concerned with answering the question 'What is this?'" (Morse)

    Examples of Pattern Recognition Problems

    n Machine vision

    n Visual inspection

    n Imaging device detects ground target

    n Classification into friend or foe

    n Character recognition

    n Automated mail sorting, processing bank checks

    n Scanner captures an image of the text

    n Image is converted into constituent characters

    Examples of Pattern Recognition Problems

    n Computer aided diagnosis

    n Medical imaging, EEG (electroencephalogram), ECG (electrocardiogram) signal analysis

    n Designed to assist (not replace) physicians

    n Example: X-ray mammography

    n 10-30% false negatives in X-ray mammograms

    n 2/3 of these could be prevented with proper analysis

    n Speech recognition

    n Human Computer Interaction, Universal Access

    n Microphone records acoustic signal

    n Speech signal is classified into phonemes and/or words

    Related Fields and Application Areas for PR

    n Related fields

    n Adaptive Signal Processing

    n Machine Learning

    n Artificial Neural Networks

    n Robotics and Vision

    n Cognitive Sciences

    n Mathematical Statistics

    n Nonlinear Optimization

    n Exploratory Data Analysis

    n Fuzzy and Genetic systems

    n Detection and Estimation Theory

    n Formal Languages

    n Structural Modeling

    n Biological Cybernetics

    n Computational Neuroscience

    n Applications

    n Image Preprocessing/Segmentation

    n Computer Vision

    n Speech Recognition

    n Automated Target Recognition

    n Optical Character Recognition

    n Seismic Analysis

    n Man and Machine Diagnostics

    n Fingerprint Identification

    n Industrial Inspection

    n Financial Forecast

    n Medical Diagnosis

    n ECG Signal Analysis

    Components of a Pattern Recognition System

    n A basic pattern classification system contains

    n A sensor

    n A preprocessing mechanism

    n A feature extraction mechanism (manual or automated)

    n A classification algorithm

    n A set of examples (training set) already classified or described

    General Pattern Recognition System

    n The objects to be classified are first sensed by a transducer (e.g., a camera), whose signals are preprocessed; then the features are extracted and finally the classification is emitted

    [Figure: block diagram of a general pattern recognition system. Blocks: Sensor, Feature Extraction, Modeling, Classification. Labels: scene with model; scene with object to be recognized; measurement vector; feature vector; a priori information; training; classification; decision and spatial information.]

    Types of Prediction Problems

    n Classification

    n The PR problem of assigning an object to a class

    n The output of the PR system is an integer label

    n e.g. classifying a product as good or bad in a quality control test

    n Regression

    n A generalization of a classification task

    n The output of the PR system is a real-valued number

    n e.g. predicting the share value of a firm based on past performance and stock market indicators

    n Clustering

    n The problem of organizing objects into meaningful groups

    n The system returns a (sometimes hierarchical) grouping of objects

    n e.g. organizing life forms into a taxonomy of species

    n Description

    n The problem of representing an object in terms of a series of primitives

    n The PR system produces a structural or linguistic description

    n e.g. labeling an ECG signal in terms of P, QRS and T complexes

    Features and Patterns

    n Feature

    n A feature is any distinctive aspect, quality or characteristic

    n Features may be symbolic (e.g., color) or numeric (e.g., height)

    n Definitions

    n The combination of d features is represented as a d-dimensional column vector called a feature vector

    n The d-dimensional space defined by the feature vector is called the feature space

    n Objects are represented as points in feature space. This representation is called a scatter plot

    Features and Patterns

    n Pattern

    n Objects are represented by a set of measurements

    n A measurement may be an image, a drawing, a waveform, a temporal/spatial history of events, a set of measurements, the state of a system, the arrangement of a set of objects, and so forth.

    n A pattern is a basic set of measurements or observations, perhaps represented in vector or matrix notation.

    n A pattern (e.g., a waveform or a character) is represented by a vector in an n-dimensional space (multi-dimensional space, hyperspace)

    n A pattern can be represented as a point in hyperspace.

    n A pattern is a composite of traits or features characteristic of an individual

    n Feature vectors typically have a smaller dimension than the pattern vectors they are derived from

    n In classification tasks, a pattern is a pair of variables {x, ω} where

    n x is a collection of observations or features (feature vector)

    n ω is the concept behind the observation (label)

    $\mathbf{x} = [x_1, x_2, \ldots, x_n]^T$, where $x_i$ is the $i$-th feature

    Features and Patterns

    n What makes a good feature vector?

    n The quality of a feature vector is related to its ability to discriminate examples from different classes

    n Examples from the same class should have similar feature values

    n Examples from different classes should have different feature values

    n More feature properties

    Classifiers

    n The task of a classifier is to partition feature space into class-labeled decision regions

    n Borders between decision regions are called decision boundaries

    n The classification of a feature vector x consists of determining which decision region it belongs to, and assigning x to this class

    n A classifier can be represented as a set of discriminant functions

    n The classifier assigns a feature vector x to class i if $g_i(\mathbf{x}) > g_j(\mathbf{x})$ for all $j \neq i$
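    The discriminant-function rule above can be sketched in a few lines of Python. This is only an illustration: the linear form of each g_i, the helper names and the weights are assumptions made for the example, not part of the slides.

```python
from typing import Callable, Sequence

Vector = Sequence[float]
Discriminant = Callable[[Vector], float]


def make_linear_discriminant(weights: Vector, bias: float) -> Discriminant:
    """Return g(x) = w . x + b for illustrative (made-up) weights and bias."""
    def g(x: Vector) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return g


def classify(x: Vector, discriminants: Sequence[Discriminant]) -> int:
    """Return the index i whose discriminant g_i(x) is largest.

    Ties are broken in favor of the lowest index.
    """
    scores = [g(x) for g in discriminants]
    return max(range(len(scores)), key=scores.__getitem__)


# Three hypothetical classes in a 2-D feature space.
discriminants = [
    make_linear_discriminant([1.0, 0.0], 0.0),    # g_0 grows with x_1
    make_linear_discriminant([0.0, 1.0], 0.0),    # g_1 grows with x_2
    make_linear_discriminant([-1.0, -1.0], 1.0),  # g_2 is large when x_1 + x_2 is small
]

for point in ([2.0, 0.5], [0.5, 2.0], [0.1, 0.1]):
    print(point, "-> class", classify(point, discriminants))
```

    The decision boundaries are the sets of points where the two largest discriminants are equal, so changing the weights moves the boundaries without changing the decision rule.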

    Pattern Recognition Approaches

    n Statistical (StatPR)

    n Patterns are classified based on an underlying statistical model of the features

    n The statistical model is defined by a family of class-conditional probability density functions Pr(x|c_i) (probability of feature vector x given class c_i)

    n Neural (NeurPR)

    n Classification is based on the response of a network of processing units (neurons) to an input stimulus (pattern)

    n Knowledge is stored in the connectivity and strength of the synaptic weights

    n NeurPR is a trainable, non-algorithmic, black-box strategy

    n NeurPR is very attractive since

    n it requires minimum a priori knowledge

    n with enough layers and neurons, an ANN can create any complex decision region

    n Syntactic (SyntPR)

    n Patterns are classified based on measures of structural similarity

    n Knowledge is represented by means of formal grammars or relational descriptions (graphs)

    n SyntPR is used not only for classification, but also for description

    n Typically, SyntPR approaches formulate hierarchical descriptions of complex patterns built up from simpler sub patterns
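    As a rough illustration of the statistical approach listed above, the sketch below models each class-conditional density as a one-dimensional Gaussian and picks the class with the largest density times prior. The Gaussian form, the use of class frequencies as priors, the class names and the sample values are all assumptions made for this example.

```python
import math
from statistics import mean, pstdev
from typing import Dict, List, Tuple

# Per-class model: (mean, standard deviation, prior probability)
Model = Dict[str, Tuple[float, float, float]]


def gaussian_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a 1-D Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit(samples_by_class: Dict[str, List[float]]) -> Model:
    """Estimate a Gaussian and a prior for each class from labeled samples.

    Assumes every class has at least two distinct values (sigma > 0).
    """
    total = sum(len(values) for values in samples_by_class.values())
    return {
        label: (mean(values), pstdev(values), len(values) / total)
        for label, values in samples_by_class.items()
    }


def classify(x: float, model: Model) -> str:
    """Pick the class c_i that maximizes Pr(x|c_i) * Pr(c_i)."""
    def score(label: str) -> float:
        mu, sigma, prior = model[label]
        return gaussian_pdf(x, mu, sigma) * prior
    return max(model, key=score)


# Made-up one-dimensional training data for two hypothetical classes.
training = {
    "c1": [1.0, 1.2, 0.8, 1.1, 0.9],
    "c2": [3.0, 3.3, 2.7, 3.1, 2.9],
}
model = fit(training)
print(classify(1.05, model), classify(2.8, model))
```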

    Example: Neural, Statistical and Structural OCR

    A Simple Pattern Recognition Problem

    n Consider the problem of recognizing the letters L,P,O,E,Q

    n Determine a sufficient set of features

    n Design a tree-structured classifier

    The Pattern Recognition Design Cycle

    n Data collection

    n Probably the most time-intensive component of a PR project

    n How many examples are enough?

    n Feature choice

    n Critical to the success of the PR problem

    n Garbage in, garbage out

    n Requires basic prior knowledge

    n Model choice

    n Statistical, neural and structural approaches

    n Parameter settings

    n Training

    n Given a feature set and a blank model, adapt the model to explain the data

    n Supervised, unsupervised and reinforcement learning

    n Evaluation

    n How well does the trained model do?

    n Overfitting vs. generalization

    The Pattern Recognition Design Cycle

    n Consider the following scenario

    n A fish processing plant wants to automate the process of sorting incoming fish according to species (salmon or sea bass)

    n The automation system consists of

    n A conveyor belt for incoming products

    n Two conveyor belts for sorted products

    n A pick-and-place robotic arm

    n A vision system with an overhead CCD camera

    n A computer to analyze images and control the robot arm

    The Pattern Recognition Design Cycle

    n Sensor

    n The vision system captures an image as a new fish enters the sorting area

    n Preprocessing

    n Image processing algorithms

    n adjustments for average intensity levels

    n segmentation to separate fish from background

    n Feature Extraction

    n Suppose we know that, on the average, sea bass is larger than salmon

    n From the segmented image we estimate the length of the fish

    n Classification

    n Collect a set of examples from both species

    n Compute the distribution of lengths for both classes

    n Determine a decision boundary (threshold) that minimizes the classification error

    n We estimate the classifier's probability of error and obtain a discouraging result of 40%

    n What do we do now?
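    The threshold search described in the classification step can be sketched as follows. The lengths are invented for illustration and are not meant to reproduce the 40% error quoted above; the function names and the rule "longer than the threshold means sea bass" are likewise assumptions.

```python
from typing import List, Tuple


def error_rate(threshold: float, salmon: List[float], sea_bass: List[float]) -> float:
    """Fraction misclassified when 'length > threshold' is labeled sea bass."""
    errors = sum(1 for x in salmon if x > threshold)       # salmon called sea bass
    errors += sum(1 for x in sea_bass if x <= threshold)   # sea bass called salmon
    return errors / (len(salmon) + len(sea_bass))


def best_threshold(salmon: List[float], sea_bass: List[float]) -> Tuple[float, float]:
    """Try the midpoints between consecutive observed lengths; keep the best one."""
    values = sorted(set(salmon) | set(sea_bass))
    candidates = [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    best = min(candidates, key=lambda t: error_rate(t, salmon, sea_bass))
    return best, error_rate(best, salmon, sea_bass)


# Hypothetical fish lengths in cm; the two classes overlap on purpose.
salmon = [40.0, 45.0, 50.0, 55.0, 60.0, 65.0]
sea_bass = [48.0, 58.0, 62.0, 68.0, 72.0, 78.0]

threshold, error = best_threshold(salmon, sea_bass)
print(f"threshold = {threshold} cm, training error = {error:.2f}")
```

    Because the two length distributions overlap, no threshold separates them perfectly, which is the situation that motivates looking for better features.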

    The Pattern Recognition Design Cycle

    n Improving the performance of our PR system

    n Determined to achieve a recognition rate of 95%, we try a number of features

    n Width, Area, Position of the eyes, mouth...

    n only to find out that these features contain no discriminatory information

    n Finally we find a good feature: average intensity of the scales

    n We combine length and average intensity of the scales to improve class separability

    n We compute a linear discriminant function to separate the two classes, and obtain a classification rate of 95.7%
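    The slides do not say how the linear discriminant is computed. As one possible sketch, the code below uses Fisher's linear discriminant on two features (length, average scale intensity), with the boundary placed halfway between the class means. The sample values are made up and will not reproduce the 95.7% figure.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (length, average scale intensity)


def mean2(points: Sequence[Point]) -> Point:
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter2(points: Sequence[Point], m: Point) -> Tuple[float, float, float]:
    """Entries (sxx, sxy, syy) of the symmetric 2x2 scatter matrix around m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fisher_discriminant(class0: Sequence[Point], class1: Sequence[Point]):
    """Return (w, b) such that g(x) = w . x + b > 0 predicts class1."""
    m0, m1 = mean2(class0), mean2(class1)
    a0, b0, c0 = scatter2(class0, m0)
    a1, b1, c1 = scatter2(class1, m1)
    sxx, sxy, syy = a0 + a1, b0 + b1, c0 + c1   # within-class scatter Sw
    det = sxx * syy - sxy * sxy                 # assumed non-zero
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # w = Sw^{-1} (m1 - m0), using the closed-form inverse of a 2x2 matrix
    w = ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)
    # Put the boundary through the midpoint between the two class means
    mid = ((m0[0] + m1[0]) / 2.0, (m0[1] + m1[1]) / 2.0)
    b = -(w[0] * mid[0] + w[1] * mid[1])
    return w, b


def g(x: Point, w: Point, b: float) -> float:
    return w[0] * x[0] + w[1] * x[1] + b


# Hypothetical training samples.
salmon = [(50.0, 2.0), (55.0, 3.0), (60.0, 2.5), (65.0, 3.5)]
sea_bass = [(55.0, 6.0), (60.0, 7.0), (70.0, 6.5), (75.0, 7.5)]

w, b = fisher_discriminant(salmon, sea_bass)
correct = sum(g(x, w, b) < 0 for x in salmon) + sum(g(x, w, b) > 0 for x in sea_bass)
print(f"w = {w}, b = {b:.3f}, training accuracy = {correct / 8:.2f}")
```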

    The Pattern Recognition Design Cycle

    n Cost Versus Classification rate

    n Our linear classifier was designed to minimize the overall misclassification rate

    n Is this the best objective function for our fish processing plant?

    n The cost of misclassifying salmon as sea bass is that the end customer will occasionally find a tasty piece of salmon when he purchases sea bass

    n The cost of misclassifying sea bass as salmon is an end customer upset when he finds a piece of sea bass purchased at the price of salmon

    n Intuitively, we could adjust the decision boundary to minimize this cost function
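    To make the idea of adjusting the boundary concrete, here is a small sketch that picks a threshold on a single feature by minimizing average cost instead of error count. The lengths and the 1:5 cost ratio are invented for illustration; the slides give no numeric costs.

```python
from typing import List, Tuple


def average_cost(threshold: float, salmon: List[float], sea_bass: List[float],
                 cost_salmon_as_bass: float, cost_bass_as_salmon: float) -> float:
    """Average cost per fish when 'length > threshold' is labeled sea bass."""
    salmon_errors = sum(1 for x in salmon if x > threshold)    # salmon sold as sea bass
    bass_errors = sum(1 for x in sea_bass if x <= threshold)   # sea bass sold as salmon
    total = len(salmon) + len(sea_bass)
    return (cost_salmon_as_bass * salmon_errors
            + cost_bass_as_salmon * bass_errors) / total


def best_threshold(salmon: List[float], sea_bass: List[float],
                   cost_salmon_as_bass: float,
                   cost_bass_as_salmon: float) -> Tuple[float, float]:
    """Search the midpoints between consecutive observed lengths."""
    values = sorted(set(salmon) | set(sea_bass))
    candidates = [(a + b) / 2.0 for a, b in zip(values, values[1:])]

    def cost(t: float) -> float:
        return average_cost(t, salmon, sea_bass,
                            cost_salmon_as_bass, cost_bass_as_salmon)

    best = min(candidates, key=cost)
    return best, cost(best)


# Hypothetical fish lengths in cm.
salmon = [40.0, 45.0, 50.0, 55.0, 60.0, 65.0]
sea_bass = [48.0, 58.0, 62.0, 68.0, 72.0, 78.0]

equal = best_threshold(salmon, sea_bass, 1.0, 1.0)    # plain error minimization
skewed = best_threshold(salmon, sea_bass, 1.0, 5.0)   # upset customers cost 5x
print("equal costs:", equal)
print("skewed costs:", skewed)
```

    With the skewed costs the threshold moves toward shorter lengths, so fewer sea bass end up sold as salmon at the price of more salmon sold as sea bass.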

    The Pattern Recognition Design Cycle

    n The issue of generalization

    n The recognition rate of our linear classifier (95.7%) met the design specs, but we still think we can improve the performance of the system

    n We then design an artificial neural network with five hidden layers, a combination of logistic and hyperbolic tangent activation functions, train it with the Levenberg-Marquardt algorithm and obtain an impressive classification rate of 99.9975% with the following decision boundary

    n Satisfied with our classifier, we integrate the system and deploy it to the fish processing plant

    n After a few days, the plant manager calls to complain that the system is misclassifying an average of 25% of the fish

    n What went wrong?
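    One plausible reading of this scenario is overfitting: a very flexible model can fit the training examples almost perfectly and still generalize poorly. The sketch below shows the gap between training and held-out accuracy, using a 1-nearest-neighbor classifier as a stand-in for an overly flexible model (it is not the neural network from the slide) and synthetic overlapping Gaussian classes rather than fish data.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]
Sample = Tuple[Point, int]


def make_samples(n_per_class: int, rng: random.Random) -> List[Sample]:
    """Draw two overlapping 2-D Gaussian classes (synthetic data)."""
    samples: List[Sample] = []
    for label, center in ((0, 0.0), (1, 1.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            samples.append((point, label))
    return samples


def predict_1nn(x: Point, training: List[Sample]) -> int:
    """Return the label of the closest training point (squared Euclidean distance)."""
    nearest = min(
        training,
        key=lambda s: (s[0][0] - x[0]) ** 2 + (s[0][1] - x[1]) ** 2,
    )
    return nearest[1]


def accuracy(data: List[Sample], training: List[Sample]) -> float:
    hits = sum(1 for x, y in data if predict_1nn(x, training) == y)
    return hits / len(data)


rng = random.Random(0)
train = make_samples(50, rng)    # examples used to build the classifier
test = make_samples(200, rng)    # fresh examples, standing in for deployment

train_acc = accuracy(train, train)   # each point is its own nearest neighbor
test_acc = accuracy(test, train)
print(f"training accuracy = {train_acc:.3f}, held-out accuracy = {test_acc:.3f}")
```

    Evaluating on examples that were not used for training is what exposes the gap; the training accuracy alone says little about performance after deployment.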

    Pattern Recognition References

    n The following books cover statistical pattern recognition and related topics in depth. Information available over the Web is currently rather limited, although one can find a lot of related work on neural networks, which provide an attractive way to implement pattern classifiers.

    1. P. A. Devijver and J. Kittler, Pattern Recognition: A Statistical Approach (Prentice-Hall International, Englewood Cliffs, NJ, 1980). A very clear presentation of the mathematical foundations.

    2. R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis (Wiley-Interscience, New York, 1973). Still in print after all these years.

    3. K. Fukunaga, Introduction to Statistical Pattern Recognition, 2nd Ed. (Academic Press, New York, 1990). A standard, widely used textbook.

    4. D. J. Hand, Discrimination and Classification (John Wiley and Sons, Chichester, UK, 1981). A warmly recommended introductory book.

    5. S. Haykin, Neural Networks (MacMillan, NY, 1993). There are dozens of interesting books on neural networks. Haykin is an excellent, engineering-oriented textbook.

    6. T. Masters, Advanced Algorithms for Neural Networks (Wiley, NY, 1995). Well described by the title, with a chapter devoted to the often overlooked issue of validation.

    7. Y-H Pao, Adaptive Pattern Recognition and Neural Networks (Addison-Wesley Publishing Co., Reading, MA, 1989). Augments statistical procedures by including neural networks, fuzzy-set representations and self-organizing feature maps.

    8. R. Schalkoff, Pattern Recognition: Statistical, Structural and Neural Approaches (John Wiley & Sons, New York, 1992). A clear presentation of the essential ideas in three important approaches to pattern recognition.

    9. M. Stefik, Introduction to Knowledge Systems (Morgan Kaufmann, San Francisco, CA, 1995). Although this excellent book is not oriented towards pattern recognition per se, the methods it presents are the basis for knowledge-based pattern recognition.

