Towards a Mapping of Modern AIS and Learning Classifier Systems

Page 1: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Towards a Mapping of Modern AIS and Learning Classifier Systems

Larry Bull

Department of Computer Science & Creative Technologies

University of the West of England, U.K.

Page 2: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Background

For 25 years correlations between aspects of Artificial Immune Systems (AIS) and Learning Classifier Systems (LCS) have been highlighted.

Neither field appears to have benefitted.

More recently, an LCS has been presented for unsupervised learning which, with hindsight, may be viewed as a form of AIS.

The purpose is to bring this LCS to the attention of the AIS community, with the aim of serving as a catalyst for sharing ideas and mechanisms.

Page 3: Towards  a Mapping of Modern AIS and Learning Classifier Systems

LCS in a Nutshell

Invented by John Holland circa 1976.

Consist of an “ecology” of rules of the form: IF <state> AND <action> THEN <reward>.

Traditionally use reinforcement learning techniques to approximate rule utility.

Use evolutionary computing techniques to discover new rules.

Often incorporate other heuristics.

Page 4: Towards  a Mapping of Modern AIS and Learning Classifier Systems

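[Figure: schematic of an LCS. Labels: Environment, state, action, reward; rule population [P] (example rule 10#0:11); match set [M]; prediction (0,10,2,9); action selection; action set [A]; previous action set [A]-1; EA; Q-learning.]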
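The example rule 10#0:11 in the schematic can be read as a condition and an action separated by a colon. The sketch below assumes the usual ternary encoding, in which `#` is a "don't care" symbol that matches either bit; the function names `matches` and `parse_rule` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def parse_rule(rule: str) -> Tuple[str, str]:
    """Split a 'condition:action' string into its two parts."""
    condition, action = rule.split(":")
    return condition, action


def matches(condition: str, state: str) -> bool:
    """True if every condition symbol is '#' or equals the state bit."""
    if len(condition) != len(state):
        return False
    return all(c == "#" or c == s for c, s in zip(condition, state))


def match_set(rules: List[str], state: str) -> List[str]:
    """Collect the rules whose condition matches the current state, i.e. [M]."""
    return [r for r in rules if matches(parse_rule(r)[0], state)]


population = ["10#0:11", "0###:00", "1###:01"]
current_match_set = match_set(population, "1010")
```

Here `current_match_set` holds the first and third rules, since both of their conditions cover the state 1010.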

Page 5: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Learning Classifier Systems Family Tree, 1978-2008

[Figure: family tree of LCS, grouped under the headings Reinforcement, Regression (& Reinforcement), Supervised, Unsupervised and Models. Systems shown:]

CS-1 (Holland & Reitman '78)
LCS (Holland '80)
Gofer (Booker '82)
Animat (Wilson '85)
Boole (Wilson '87)
NewBoole (Bonelli et al. '90)
CFCS2 (Riolo '90)
ZCS (Wilson '94)
XCS (Wilson '95)
ACS (Stolzmann '98)
XCSF (Wilson '00)
ACS2 (Butz et al. '02)
UCS (Bernado-Mansilla & Garrell '03)
XCSC (Tammee et al. '08)

Page 6: Towards  a Mapping of Modern AIS and Learning Classifier Systems

From LCS to AIS

A novel variant of XCS for data clustering was recently presented.

The approach exploits the mechanisms inherent to XCS, but for unsupervised learning.

The aim is to develop an approach to learning rules which accurately describe clusters, without prior assumptions as to their number within a given dataset.

With hindsight, the approach is a form of clonal selection AIS.

Page 7: Towards  a Mapping of Modern AIS and Learning Classifier Systems

YCSC Schematic

[Figure: schematic of YCSC. Labels: Data, data, rule population [P], match set [M], EA, error updates, cluster descriptor.]

Page 8: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Rule Representation: Bounded Affinity

A condition consists of intervals: $\{\{c_1, s_1\}, \ldots, \{c_d, s_d\}\}$

c is the interval's range centre, from [0.0, 1.0].

s is the “spread” from that centre (truncated).

d is the number of dimensions.

Each interval predicate's lower and upper bounds are calculated as $[c_i - s_i, c_i + s_i]$.
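The centre-spread representation above can be sketched as a matching test. This is a minimal illustration, not the author's implementation: it assumes that "truncated" means the bounds are clipped to [0.0, 1.0], and the names `bounds` and `matches` are hypothetical.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]  # (centre c, spread s)


def bounds(c: float, s: float) -> Tuple[float, float]:
    """Lower and upper bound [c - s, c + s], clipped to [0.0, 1.0] (assumed)."""
    return max(0.0, c - s), min(1.0, c + s)


def matches(condition: Sequence[Interval], x: Sequence[float]) -> bool:
    """A rule matches when every input value lies inside its interval."""
    for (c, s), xi in zip(condition, x):
        lower, upper = bounds(c, s)
        if not lower <= xi <= upper:
            return False
    return True


# A two-dimensional condition: intervals [0.4, 0.6] and [0.0, 0.5].
condition = [(0.5, 0.1), (0.2, 0.3)]
inside = matches(condition, [0.55, 0.0])
outside = matches(condition, [0.65, 0.1])
```

The second interval shows the clipping assumption at work: its nominal lower bound of -0.1 becomes 0.0.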

Page 9: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Fitness

Each rule maintains a running estimate of its matching error and niche size.

Error is derived from the Euclidean distance between the input x and the centre c in the condition of each member of [M].

[Equation: error update, not recovered from the original slide.]
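The slide's error equation did not survive, so the following is only a sketch under stated assumptions. It assumes the running estimate is a Widrow-Hoff style update towards the current Euclidean distance, and that the learning rate is the value 0.2 listed among the experimental parameters. The names `euclidean`, `update_error` and `beta` are hypothetical.

```python
import math
from typing import Sequence


def euclidean(x: Sequence[float], centre: Sequence[float]) -> float:
    """Euclidean distance between an input and a rule's interval centres."""
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, centre)))


def update_error(error: float, x: Sequence[float],
                 centre: Sequence[float], beta: float = 0.2) -> float:
    """Move the running error estimate a fraction beta towards the distance.

    Assumed form: error <- error + beta * (distance - error).
    """
    return error + beta * (euclidean(x, centre) - error)


# One update for a rule centred on the origin, seeing input (0.3, 0.4).
new_error = update_error(0.0, [0.3, 0.4], [0.0, 0.0])
```

With a distance of 0.5 and a prior estimate of 0, one update moves the estimate a fifth of the way, to about 0.1.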

Page 10: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Niches

Niche size estimates ($\sigma$) are based on match sets, i.e., the number of concurrently active rules:

$\sigma_j \leftarrow \sigma_j + \beta(|[M]| - \sigma_j)$

A time-triggered Genetic Algorithm is run in the match sets.
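The niche size update can be sketched as a running average of the match set size. The symbol names `sigma` (niche size estimate) and `beta` (learning rate) are assumptions, as is the default of 0.2 taken from the experimental parameters.

```python
def update_niche_size(sigma: float, match_set_size: int,
                      beta: float = 0.2) -> float:
    """Move the estimate a fraction beta towards the current size of [M]."""
    return sigma + beta * (match_set_size - sigma)


# A rule that repeatedly appears in match sets of 10 rules: the estimate
# approaches 10 geometrically, whatever its starting value.
sigma = 1.0
for _ in range(50):
    sigma = update_niche_size(sigma, 10)
```

Because each step closes a fixed fraction of the remaining gap, the estimate tracks a slowly changing niche size while smoothing out single-step fluctuations.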

Page 11: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Selection

All rules maintain a time-stamp of the cycle when they were last in an [M] where the GA was used.

If $\theta_{GA}$ cycles or more have passed on average for all rules in the current [M], the GA is triggered.

The GA uses roulette-wheel selection with a scalable fitness function of the rule's error $\varepsilon$:

$\mathrm{Fitness} = \dfrac{1}{\varepsilon^{v} + 1}$

Time-stamps are then reset for all members of [M].
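The trigger and selection steps above can be sketched as follows. The sketch reads the garbled fitness line as 1 / (error^v + 1), which is an assumption. The dictionary keys `ts` and `error`, the function names, and the defaults `theta_ga=12` and `v=5` (taken from the experimental parameters) are hypothetical choices for illustration.

```python
import random
from typing import Dict, List

Rule = Dict[str, float]


def ga_due(match_set: List[Rule], now: int, theta_ga: int = 12) -> bool:
    """Trigger the GA when the average time since last GA use reaches theta_ga."""
    average_age = sum(now - r["ts"] for r in match_set) / len(match_set)
    return average_age >= theta_ga


def fitness(error: float, v: int = 5) -> float:
    """Assumed fitness: 1 / (error^v + 1), so zero error gives fitness 1."""
    return 1.0 / (error ** v + 1.0)


def roulette(match_set: List[Rule], rng=random) -> Rule:
    """Pick one parent with probability proportional to its fitness."""
    weights = [fitness(r["error"]) for r in match_set]
    return rng.choices(match_set, weights=weights, k=1)[0]


def reset_timestamps(match_set: List[Rule], now: int) -> None:
    """After the GA has run, stamp every member of [M] with the current cycle."""
    for r in match_set:
        r["ts"] = now
```

The exponent v controls how sharply fitness falls as error grows, which is what makes the function scalable.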

Page 12: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Search

Offspring are produced via mutation (probability $\mu$), where an allele is mutated by adding an amount + or - rand($m_0$).

Crossover (probability $\chi$, two-point) can occur between any two alleles, i.e., within an interval predicate as well as between predicates.

If no rules match on a given time step, a covering operator is used: it creates a rule with its condition centred on the input value and a spread in the range rand($s_0$), which then replaces an existing member of the rulebase.
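The three search operators can be sketched on a condition stored as a list of (centre, spread) pairs. This is an illustration under assumptions: rand(m0) and rand(s0) are taken to be uniform draws from [0, m0] and [0, s0], and the defaults 0.04, 0.006 and 0.03 are read from the experimental parameters. The function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Condition = List[Tuple[float, float]]  # one (centre, spread) pair per dimension


def mutate(condition: Condition, mu: float = 0.04, m0: float = 0.006,
           rng=random) -> Condition:
    """With probability mu per allele, add + or - a uniform amount in [0, m0]."""
    child = []
    for pair in condition:
        alleles = []
        for allele in pair:
            if rng.random() < mu:
                allele += rng.choice((-1.0, 1.0)) * rng.uniform(0.0, m0)
            alleles.append(allele)
        child.append((alleles[0], alleles[1]))
    return child


def two_point_crossover(a: Condition, b: Condition,
                        rng=random) -> Tuple[Condition, Condition]:
    """Swap the alleles between two cut points, which may fall inside a pair."""
    flat_a = [v for pair in a for v in pair]
    flat_b = [v for pair in b for v in pair]
    i, j = sorted(rng.sample(range(len(flat_a) + 1), 2))
    flat_a[i:j], flat_b[i:j] = flat_b[i:j], flat_a[i:j]

    def regroup(flat: List[float]) -> Condition:
        return [(flat[k], flat[k + 1]) for k in range(0, len(flat), 2)]

    return regroup(flat_a), regroup(flat_b)


def cover(x: Sequence[float], s0: float = 0.03, rng=random) -> Condition:
    """Create a condition centred on the input with a random spread in [0, s0]."""
    return [(xi, rng.uniform(0.0, s0)) for xi in x]
```

Flattening the pairs before choosing cut points is what allows a cut to land between a centre and its own spread, matching the statement that crossover can occur within an interval predicate.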

Page 13: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Replacement

Rule replacement is population wide and proportional to niche occupancy.

Each rule maintains an estimate of the size of [M] in which it occurs.

Roulette-wheel selection is used.

This encourages all niches to contain the same number of rules; rule resource is balanced.
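Replacement as described above can be sketched as a roulette wheel over the whole population, weighted by each rule's niche size estimate. The key name `niche` and the function name `select_for_deletion` are hypothetical.

```python
import random
from typing import Dict, List


def select_for_deletion(population: List[Dict[str, float]], rng=random) -> int:
    """Return the index of a rule chosen with probability proportional
    to its niche size estimate, so crowded niches lose rules more often."""
    weights = [rule["niche"] for rule in population]
    return rng.choices(range(len(population)), weights=weights, k=1)[0]


# Rules from a crowded niche (estimate 9) are removed far more often
# than rules from a sparse one (estimate 1).
population = [{"niche": 1.0}, {"niche": 9.0}]
rng = random.Random(0)
draws = [select_for_deletion(population, rng) for _ in range(10000)]
share_crowded = draws.count(1) / len(draws)
```

Over many deletions this pressure pushes niches towards equal occupancy, which is the balancing effect the slide describes.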

Page 14: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Learning Process

[Figure: fitness (0 to 1) plotted against generalization (up to max. gen.), with curves labelled niche and 1/error.]

Page 15: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Experiments

Clustering is an important unsupervised classification technique in which a set of data is grouped into clusters.

This is done in such a way that data in the same cluster are similar in some sense and data in different clusters are dissimilar in the same sense.

Page 16: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Some Data

Randomly generated synthetic datasets were used.

The first dataset is well-separated and has k = 25 true clusters arranged in a 5x5 grid in d = 2 dimensions.

Each cluster consists of 400 data points generated from a Gaussian distribution with a standard deviation of 0.02, for a total of n = 10,000 data points.

The second dataset is not well-separated; it was generated in the same way as the first, except that the clusters are not centred on their given cells in the grid.
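Datasets of this kind can be generated along the following lines. Several details are assumptions not stated on the slide: the data are taken to lie in the unit square, each cluster is centred in its grid cell, and the second dataset shifts each centre by a uniform random offset of up to `jitter`. The function name `make_grid_dataset` and all parameter names are hypothetical.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def make_grid_dataset(side: int = 5, per_cluster: int = 400, sd: float = 0.02,
                      jitter: float = 0.0,
                      seed: int = 0) -> Tuple[List[Point], List[Point]]:
    """Return (points, centres) for side*side Gaussian clusters on a grid.

    jitter = 0 centres each cluster in its cell (well-separated case);
    jitter > 0 moves each centre off its cell centre (assumed mechanism).
    """
    rng = random.Random(seed)
    cell = 1.0 / side
    points: List[Point] = []
    centres: List[Point] = []
    for row in range(side):
        for col in range(side):
            cx = (col + 0.5) * cell + rng.uniform(-jitter, jitter)
            cy = (row + 0.5) * cell + rng.uniform(-jitter, jitter)
            centres.append((cx, cy))
            for _ in range(per_cluster):
                points.append((rng.gauss(cx, sd), rng.gauss(cy, sd)))
    return points, centres


well_separated, true_centres = make_grid_dataset()
less_separated, shifted_centres = make_grid_dataset(jitter=0.08, seed=1)
```

With the defaults this gives 25 clusters of 400 points each, i.e. 10,000 points; the jitter value of 0.08 is purely illustrative.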

Page 17: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Examples

Page 18: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Experimental Detail

The parameters used were: $N=800$, $\beta=0.2$, $v=5$, $\chi=0.8$, $\mu=0.04$, $\theta_{GA}=12$, $s_0=0.03$, $m_0=0.006$.

All results presented are the average of ten runs.

Learning trials consisted of 200,000 presentations of a randomly sampled data point.

Page 19: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Example Initial Results

Page 20: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Compaction

Many overlapping rules are seen around each true cluster.

A four-step rule compaction algorithm was developed to remove overlaps:

◦ Delete useless rules (very low coverage)
◦ Sort on numerosity
◦ Sort on error
◦ Extract largest [M] rules

Page 21: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Example Result after Compaction

Page 22: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Comparative Performance

As a measure of the quality of each clustering solution we use the total of the k-means objective function.

On the well-separated dataset, the average quality of the LCS was 8.12 +/- 0.54 and the number of clusters 25.0 +/- 0.

The average quality on the not well-separated dataset was 24.50 +/- 0.56 and the number of clusters 14.0 +/- 0.

The k-means algorithm (k=25) averaged over 10 runs gives a quality of 32.42 +/- 9.49 and 21.07 +/- 5.25 on the well-separated and less-separated datasets respectively.
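The quality measure can be sketched as below, assuming it is the standard k-means objective: the sum, over all data points, of the squared Euclidean distance to the nearest cluster centre. The slide does not give the formula, so this form and the name `kmeans_objective` are assumptions. As a rough check, n Gaussian points in d dimensions with standard deviation sd have an expected total of n x d x sd^2 around their true centres, i.e. 10,000 x 2 x 0.02^2 = 8.0 for the well-separated dataset, which is in line with the reported 8.12.

```python
from typing import Sequence


def kmeans_objective(points: Sequence[Sequence[float]],
                     centres: Sequence[Sequence[float]]) -> float:
    """Sum of squared distances from each point to its nearest centre."""
    total = 0.0
    for p in points:
        total += min(
            sum((pi - ci) ** 2 for pi, ci in zip(p, c))
            for c in centres
        )
    return total


# Three points and two centres: the first two points are each 0.5 from
# the first centre, the third point sits exactly on the second centre.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
centres = [(0.5, 0.0), (10.0, 0.0)]
quality = kmeans_objective(points, centres)
```

Lower values indicate tighter clusters, so under this reading a smaller quality figure is the better one.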

Page 23: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Comparative Performance II

To estimate the number of clusters, we ran k-means 10 times for each k from 2 to 30, with different random initializations.

To select the best clustering among the different numbers of clusters, the Davies-Bouldin validity index was used.

On the well-separated dataset the index has its lowest negative peak at 23 clusters, and on the less-separated dataset at 14 clusters.

Thus the LCS does better on the well-separated data, where it finds 25 clusters.
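For reference, the Davies-Bouldin index can be computed as in the sketch below, using its standard definition: for each cluster, take the worst ratio of summed within-cluster scatter to between-centroid distance, then average over clusters. Lower values indicate more compact, better separated clusterings. The function name `davies_bouldin` and the list-of-clusters input format are choices made for this illustration, and it requires Python 3.8 or later for `math.dist`.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def centroid(cluster: Sequence[Point]) -> Point:
    """Mean of the points in one cluster."""
    return tuple(sum(column) / len(cluster) for column in zip(*cluster))


def davies_bouldin(clusters: List[Sequence[Point]]) -> float:
    """Davies-Bouldin index for a partition given as a list of clusters."""
    centres = [centroid(c) for c in clusters]
    # Scatter: average distance of a cluster's points to its centroid.
    scatter = [
        sum(math.dist(p, m) for p in c) / len(c)
        for c, m in zip(clusters, centres)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centres[i], centres[j])
            for j in range(k) if j != i
        )
    return total / k


far_apart = davies_bouldin([[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]])
close = davies_bouldin([[(0.0, 0.0), (0.0, 2.0)], [(4.0, 0.0), (4.0, 2.0)]])
```

To estimate the number of clusters, one would compute this index for the k-means partition at each k and keep the k with the lowest value.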

Page 24: Towards  a Mapping of Modern AIS and Learning Classifier Systems

A Network-like Extension

One of the parts of XCS missing so far is a niche fitness sharing mechanism.

Here rules adjust their fitnesses based on the fitnesses of the other co-active rules.

This is termed relative accuracy (f'):

$f' = \dfrac{f}{\sum f}$, where the sum runs over the co-active rules in [M].
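Relative accuracy can be sketched as a normalisation within the match set. This assumes the formula above is the usual XCS-style form, in which each rule's fitness is divided by the sum of the fitnesses of all co-active rules; the function name `relative_fitness` is hypothetical.

```python
from typing import List


def relative_fitness(fitnesses: List[float]) -> List[float]:
    """Divide each co-active rule's fitness by the total fitness in [M]."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]


# Three co-active rules: the strongest rule receives half of the shared total.
shared = relative_fitness([1.0, 1.0, 2.0])
```

Because the shares in a match set always sum to one, a rule's fitness falls as more rules crowd into the same niche, which is the fitness sharing effect.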

Page 25: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Gives Improved Performance

[Figure: plot with both axes running from 0 to 1; plotted data not recovered.]

Page 26: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Conclusions

Similarities (and differences) between AIS and LCS have long been noted.

Views have been taken from many different perspectives: dynamical systems, networks, complex adaptive systems, etc.

An LCS recently presented as a clustering technique is essentially a clonal selection AIS.

Can mechanisms from both fields now be consolidated to mutual benefit?

Page 27: Towards  a Mapping of Modern AIS and Learning Classifier Systems

Some Possibilities

Theory and mechanisms for generalization.

Adaptive rates of search.

Theory from ensembles/mixture-of-experts.

Representation schemes.

Memory.

N.B. A new theory of neuronal replicators implies innate and adaptive components in learning.