
Page 1: Reservoir Computing Methods - Basics and Recent Advances

Claudio Gallicchio

Page 2: Contact Information

Claudio Gallicchio, Ph.D.
Assistant Professor
Computational Intelligence and Machine Learning Group
http://www.di.unipi.it/groups/ciml/
Dipartimento di Informatica - Universita' di Pisa
Largo Bruno Pontecorvo 3, 56127 Pisa, Italy - Polo Fibonacci

Research on Reservoir Computing
Chair of the IEEE Task Force on RC
https://sites.google.com/view/reservoir-computing-tf/home

web: www.di.unipi.it/~gallicch
email: [email protected]
tel.: +39 050 2213145

Page 3: Dynamical Recurrent Models

• Neural network architectures with feedback connections are able to deal with temporal data in a natural fashion
• Computation is based on dynamical systems

[Figure: architecture with a dynamical component and a feed-forward component]

Page 4: Recurrent Neural Networks (RNNs)

• Feedbacks allow the representation of the temporal context in the state (neural memory)
• Discrete-time non-autonomous dynamical system
• Potentially, the input history can be maintained for arbitrary periods of time
• Theoretically very powerful: universal approximation through learning

Page 5: Learning with RNNs (repetita)

• Universal approximation of RNNs (e.g. SRN, NARX) through learning
• Training algorithms involve some downsides that you already know:
  - Relatively high computational training costs and potentially slow convergence
  - Local minima of the error function (which is generally non-convex)
  - Vanishing gradients and the problem of learning long-term dependencies
  - Alleviated by gated recurrent architectures (although training is made quite complex in this case)

Page 6: Dynamical Recurrent Networks trained easily

Question:
• Is it possible to train RNN architectures more efficiently?
• We can shift the focus from training algorithms to the study of initialization conditions and stability of the input-driven system
• To ensure stability of the dynamical part we must impose a contractive property on the system dynamics

Page 7: Liquid State Machines

• W. Maass, T. Natschlaeger, H. Markram (2002)
• Originated from the study of biologically inspired spiking neurons
• The liquid should satisfy a pointwise separation property
• Dynamics provided by a pool of spiking neurons with bio-inspired architectures (e.g. integrate-and-fire, Izhikevich)

W. Maass, T. Natschlaeger, and H. Markram. Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural Computation 14(11), 2531-2560 (2002)

Page 8: Fractal Prediction Machines

• P. Tino, G. Dorffner (2001)
• Contractive Iterated Function Systems
• Fractal analysis

Tino, P., Dorffner, G.: Predicting the future of discrete sequences from fractal representations of the past. Machine Learning 45 (2001) 187-218

Page 9: Echo State Networks

• H. Jaeger (2001)
• Control the spectral properties of the recurrence matrix
• Echo State Property

Jaeger, H.: The "echo state" approach to analysing and training recurrent neural networks. Technical Report GMD Report 148, German National Research Center for Information Technology (2001)

Jaeger, H., Haas, H.: Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication. Science 304 (2004) 78-80

[Figure: ESN training scheme with target signal d(t) and error d(t) - y(t)]

Page 10: Reservoir Computing

• Reservoir: untrained non-linear recurrent hidden layer
• Readout: (linear) output layer

• Initialize W_in and W_h randomly
• Scale W_h to meet the contractive/stability property
• Drive the network with the input signal
• Discard an initial transient
• Train the readout

x(t) = tanh(W_in u(t) + W_h x(t-1))
y(t) = W_out x(t)

Verstraeten, David, et al. "An experimental unification of reservoir computing methods." Neural Networks 20.3 (2007): 391-403.

Page 11: Echo State Networks

Page 12: Echo State Network: Architecture

[Figure: input space, reservoir state space, output space]

• Reservoir: untrained, large, sparsely connected, non-linear layer
• Readout: trained, linear layer

Page 13: Echo State Network: Architecture

Reservoir
• Non-linearly embeds the input into a higher-dimensional feature space where the original problem is more likely to be solved linearly (Cover's Th.)
• Randomized basis expansion computed by a pool of randomized filters
• Provides a "rich" set of input-driven dynamics
• Contextualizes each new input given the previous state: memory

Efficient: the recurrent part is left completely untrained

Page 14: Echo State Network: Architecture

Readout
• Computes the features in the reservoir state space for the output computation
• Typically implemented by using linear models

Page 15: Reservoir: State Computation

• The reservoir layer implements the state transition function of the dynamical system: x(t) = tanh(W_in u(t) + W_h x(t-1))
• It is also useful to consider the iterated version of the state transition function, i.e. the reservoir state after the presentation of an entire input sequence

Page 16: Echo State Property (ESP)

A valid ESN should satisfy the "Echo State Property" (ESP)
• Def. An ESN satisfies the ESP whenever:
  - The state of the network asymptotically depends only on the driving input signal
  - Dependencies on the initial conditions are progressively lost
• Equivalent definitions: state contractivity, state forgetting and input forgetting

Page 17: Conditions for the ESP

The ESP can be inspected by controlling the algebraic properties of the recurrent weight matrix W_h
• Theorem. If the maximum singular value of W_h is less than 1, then the ESN satisfies the ESP.
  - Sufficient condition for the ESP (contractive dynamics for every input)
• Theorem. If the spectral radius of W_h is greater than 1, then (under mild assumptions) the ESN does not satisfy the ESP.
  - Necessary condition for the ESP (stable dynamics)
• Recall: the spectral radius is the maximum among the absolute values of the eigenvalues
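
Both algebraic quantities can be computed in a couple of lines of MATLAB (a minimal sketch; Wh denotes the recurrent weight matrix, as in the code later in these slides):

    rho   = max(abs(eig(Wh)));  % spectral radius: rho < 1 is the necessary condition
    sigma = max(svd(Wh));       % largest singular value: sigma < 1 is the sufficient condition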

Page 18: ESN Initialization: How to set up the Reservoir

• Elements in W_in are selected randomly in [-scale_in, scale_in]
• W_h initialization procedure:
  - Start with a randomly generated matrix W_rand
  - Scale W_rand to meet the condition for the ESP (usually: the necessary one)

Page 19: ESN Training

• Run the network on the whole input sequence and collect the reservoir states column-wise into a matrix X
• Discard an initial transient (washout)
• Solve the least squares problem defined by min over W_out of ||W_out X - Y_target||^2

Page 20: Training the Readout

• On-line training is not the standard choice for ESNs
  - Least Mean Squares is typically not suitable: high eigenvalue spread (i.e. large condition number) of X
  - Recursive Least Squares is more suitable
• Off-line training is standard in most applications: closed-form solution of the least squares problem by direct methods
  - Moore-Penrose pseudo-inversion: W_out = Y_target X^+ (possible regularization using random noise in the states)
  - Ridge regression: W_out = Y_target X^T (X X^T + lambda_r I)^-1
    where lambda_r is a regularization coefficient (the higher, the more the readout is regularized)
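
Both off-line solutions fit in one line of MATLAB each (a sketch, assuming X (Nr x T) collects the washed-out reservoir states and Ytarget the corresponding targets, as in the algorithmic description that follows):

    Wout = Ytarget * pinv(X);                              % Moore-Penrose pseudo-inversion
    Wout = Ytarget * X' * inv(X*X' + lambda_r * eye(Nr));  % ridge regression

In practice, solving the linear system with the division operator is numerically preferable to forming inv() explicitly.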

Page 21: Training the Readout / 2

• Multiple readouts for the same reservoir: solving more than one task with the same reservoir dynamics
• Other choices for the readout:
  - Multi-layer Perceptron
  - Support Vector Machine
  - K-Nearest Neighbor
  - ...

Page 22: ESN - Algorithmic Description: Training

• Initialization

    Win = 2*rand(Nr,Nu) - 1;  Win = scale_in * Win;
    Wh = 2*rand(Nr,Nr) - 1;   Wh = rho * (Wh / max(abs(eig(Wh))));  % scale to spectral radius rho
    state = zeros(Nr,1);
    X = [];                   % state collection matrix

• Run the reservoir on the input stream

    for t = 1:trainingSteps
        state = tanh(Win * u(t) + Wh * state);
        X(:,end+1) = state;
    end

• Discard the washout

    X = X(:,Nwashout+1:end);

• Train the readout

    Wout = Ytarget(:,Nwashout+1:end) * X' * inv(X*X' + lambda_r*eye(Nr));

• The ESN is now ready for operation (estimations/predictions)

Page 23: ESN - Algorithmic Description: Operation Phase

• Run the reservoir on the input stream (test part)

    for t = 1:testSteps
        state = tanh(Win * u(t) + Wh * state);
        output(:,end+1) = Wout * state;
    end

• Note: when working on a single time-series you do not need to
  - re-initialize the state
  - discard the initial transient

Page 24: ESN Hyper-parameterization & Model Selection

Implement ESNs following good practice for model selection (as for any other ML/NN model)
• Careful selection of the network's hyper-parameters:
  - reservoir dimension
  - spectral radius
  - input scaling
  - readout regularization
  - ...

Page 25: ESN: Major Architectural Variants

• Direct connections from the input to the readout
• Feedback connections from the output to the reservoir
  - might affect the stability of the network's dynamics
  - small values are typically used

Page 26: ESN for sequence-to-element tasks

• The learning problem requires one single output for each input sequence
• The granularity of the task is on entire sequences (not on time-steps)
  - example: sequence classification

[Figure: the states x(1), ..., x(N_s) computed along the input sequence are condensed into a single pair x(s), y(s)]

For an input sequence s = [u(1), ..., u(N_s)], a state mapping function condenses the reservoir states into a single vector x(s):
• Last state: x(s) = x(N_s)
• Mean state: x(s) = (1/N_s) * sum over t = 1..N_s of x(t)
• Sum state: x(s) = sum over t = 1..N_s of x(t)
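
The three state mappings are straightforward to compute; a minimal MATLAB sketch, assuming the states of one sequence are collected column-wise in X (Nr x Ns):

    x_last = X(:,end);    % last state mapping
    x_mean = mean(X,2);   % mean state mapping
    x_sum  = sum(X,2);    % sum state mapping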

Page 27: Leaky Integrator ESN (LI-ESN)

• Uses leaky integrator reservoir units
• Applies an exponential moving average to the reservoir states
  - low-pass filter to better handle input signals that change slowly with respect to the sampling frequency
• The leaking rate parameter a ∈ (0,1]
  - controls the speed of reservoir dynamics in reaction to the input
  - smaller values imply reservoirs that react more slowly to input changes
  - if a = 1 then standard ESN dynamics are obtained
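
In code, the leaky-integrator update replaces the state line of the standard training loop; a sketch of the commonly used formulation (with a = 1 recovering the standard update):

    state = (1 - a) * state + a * tanh(Win * u(t) + Wh * state);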

Page 28: Examples of Applications

Page 29: Applications of ESNs: Examples / 1

• ESNs for modeling chaotic time series
• Mackey-Glass time series
  - for delays τ > 16.8 the system has a chaotic attractor
  - the most used values are 17 and 30

[Figure: Mackey-Glass setup with contraction coefficient and reservoir dimension indicated; ESN performance on the MG17 task]

Page 30: Applications of ESNs: Examples / 2

• Forecasting of indoor user movements
  - Deployed WSN: 4 fixed sensors (anchors) & 1 sensor worn by the user (mobile)
  - Predict if the user will change room when she is in position M
  - Input: received signal strength (RSS) data from the 4 anchors (10-dimensional vector for each time step, noisy data)
  - Target: binary classification (change environmental context or not)
• Generalization of predictive performance to unseen environments

Dataset is available online on the UCI repository:
https://archive.ics.uci.edu/ml/datasets/Indoor+User+Movement+Prediction+from+RSS+data
http://fp7rubicon.eu/

Page 31: Applications of ESNs: Examples / 2

• Forecasting of indoor user movements - input data

[Figure: example of the RSS traces gathered from all the 4 anchors in the WSN, for different possible movement paths]

Page 32: Applications of ESNs: Examples / 3

• Human Activity Recognition (HAR) and Localization
  - Input from heterogeneous sensor sources (data fusion)
  - Predicting event occurrence and confidence
  - High accuracy of event recognition / indoor localization: >90% on test data
  - Effectiveness in learning a variety of HAR tasks
  - Effectiveness in training on new events

Page 33: Applications of ESNs: Examples / 4

• Robotics
  - Indoor localization estimation in a critical environment (Stella Maris Hospital)
  - Precise robot localization estimation using noisy RSSI data (35 cm)
  - Recalibration in case of environmental alterations or sensor malfunctions
  - Input: temporal sequences of RSSI values (10-dimensional vector for each time step, noisy data)
  - Target: temporal sequences of laser-based localization (x,y)

Page 34: Applications of ESNs: Examples / 7

• Human Activity Recognition
  - Classification of human daily activities from RSS data generated by sensors worn by the user
  - Input: temporal sequences of RSS values (6-dimensional vector for each time step, noisy data)
  - Target: classification of human activity (bending, cycling, lying, sitting, standing, walking)
  - Extremely good accuracy (≈0.99) and F1 score (≈0.96)
  - 2nd Prize at the 2013 EvAAL International Competition

Dataset is available online on the UCI repository:
http://archive.ics.uci.edu/ml/datasets/Activity+Recognition+system+based+on+Multisensor+data+fusion+%28AReM%29

Page 35: Applications of ESNs: Examples / 8

• Autonomous Balance Assessment
  - An unobtrusive automatic system for balance assessment in the elderly
  - Berg Balance Scale (BBS) test: 14 exercises/items (~30 min.)
  - Input: stream of pressure data gathered from the 4 corners of the Nintendo Wii Balance Board during the execution of just 1 (of the 14) BBS exercises
  - Target: global BBS score of the user (0-56)
  - The use of RNNs makes it possible to automatically exploit the richness of the signal dynamics

Page 36: Applications of ESNs: Examples / 8

• Autonomous Balance Assessment
• Excellent prediction performance using LI-ESNs
• Practical example of how performance can be improved in a real-world case
  - By an appropriate design of the task, e.g. inclusion of clinical parameters in input
  - By appropriate choices for the network design, e.g. by using a weight sharing approach on the input-to-reservoir connections

LI-ESN model        | Test MAE (BBS points) | Test R
standard            | 4.80 ± 0.40           | 0.68
+ weight            | 4.62 ± 0.30           | 0.69
weight sharing (ws) | 4.03 ± 0.13           | 0.71
ws + weight         | 3.80 ± 0.17           | 0.76

• Very good comparison
  - with related models (MLPs, TDNN, RNNs, NARX, ...)
  - with literature approaches

Page 37: Applications of ESNs: Examples / 9

• Phoneme recognition with reservoir networks
• 2-layered ad-hoc reservoir architecture
  - layers focus on different ranges of frequencies (using appropriate leaky parameters) and on different sub-problems

Triefenbach, Fabian, et al. "Phoneme recognition with large hierarchical reservoirs." Advances in Neural Information Processing Systems. 2010.

Triefenbach, Fabian, et al. "Acoustic modeling with hierarchical reservoirs." IEEE Transactions on Audio, Speech, and Language Processing 21.11 (2013): 2439-2450.

Page 38: Extensions to Structured Data

Page 39: Learning in Structured Domains

• In many real-world application domains the information of interest can be naturally represented by means of structured data representations.
• The problems of interest can be modeled as regression or classification tasks on structured domains.

[Figure: sequences (MG chaotic time series prediction), trees (QSPR analysis of alkanes, target: boiling point), graphs (Predictive Toxicology Challenge, targets in {-1, +1})]

Page 40: Learning in Structural Domains

• Recursive Neural Networks extend the applicability of RNN methodologies to learning in domains of trees and graphs
• Randomized approaches enable efficient training and state-of-the-art performance
• Echo State Networks extended to discrete structures: Tree and Graph Echo State Networks
• Basic idea: the reservoir is applied to each node/vertex of the input structure

[C. Gallicchio, A. Micheli, Proceedings of IJCNN 2010] [C. Gallicchio, A. Micheli, Neurocomputing, 2013]

Page 41: Learning in Structured Domains

• The reservoir operation is generalized from temporal sequences to discrete structures
• State transition systems on discrete tree/graph structures

[Figure: standard ESN vs Tree ESN vs Graph ESN]

Page 42: TreeESN: Reservoir

• Large, sparsely connected, untrained layer of non-linear recursive units
• Input-driven dynamical system on discrete tree structures
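
A minimal recursive sketch of the tree encoding in MATLAB (assumptions: the tree is a struct array where node k carries a label vector and a row vector of children indices; a single recursive matrix Wh is shared across all child positions for simplicity):

    function x = tree_state(tree, k, Win, Wh)
        % state of node k = tanh(input part + recursive part over the children)
        s = zeros(size(Wh,1), 1);
        for c = tree(k).children          % empty for leaves
            s = s + Wh * tree_state(tree, c, Win, Wh);
        end
        x = tanh(Win * tree(k).label + s);
    end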

Page 43: TreeESN: State Mapping and Readout

• State mapping function x(t) for tree-to-element tasks
  - one single output vector (unstructured) is required for each input tree
  - example: document classification
  - Root state mapping: take the state computed for the root of the tree
  - Mean state mapping: average the states computed over all the nodes of the tree
• Readout computation and training is as in the case of standard Reservoir Computing approaches
  - tree-to-tree (isomorphic transductions)
  - tree-to-element

Page 44: TreeESN: Echo State Property

• The recursive reservoir dynamics can be left untrained provided that a stability property is satisfied
• Tree ESP: asymptotic stability conditions on tree structures
  - for any two initial states, the state computed for the root of the input tree should converge for increasing height of the tree
  - the influence of a perturbation in the label of a node will progressively fade away
• Sufficient condition for the ESP for trees: contractive dynamics (with the contraction coefficient rescaled by the maximum node degree)
• Necessary condition: stable dynamics

[C. Gallicchio, A. Micheli, Information Sciences, 2019]

Page 45: TreeESN: Efficiency

Computational complexity: extremely efficient RC approach; only the linear readout parameters are trained.

Encoding process: for each tree
• Scales linearly with the number of nodes and the reservoir dimension (together with the maximum node degree and the degree of reservoir connectivity)
• The same cost for training and test
• Compares well with state-of-the-art methods for trees:
  - RecNNs: extra cost (time + memory) for gradient computations
  - Kernel methods: higher cost of encoding (e.g. quadratic in PT kernels)

Output computation
• Depends on the method used (e.g. direct using SVD, or iterative)
• The cost of training the linear TreeESN readout is generally inferior to the cost of training MLPs or SVMs (used in RecNNs and kernel methods)

Page 46: Reservoir in Graph Echo State Networks

• Input-driven dynamical system on discrete graphs
• The same reservoir architecture is applied to all the vertices in the structure
• Stability of the state update ensures a solution even in case of cyclic dependencies among state variables

[Figure: state computation in a standard ESN (over time) vs GraphESN (over the neighborhood of each vertex v)]

Page 47: GraphESN: Echo State Property

• A stability constraint is required to achieve usable dynamics
• Foundational idea: resort to contractive dynamics
  - Contractive dynamics ensure the convergence of the encoding process (Banach Th.)
  - Enables an iterative computation of the encoding process
• Sufficient condition (contractivity) and necessary condition (stability), analogous to the sequential case

Page 48: Research on Reservoir Computing (in our group)

• Applications to complex real-world tasks
  - NLP, earthquake time-series, human monitoring, ...
• Gated Reservoir Computing models
• RC-based analysis of fully trained RNNs
• Unsupervised adaptation of reservoirs
  - edge of stability/chaos
  - Bayesian optimization
• Deep Echo State Networks
  - advanced mathematical analysis
  - architectural construction of hierarchical RC
• Deep RC for Structures

Page 49: Echo State Network Dynamics

Page 50: Echo State Property

• Assumption: input and state spaces are compact sets
• A reservoir network whose state update is ruled by x(t) = tanh(W_in u(t) + W_h x(t-1)) satisfies the ESP if initial conditions are asymptotically forgotten
• The state dynamics provide a pool of "echoes" of the driving input
• Essentially, this is a stability condition (global asymptotic stability in the sense of Lyapunov)

Page 51: Echo State Property: Stability

• Why is a stable regime so important?
• An unstable network exhibits sensitivity to input perturbations: two slightly different (long) input sequences drive the network into (asymptotically very) different states
  - Good for training: the state vectors tend to be more and more linearly separable (for any given task)
  - Bad for generalization: overfitting! There is no generalization ability if a temporal sequence similar to one in the training set drives the network into completely different states

Page 52: ESP: Sufficient Condition

• The sufficient condition for the ESP analyzes the case of contractive dynamics of the state transition function
• Whatever the driving input signal: if the system is contractive then it will exhibit stability
• In what follows, we assume state transition functions of the form
  x(t) = tanh(W_in u(t) + W_h x(t-1))
  with new state x(t), input u(t), previous state x(t-1), input weight matrix W_in and recurrent weight matrix W_h

Page 53: Contractivity

The reservoir state transition function rules the evolution of the corresponding dynamical system.
• Def. The reservoir has contractive dynamics whenever its state transition function F is Lipschitz continuous with constant C < 1, i.e. ||F(u, x) - F(u, x')|| <= C ||x - x'|| for every input u and states x, x'

Page 54: Contractivity and the ESP

• Theorem. If an ESN has a contractive state transition function F (and bounded state space), then it satisfies the Echo State Property
• Assumption: F is contractive with parameter C < 1:
  ||F(u, x) - F(u, x')|| <= C ||x - x'||   (Contractivity)
• We want to show that the ESP holds true: the distance between the states obtained from any two initial conditions, under the same input sequence, vanishes as the sequence length grows   (ESP)

Page 55: Contractivity and the ESP

• Theorem. If an ESN has a contractive state transition function F, then it satisfies the Echo State Property
• Assumption: F is contractive with parameter C < 1
• Applying contractivity once per time step, the distance between the two orbits after n steps is bounded by C^n ||x - x'||, which goes to 0 as n goes to infinity → the ESP holds

Page 56: Contractivity and Reservoir Initialization

• If the reservoir is initialized to implement a contractive mapping, then the ESP is guaranteed (in any norm, for any input)
• Formulation of a sufficient condition for the ESP
• Assumptions:
  - Euclidean distance as metric in the state space (use L2-norm)
  - Reservoir units with tanh activation function (note: squashing nonlinearities bound the state space)
• Since tanh is 1-Lipschitz, under these assumptions the contraction coefficient is ||W_h||_2, and the resulting sufficient condition is ||W_h||_2 < 1

Page 57: Markovian Nature of State Space Organizations

• Contractive dynamical systems are related to suffix-based state space organizations
• States assumed in correspondence of different input sequences sharing a common suffix are close to each other, proportionally to the length of the common suffix
  - similar sequences are mapped to close states
  - different sequences are mapped to different states
  - similarities and dissimilarities are intended in a suffix-based fashion
• RNNs initialized with small weights (with contractive state transition function) and bounded state space implement (and approximate arbitrarily well) definite memory machines

Hammer, B., Tino, P.: Recurrent neural networks with small weights implement definite memory machines. Neural Computation 15 (2003) 1897-1929

Page 58: Markovian Nature of State Space Organizations

• Markovian architectural bias of RNNs
  - recurrent weights are typically initialized with small values
  - this leads to a typically contractive initialization of recurrent dynamics
  - Iterated Function Systems, fractal theory, architectural bias of RNNs
• RNNs initialized with small weights (with contractive state transition function) and bounded state space implement (and approximate arbitrarily well) definite memory machines
• This characterization is a bias for fully trained RNNs: it holds in the early stages of learning

Hammer, B., Tino, P.: Recurrent neural networks with small weights implement definite memory machines. Neural Computation 15 (2003) 1897-1929

Page 59: Markovianity and ESNs

• Using dynamical systems with contractive state transition functions (in any norm) implies the Echo State Property (for any input)
• ESNs are featured by fixed contractive dynamics
  - Relations with the universality of RC for bounded memory computation (LSM theory)
• ESNs with untrained contractive reservoirs are already able to distinguish input sequences in a suffix-based fashion
• In the RC framework this is no longer a bias; it is a fixed characterization of the RNN model

Maass, W., Natschlager, T., Markram, H.: Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural Computation 14 (2002) 2531-2560

Gallicchio C, Micheli A. Architectural and Markovian Factors of Echo State Networks. Neural Networks 2011;24(5):440-456.

Page 60: Why do Echo State Networks work?

• Because they exploit the Markovian state space organization
• The reservoir constructs a high-dimensional Markovian state space representation of the input history
• Input sequences sharing a common suffix drive the system into close states, with distances proportional to the length of the common suffix
• A simple output (readout) tool can then be sufficient to separate the different cases

Gallicchio C, Micheli A. Architectural and Markovian Factors of Echo State Networks. Neural Networks 2011;24(5):440-456.
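
A toy experiment makes the suffix-based organization concrete: drive the same contractive reservoir with two sequences that differ everywhere except in a common suffix, and the final states come out close (a sketch; all names and sizes here are illustrative):

    Nr = 50; rng(42);
    Win = 0.1 * (2*rand(Nr,1) - 1);
    Wh = 2*rand(Nr) - 1;  Wh = 0.9 * (Wh / max(abs(eig(Wh))));
    suffix = rand(1,20);
    u1 = [rand(1,100), suffix];  u2 = [rand(1,100), suffix];   % different pasts, same suffix
    x1 = zeros(Nr,1); x2 = zeros(Nr,1);
    for t = 1:length(u1)
        x1 = tanh(Win*u1(t) + Wh*x1);
        x2 = tanh(Win*u2(t) + Wh*x2);
    end
    disp(norm(x1 - x2))   % small distance: the common suffix dominates both final states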

Page 61: When do Echo State Networks work?

• When the target matches the Markovian assumption behind the reservoir state space organization
• Markovianity can be used to characterize easy/hard tasks for ESNs

Example: easy task
[Figure: sequences with targets +1/-1 assigned consistently with their suffixes]

Page 62: When do Echo State Networks work?

• When the target matches the Markovian assumption behind the reservoir state space organization
• Markovianity can be used to characterize easy/hard tasks for ESNs

Example: hard task
[Figure: sequences sharing a common suffix but with opposite targets +1/-1; the suffix-based state organization cannot separate them]

Page 63: ESP: Necessary Condition

• Investigating the stability of reservoir dynamics from a dynamical system perspective
• Theorem. If an ESN has unstable dynamics around the zero state and the zero sequence is an admissible input, then the ESP is not satisfied.
• Approach this study by linearizing the state transition function, i.e. by analyzing its Jacobian matrix

Page 64: ESP: Necessary Condition

• Linearization around the zero state and for null input
• Remember: x(t) = tanh(W_in u(t) + W_h x(t-1))
• The Jacobian with tanh neurons is given by J(t) = diag(1 - x_1(t)^2, ..., 1 - x_Nr(t)^2) W_h

Page 65: ESP: Necessary Condition

• Linearization around the zero state and for null input
• Remember: x(t) = tanh(W_in u(t) + W_h x(t-1))
• The Jacobian with tanh neurons is given by J(t) = diag(1 - x_1(t)^2, ..., 1 - x_Nr(t)^2) W_h
• Null input assumption: with u(t) = 0 and x(t-1) = 0, the Jacobian reduces to J = W_h

Page 66: ESP: Necessary Condition

• The linearized system now reads: x(t) = W_h x(t-1)
• 0 is a fixed point. Is it stable?
• Linear dynamical systems theory tells us that if ρ(W_h) < 1 then the fixed point is stable
• Otherwise 0 is not stable: if we start from a state near 0 and we drive the network with an (infinite-length) null sequence, we do not end up in 0
• The null sequence is then a counter-example: the ESP does not hold! There are at least two different orbits resulting from the same input sequence

Page 67: ESP: Necessary Condition

• A sufficient condition (under our assumptions) for the absence of the ESP is that ρ(W_h) >= 1
• Hence, a necessary condition for the ESP is that ρ(W_h) < 1
• In general: ρ(W_h) <= ||W_h||_2
• Typically, in applications:
  - The sufficient condition is often too restrictive
  - The necessary condition for the ESP is often used to scale the recurrent weight matrix (e.g. to a value of the spectral radius of 0.9)
• However, note that scaling the spectral radius below 1 in the presence of driving input is neither sufficient nor necessary for ensuring echo states

Page 68: ESP Index: A Concrete Perspective

• Driving input is not properly taken into account in conventional reservoir initialization strategies
• Idea: calculate average deviations of reservoir state orbits from different initial conditions and under the influence of the same driving external input
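
A sketch of such a deviation-based index (illustrative names; it reuses Win and Wh from the training script and averages the final distances of P randomly initialized orbits from a reference orbit under the same input):

    P = 10; T = 1000; devs = zeros(P,1);
    u = 2*rand(1,T) - 1;                                   % shared driving input
    xRef = zeros(Nr,1);
    for t = 1:T, xRef = tanh(Win*u(t) + Wh*xRef); end
    for p = 1:P
        x = 2*rand(Nr,1) - 1;                              % random initial condition
        for t = 1:T, x = tanh(Win*u(t) + Wh*x); end
        devs(p) = norm(x - xRef);
    end
    espIndex = mean(devs);   % close to 0 when initial conditions are forgotten under this input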

Page 69: ESP Index: A Concrete Perspective

• The set of configurations that satisfy the ESP in real cases is well beyond those commonly adopted in ESN practice
• A large portion of "good" reservoirs is usually neglected in common practice

C. Gallicchio, "Chasing the Echo State Property", ESANN 2019.

Page 70: Advances on Echo State Networks

Page 71: Research on Echo State Networks

• The research on ESNs follows two major complementary objectives:
  - Study the intrinsic properties of RNNs, setting aside the aspects related to training of the recurrent connections
  - Develop efficiently trained RNNs
• Advances:
  - Theoretical analysis
  - Quality of reservoir dynamics
  - Architectural studies
  - Deep Echo State Networks
  - Reservoir Computing for Structures

Page 72: Echo State Property: Advances

• Much of the theoretical advances in the study of ESNs aims to establish simple conditions for reservoir initialization
• The focus of the theoretical investigations has shifted to the study of the stability constraint
  - Analysis of the system stability given the input
  - Non-autonomous dynamical systems
  - Mean field theory
  - Local Lyapunov exponents

G. Manjunath and H. Jaeger. Echo state property linked to an input: Exploring a fundamental characteristic of recurrent neural networks. Neural Computation, 25(3):671-696, 2013.

M. Massar and S. Massar. Mean-field theory of echo state networks. Physical Review E, 87(4):042809, 2013.

G. Wainrib and M.N. Galtier. A local echo state property through the largest Lyapunov exponent. Neural Networks, 76:39-45, 2016.

I.B. Yildiz, H. Jaeger, and S.J. Kiebel. Re-visiting the echo state property. Neural Networks, 35:1-9, 2012.

Page 73: Applications to Real-world Problems

• Successful applications in several real-world problems:
  - Chaotic time-series modeling
  - Non-linear system identification
  - Speech recognition
  - Financial forecasting
  - Bio-medical applications
  - Robot localization & control
  - ...
• High-dimensional reservoirs are often needed to achieve excellent performance in complex real-world tasks
• Question: can we combine training efficiency with compact (i.e. small-size) ESNs?

Page 74: Quality of Reservoir Dynamics

• How to establish the quality of a reservoir? If we can find a suitable way to characterize how good a reservoir is, we can try to optimize it
• Entropy of recurrent units' activations
  - Unsupervised adaptation of reservoirs using Intrinsic Plasticity
• Study of the short-term memory ability of the system
  - Memory Capacity and its relations to linearity
• Edge of stability/chaos: reservoir dynamics at the border of stability
  - Recurrent systems close to instability show optimal performance whenever the task at hand requires long short-term memory

Page 75: Short-term Memory Capacity

• An aspect of great importance in the study of dynamical systems is the analysis of their memory abilities
• Jaeger introduced a learning task, called Memory Capacity (MC), to quantify it
• Train individual output units to recall increasingly delayed versions of a univariate i.i.d. input signal
• MC is the sum of squared correlation coefficients between the delayed signals and the outputs

Jaeger, Herbert. Short term memory in echo state networks. Vol. 5. GMD-Forschungszentrum Informationstechnik, 2001.
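
An empirical MC estimate can be sketched as follows (assumptions: u is a 1 x T i.i.d. input signal, X (Nr x T) the collected reservoir states, delays probed up to an illustrative Dmax, readouts trained by pseudo-inversion as before):

    Dmax = 2*Nr;  MC = 0;
    for d = 1:Dmax
        Xd = X(:, d+1:end);          % states at times t = d+1, ..., T
        yd = u(1:end-d);             % delayed target u(t-d)
        w = yd * pinv(Xd);           % linear readout trained to recall delay d
        r = corrcoef(yd, w*Xd);      % correlation between target and readout output
        MC = MC + r(1,2)^2;          % sum of squared correlation coefficients
    end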

Page 76: Short-term Memory Capacity

• Forgetting curves to study the memory structure
• Plot the squared correlation (Y-axis, i.e. the determination coefficient) with respect to the individual delays (X-axis)

[Figure: forgetting curves for linear reservoir units vs tanh reservoir units]

Page 77: Short-term Memory Capacity

Some fundamental theoretical results:
• The MC of a network with N recurrent units is upper-bounded by N: MC <= N
• It is impossible to train an ESN on tasks which require unbounded-time memory
• Linear reservoirs can achieve the maximum bound
  - e.g. sufficient condition: the matrix [W_h w_in, W_h^2 w_in, ..., W_h^N w_in] has full rank
  - example: unitary recurrent matrices (i.e. orthogonal matrices in the real case)
• Memory versus non-linearity dilemma: linear reservoirs are featured by longer short-term memories, but non-linear reservoirs are required to solve complex real-world problems...
• Memory Capacity vs Predictive Capacity

Page 78: Architectural Setup

• How to construct "better" reservoirs than just random reservoirs?
• Critical Echo State Networks: relevance of orthogonal recurrence weight matrices (e.g. permutation matrices) [M.A. Hajnal, A. Lorincz, 2006]
• Cyclic Reservoirs [A. Rodan, P. Tino, 2011] [T. Strauss et al., 2012] [J. Boedecker et al., 2009]
• Delay Line Reservoirs [M. Cernansky, P. Tino 2008] [A. Rodan, P. Tino 2012]
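
Both deterministic constructions fit in one line each; a sketch (Nr units, with a weight value r below 1 playing the role of the spectral radius):

    r = 0.9;
    Wcycle = r * circshift(eye(Nr), 1);    % cyclic reservoir: a single weight on a ring of units
    Wdelay = r * diag(ones(Nr-1,1), -1);   % delay line reservoir: a simple chain of units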

Page 79: Edge of Stability

• Reservoir dynamical regime close to the transition between stable and unstable dynamics
• E.g., study of Lyapunov exponents: λ = 0 identifies the transition between (locally) stable (λ < 0) and unstable (λ > 0) dynamics

[J. Boedecker, O. Obst, J.T. Lizier, N.M. Mayer, and M. Asada. Information processing in echo state networks at the edge of chaos. Theory in Biosciences, 131(3):205-213, 2012.]
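
One common way to estimate such an exponent for a tanh reservoir is to average the log of the Jacobian's largest eigenvalue modulus along the driven trajectory, reusing the linearization J(t) = diag(1 - x(t).^2) * W_h seen earlier (a sketch with illustrative names; u and T as in the training script):

    lambdaSum = 0;  x = zeros(Nr,1);
    for t = 1:T
        x = tanh(Win*u(t) + Wh*x);
        J = diag(1 - x.^2) * Wh;                      % Jacobian at the visited state
        lambdaSum = lambdaSum + log(max(abs(eig(J))));
    end
    lambda = lambdaSum / T;    % < 0: stable; > 0: unstable; near 0: edge of stability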

Page 80: Deep Neural Networks

Deep Learning is an attractive research area
• Hierarchy of many non-linear models
• Ability to learn data representations at different (higher) levels of abstraction

Deep Neural Networks (Deep NNs)
• Feed-forward hierarchy of multiple hidden layers of non-linear units
• Impressive performance in real-world problems (especially in the cognitive area)
• Remember: deep learning has a strong biological plausibility

Page 81: Deep Recurrent Neural Networks

Extension of the deep learning methodology to temporal processing, aiming at naturally capturing temporal feature representations at different time-scales.
• Text, speech, language processing
• Multi-layered processing of temporal information with feedbacks has a strong biological plausibility

The analysis of deep RNNs is still young
• Deep Reservoir Computing: investigate the actual role of layering in deep recurrent architectures
• Stability: characterize the dynamics of hierarchically organized recurrent models

Page 82: Deep Echo State Network

• What is the intrinsic role of layering in recurrent architectures?
• Develop novel efficient approaches that exploit:
  - Multiple time-scale representations
  - Extreme efficiency of training (the stacked reservoirs are left untrained)

C. Gallicchio, A. Micheli, L. Pedrelli, "Deep Reservoir Computing: A Critical Experimental Analysis", Neurocomputing, 2017

Page 83: Deep RNN Architecture: The Role of Layering

Constraints to the architecture of a fully connected RNN, obtained by:
• Removing connections from the input to higher layers
• Removing connections from higher layers to lower ones
• Removing connections to layers at levels higher than +1 (i.e. skipping more than one level up)

Fewer weights to store than in a fully connected RNN

Page 84: DeepESN: Output Computation

• The readout can modulate the (qualitatively different) temporal features developed at the different layers

Page 85: DeepESN: Architecture and Dynamics

[Figure: state transition equations of the first layer and of the l-th layer (l > 1)]

Each layer has its own:
- leaky integration constant
- input scaling
- spectral radius
- inter-layer scaling

Page 86: DeepESN: Architecture and Dynamics

[Figure: state transition equations of the first layer and of the l-th layer (l > 1)]

Each layer has its own:
- leaky integration constant
- input scaling
- spectral radius
- inter-layer scaling

The recurrent part of the system is hierarchically structured. Interestingly, this naturally entails a structure in the developed system dynamics.

* seminar topic
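
A sketch of the hierarchical state update (illustrative cell-array names: layer 1 is driven by the external input, layer l > 1 by the state of layer l-1; each layer l has its own leaking rate a(l), and each states{l} is initialized to zeros(Nr,1)):

    for t = 1:T
        states{1} = (1 - a(1))*states{1} + a(1)*tanh(Win{1}*u(t) + Wh{1}*states{1});
        for l = 2:L
            states{l} = (1 - a(l))*states{l} + a(l)*tanh(Win{l}*states{l-1} + Wh{l}*states{l});
        end
    end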

Page 87: Intrinsically Richer Dynamics

Layering in RNNs: a convenient way of architectural setup
• Multiple time-scale representations
• Richer dynamics, closer to the edge of stability
• Longer short-term memory

Page 88: DeepESN: Hierarchical Temporal Features

Structured representation of temporal data through the deep architecture

Empirical investigations
• Effects of input perturbations last longer in higher layers
• Multiple time-scale representation
• Ordered along the network's hierarchy

C. Gallicchio, A. Micheli, L. Pedrelli, "Deep Reservoir Computing: A Critical Experimental Analysis", Neurocomputing, 2017

Page 89: DeepESN: Hierarchical Temporal Features

Structured representation of temporal data through the deep architecture

Frequency analysis
• Diversified magnitudes of the FFT components
• Multiple frequency representation
• Ordered along the network's hierarchy
• Higher layers tend to focus on lower frequencies

[C. Gallicchio, A. Micheli, L. Pedrelli, WIRN 2017]

[Figure: FFT magnitude profiles at layers 1, 4, 7 and 10]

Page 90: DeepESN: Mathematical Background

Structured representation of temporal data through the deep architecture

Theoretical analysis
• Higher layers intrinsically implement less contractive dynamics
• Echo State Property for DeepESNs
• Deeper networks naturally develop richer dynamics, closer to the edge of stability

[C. Gallicchio, A. Micheli. Cognitive Computation (2017).]

[C. Gallicchio, A. Micheli, L. Silvestri. Neurocomputing 2018.]

Page 91: DeepESN: Performance in Applications

Formidable trade-off between performance and computational time

C. Gallicchio, A. Micheli, L. Pedrelli, "Comparison between DeepESNs and gated RNNs on multivariate time-series prediction", ESANN 2019.

Page 92: Deep Tree Echo State Networks

• Untrained multi-layered recursive neural network

Gallicchio, Micheli, IJCNN, 2018.
Gallicchio, Micheli, Information Sciences, 2019.

Page 93: Deep Tree Echo State Networks: Unfolding

Page 94: Deep Tree Echo State Networks: Advantages

• Effective in applications
• Extremely efficient
• Layered recursive architecture (compared to the shallow case)

Page 95: Conclusions

• Reservoir Computing: a paradigm for efficient modeling of RNNs
  - Reservoir: non-linear dynamic component, untrained after contractive initialization
  - Readout: linear feed-forward component, trained
• Easy to implement, fast to train
• Markovian flavour of reservoir state dynamics
• Successful applications
• Recent extensions toward:
  - Deep Learning architectures
  - Structured Domains

Page 96: References

• Jaeger H. The "echo state" approach to analysing and training recurrent neural networks - with an erratum note. Tech. rep. GMD - German National Research Institute for Computer Science, 2001.
• Jaeger H, Haas H. Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication. Science. 2004;304(5667):78-80.
• Gallicchio C, Micheli A. Architectural and Markovian Factors of Echo State Networks. Neural Networks 2011;24(5):440-456.
• Lukoševičius, Mantas, and Herbert Jaeger. "Reservoir computing approaches to recurrent neural network training." Computer Science Review 3.3 (2009): 127-149.
• Lukoševičius, Mantas. "A practical guide to applying echo state networks." Neural Networks: Tricks of the Trade. Springer Berlin Heidelberg, 2012. 659-686.

Page 97: References

• Gallicchio, Claudio, Alessio Micheli, and Luca Pedrelli. "Deep reservoir computing: A critical experimental analysis." Neurocomputing 268 (2017): 87-99.
• Gallicchio, Claudio, Alessio Micheli, and Luca Pedrelli. "Design of deep echo state networks." Neural Networks 108 (2018): 33-47.
• Gallicchio, Claudio, and Alessio Micheli. "Deep echo state network (DeepESN): A brief survey." arXiv preprint arXiv:1712.04323 (2017).
• Gallicchio, Claudio, and Alessio Micheli. "Deep Reservoir Neural Networks for Trees." Information Sciences 480 (2019): 174-193.
• Gallicchio, Claudio, and Alessio Micheli. "Graph echo state networks." The 2010 International Joint Conference on Neural Networks (IJCNN). IEEE, 2010.