Old Dogs, New Tricks? or
The Benefits and Drawbacks of Using Network Analysis to Tackle Difficult Problems
Martin Gould
Friday 21st May, 2010
Oxford-Harvard Workshop
Martin Gould (University of Oxford) Old Dogs, New Tricks? 21st May 2010 1 / 49
The Networks Explosion
- Recently, understanding and mastery of network analysis techniques have progressed significantly
- We now possess a diverse toolbox of measures, techniques and intuitions about networks

Question
Can techniques from network analysis be illuminating in tackling more traditional problems in science?

“Much recent research has shown that many, and perhaps most, natural or even artificial phenomena may be usefully and fruitfully described in terms of networks and their properties” – Shirazi et al.
Problems to Consider
Dynamical Systems
Time Series
Stochastic Processes
Dynamical Systems
The Rössler System:
\[
\frac{dx}{dt} = -y - z, \qquad
\frac{dy}{dt} = x + ay, \qquad
\frac{dz}{dt} = b + z(x - c)
\]
[Figure: trajectory of the Rössler system in (x, y, z) space]
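A trajectory of the Rössler system can be traced numerically. The following is a minimal sketch using a hand-written fourth-order Runge-Kutta step. The slides do not state parameter values, so the defaults a = 0.2, b = 0.2, c = 5.7, the step size and the function names are all assumptions made for illustration.

```python
def rossler_rhs(state, a, b, c):
    """Right-hand side of the Rössler equations for state = (x, y, z)."""
    x, y, z = state
    return (-y - z, x + a * y, b + z * (x - c))


def rk4_step(state, dt, a, b, c):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = rossler_rhs(state, a, b, c)
    k2 = rossler_rhs(shifted(state, k1, dt / 2), a, b, c)
    k3 = rossler_rhs(shifted(state, k2, dt / 2), a, b, c)
    k4 = rossler_rhs(shifted(state, k3, dt), a, b, c)
    return tuple(
        s + dt / 6 * (p + 2 * q + 2 * r + w)
        for s, p, q, r, w in zip(state, k1, k2, k3, k4)
    )


def rossler_trajectory(initial, steps, dt=0.01, a=0.2, b=0.2, c=5.7):
    """Return steps + 1 points; parameter defaults are assumed, not from the talk."""
    trajectory = [tuple(initial)]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], dt, a, b, c))
    return trajectory
```

Calling `rossler_trajectory((1.0, 1.0, 1.0), 5000)` gives a list of (x, y, z) points that can be plotted or sampled further.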
Dynamical Systems
The Recurrence Plot:
Proposed by Eckmann et al. in 1987.
1. Choose initial conditions
2. Trace trajectory for specified time length
3. Place dot
4. Repeat steps 2 & 3 i times
5. For each i, draw ball of radius ε(i) around dot i
[Figure: the Rössler trajectory in (x, y, z) space with the placed dots marked on it]
Dynamical Systems
The Recurrence Plot:
Define:
\[
A_{i,j} :=
\begin{cases}
1 & \text{if dot } j \text{ is in ball } i \\
0 & \text{otherwise}
\end{cases}
\]
\[
A = (A_{i,j}) =
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & \cdots & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & \cdots & 0 \\
\vdots & & & & & & & & \ddots & \vdots \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & \cdots & 0
\end{pmatrix}
\]
[Figure: the Rössler trajectory with the sampled dots numbered 1 to 18]
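The definition of A translates almost directly into code. This is a small sketch, assuming the dots are given as coordinate tuples and the radii ε(i) as a list of the same length; the function name is illustrative.

```python
import math


def recurrence_matrix(dots, radii):
    """A[i][j] = 1 if dot j lies in the ball of radius radii[i] around dot i."""
    n = len(dots)
    return [
        [1 if math.dist(dots[i], dots[j]) <= radii[i] else 0 for j in range(n)]
        for i in range(n)
    ]
```

Because each ball may have its own radius, the resulting matrix need not be symmetric, which matches the example matrix in the talk.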
Dynamical Systems
Ai,j =
1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 00 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 00 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 10 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 00 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 01 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 00 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 00 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 10 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 00 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 00 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 01 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 00 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 10 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 00 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 01 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 0 00 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 00 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
Martin Gould (University of Oxford) Old Dogs, New Tricks? 21st May 2010 15 / 49
Dynamical Systems
[Figure: recurrence plot of A, with both axes running over the dot indices 1 to 18]
Dynamical Systems
The Lorenz Equations:
\[
\frac{dx}{dt} = a(y - x), \qquad
\frac{dy}{dt} = bx - y - xz, \qquad
\frac{dz}{dt} = xy - cz
\]
[Figure: trajectory of the Lorenz system in (x, y, z) space]
Dynamical Systems
The Lorenz Equations:
In the recurrence plot:
- The length of the longest line parallel to the diagonal is inversely proportional to the largest Lyapunov exponent
- The “chessboard” texture is a result of the trajectory lying on two separate “wings”
Dynamical Systems
The Rössler System:
[Figure: network on nodes 1 to 18 for the Rössler system]
Dynamical Systems
Idea - examine the relative frequency of different motifs in the network:
[Figure: six four-node motifs, labelled A to F, each drawn on nodes w, x, y, z]

“The relative frequency with which the different subgraphs occur is shown to be a sensitive measure of the underlying dynamics” – Xu et al.
Dynamical Systems
The authors found motifs D and F to be the most illuminating for classifying dynamical systems

[Figure: motifs D and F, each drawn on nodes w, x, y, z]

- Periodic flows: many Fs, few Ds
- Chaotic flows: fewer Fs, more Ds

Motif D is more likely to appear on trajectories which reside on a higher-dimensional manifold.
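Counting motif frequencies can be sketched as follows. The drawings of motifs A to F are not recoverable from this transcript, so the sketch uses descriptive names for the six connected undirected graphs on four nodes and does not claim which name corresponds to which letter. It also assumes the network is undirected and stored as a dict of neighbour sets, which is an assumption rather than the authors' stated procedure.

```python
from itertools import combinations

# Sorted degree sequences identify the six connected 4-node graphs uniquely.
MOTIF_BY_DEGREES = {
    (1, 1, 2, 2): "path",
    (1, 1, 1, 3): "star",
    (2, 2, 2, 2): "cycle",
    (1, 2, 2, 3): "tailed triangle",
    (2, 2, 3, 3): "diamond",
    (3, 3, 3, 3): "clique",
}


def count_four_node_motifs(adj):
    """Count induced connected 4-node subgraphs in an undirected graph.

    adj maps each node to the set of its neighbours.
    """
    counts = {name: 0 for name in MOTIF_BY_DEGREES.values()}
    for nodes in combinations(sorted(adj), 4):
        chosen = set(nodes)
        # Degree of each node inside the induced subgraph (self-loops ignored).
        degrees = tuple(sorted(len((adj[v] & chosen) - {v}) for v in nodes))
        name = MOTIF_BY_DEGREES.get(degrees)
        if name is not None:  # disconnected subgraphs are skipped
            counts[name] += 1
    return counts
```

Dividing each count by the total gives the relative frequencies. Enumerating all four-node subsets is only practical for small networks such as the 18-node example.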
Time Series
Much like dynamical systems, observed time series often have a deep and interesting structure:
- Stationary?
- Periodic?
- Fractal?

But these features can be difficult to identify directly from time series data!
Time Series
Two methods have recently been proposed:
- The Visibility Graph (Lacasa et al.)
- Pseudoperiodic Similarity Network (Zhang/Small et al.)
Time Series
The Visibility Graph

- Take time series X(t)
- Rescale so that 0 ≤ X(t) ≤ 1 for each t
- For each t, draw a bar of height X(t). Each bar represents a node in the network.
- Declare pairs of bars with a direct visibility line to be adjacent in the network

[Figure: a time series on 0 ≤ t ≤ 10 with values between −10 and 10, and the same series rescaled to [0, 1]]
Time Series
The Visibility Graph

This is equivalent to declaring the nodes relating to (t_a, x_a) and (t_b, x_b) to be neighbours if every data point (t_c, x_c) with t_a < t_c < t_b satisfies:
\[
x_c < x_b + (x_a - x_b)\,\frac{t_b - t_c}{t_b - t_a}
\]
[Figure: the rescaled time series on 0 ≤ t ≤ 10 with its visibility graph on nodes 1 to 10]
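The visibility criterion can be checked pair by pair. The following is a naive sketch that tests every intermediate point for every pair of nodes, so it costs O(n^3) comparisons in the worst case; the function name and the use of 0-based indices as node labels are illustrative choices.

```python
def visibility_graph(times, values):
    """Return the set of edges (a, b), a < b, of the visibility graph.

    times must be strictly increasing; node k corresponds to (times[k], values[k]).
    """
    n = len(values)
    edges = set()
    for a in range(n):
        for b in range(a + 1, n):
            ta, xa, tb, xb = times[a], values[a], times[b], values[b]
            # Every point strictly between a and b must lie below the line a-b.
            visible = all(
                values[c] < xb + (xa - xb) * (tb - times[c]) / (tb - ta)
                for c in range(a + 1, b)
            )
            if visible:
                edges.add((a, b))
    return edges
```

Consecutive points are always joined, because there is no intermediate point to block the line between them.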
Time Series
The Visibility Graph

This method guarantees a network which is:
- Connected
- Robust to horizontal and vertical rescaling
- Robust to horizontal and vertical translation
- Robust to the addition of a linear trend

The final three points are particularly attractive in time series analysis, where different instruments may assign different values to the same signal!
Time Series
The Visibility Graph

Under such a setup...
- Periodic time series are mapped to regular graphs
- Random time series are mapped to random graphs
- Fractal time series are mapped to scale-free networks
  - The authors have explored this observation further, and find that the degree distribution can provide a good estimator for the Hurst exponent of a fractal time series
Time Series
Pseudoperiodic Similarity Network

Given a pseudoperiodic time series X(1), ..., X(n):
- Split X(1), ..., X(n) into m cycles C_1, ..., C_m, each containing numerous sampling points
- Draw a node for each of C_1, ..., C_m
- Declare nodes i and j to be adjacent if they are sufficiently close under some distance metric (based on the sampling points within the cycles)

[Figure: two panels showing a pseudoperiodic time series of about 900 samples with values between −2 and 2]
Time Series
Pseudoperiodic Similarity Network
Some possible metrics (with cycle i taken to be the shorter of the two, and m the offset at which it is compared against cycle j):

1. Phase Space Difference:
\[
D_{i,j} = \min_{m = 0, 1, \ldots, |l_j - l_i|} \frac{1}{\min(l_i, l_j)} \sum_{k=1}^{\min(l_i, l_j)} \| X_k - Y_{k+m} \|
\]
where l_i is the number of sample points in cycle i; X_k and Y_k are the kth sample points in cycles i and j respectively; and ||·|| denotes Euclidean distance.

2. Linear Correlation Coefficient:
\[
\rho_{i,j} = \max_{m = 0, 1, \ldots, |l_j - l_i|}
\frac{\mathrm{Cov}\{C_i(1 : l_i),\, C_j(m+1 : m+l_i)\}}
{\sqrt{\mathrm{Var}\{C_i(1 : l_i)\}}\,\sqrt{\mathrm{Var}\{C_j(m+1 : m+l_i)\}}},
\qquad
D_{i,j} = \frac{\rho_{i,j} + 1}{2}
\]
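Both measures slide the shorter cycle along the longer one and keep the best offset. Here is a minimal sketch, assuming scalar sample points (so the Euclidean norm reduces to an absolute value), cycles given as lists of floats, and no constant windows (which would make the correlation undefined). Function names are illustrative.

```python
import math


def phase_space_distance(cycle_i, cycle_j):
    """Minimum over offsets m of the mean absolute difference between cycles."""
    short, long_ = sorted((cycle_i, cycle_j), key=len)
    n = len(short)
    return min(
        sum(abs(short[k] - long_[k + m]) for k in range(n)) / n
        for m in range(len(long_) - n + 1)
    )


def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def correlation_measure(cycle_i, cycle_j):
    """(rho + 1) / 2, where rho is the best correlation over offsets m."""
    short, long_ = sorted((cycle_i, cycle_j), key=len)
    n = len(short)
    rho = max(
        pearson(short, long_[m:m + n]) for m in range(len(long_) - n + 1)
    )
    return (rho + 1) / 2
```

Note that the two quantities point in opposite directions: a small phase space difference means similar cycles, whereas the correlation-based value is close to 1 for similar cycles.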
Time Series
Pseudoperiodic Similarity Network
[Figure: network produced by using the Phase Space Difference metric on a noisy sine wave with m = 60 cycles]
Time Series
Pseudoperiodic Similarity Network

What is a good value for the threshold?
Need to strike a balance between:
- Being large enough to preserve the local clustering properties of the network
- Being small enough not to obscure the local properties of the network by over-connecting the nodes
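The effect of the threshold can be explored by building the network at several values and watching how quickly it fills in. This sketch assumes a symmetric matrix of pairwise cycle distances (smaller means more similar) and uses edge density as a simple, assumed indicator of over-connection; neither choice comes from the talk.

```python
def threshold_network(distance, threshold):
    """Join cycles i and j whenever distance[i][j] <= threshold."""
    n = len(distance)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if distance[i][j] <= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def edge_density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    edges = sum(len(neighbours) for neighbours in adj.values()) / 2
    return edges / (n * (n - 1) / 2)
```

A density near 0 suggests the threshold is too small to capture any clustering, while a density near 1 suggests the nodes are over-connected.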
Time Series
Pseudoperiodic Similarity Network
Inference on the network

1. Degree distribution
  - Degree distributions for time series of chaotic systems demonstrate multiple peaks due to the various unstable periodic orbits embedded in the chaotic attractor

2. Betweenness centrality:
\[
C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}
\]
where σ_st denotes the number of shortest paths from node s to node t and σ_st(v) denotes the number of shortest paths from s to t that pass through vertex v.
  - Betweenness centrality can predict the role which individual nodes play in complex systems – those with high C_B correspond to cycles between adjacent clusters in the network. Chaotic attractors have infinitely many unstable periodic orbits, so will contain more such nodes in their network than other types of dynamical systems.
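The betweenness formula can be evaluated directly from shortest-path counts. The sketch below assumes an unweighted, undirected graph stored as a dict of neighbour sets and sums over ordered pairs (s, t), so each unordered pair is counted twice; it favours clarity over speed and is not the faster Brandes algorithm.

```python
from collections import deque


def bfs_counts(adj, s):
    """Return (dist, sigma): distances from s and numbers of shortest paths."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness(adj):
    """C_B(v) = sum over ordered pairs s != v != t of sigma_st(v) / sigma_st."""
    info = {v: bfs_counts(adj, v) for v in adj}
    result = {}
    for v in adj:
        dist_v, sigma_v = info[v]
        total = 0.0
        for s in adj:
            if s == v or v not in info[s][0]:
                continue
            dist_s, sigma_s = info[s]
            for t in adj:
                if t in (s, v) or t not in dist_s or t not in dist_v:
                    continue
                # v lies on a shortest s-t path iff the distances add up.
                if dist_s[v] + dist_v[t] == dist_s[t]:
                    total += sigma_s[v] * sigma_v[t] / sigma_s[t]
        result[v] = total
    return result
```

Halving the result gives the value for unordered pairs, which is the more common convention for undirected graphs.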
Time Series
Pseudoperiodic Similarity Network

3. Classification
  - Betweenness centrality and assortativity (the preference for high-degree vertices to attach to other high-degree vertices) are shown by Zhang et al. to be excellent tools in classifying time series (e.g. healthy vs arrhythmia heart patients)

Statistic                 Healthy Patient   Arrhythmia Patient
Betweenness Centrality    0.124             0.049
Assortativity             0.674             0.208
Correlation Dimension     1.845             1.903
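Assortativity can be measured as the Pearson correlation between the degrees at the two ends of each edge. The sketch below uses that common definition for an undirected graph stored as a dict of neighbour sets; whether Zhang et al. computed exactly this quantity is not stated in the talk, so treat it as an assumption.

```python
def degree_assortativity(adj):
    """Pearson correlation of end-point degrees over all edges.

    Each undirected edge is visited in both directions, so both ends share
    the same mean and variance. Returns nan when all degrees are equal.
    """
    xs, ys = [], []
    for u, neighbours in adj.items():
        for v in neighbours:
            xs.append(len(adj[u]))
            ys.append(len(adj[v]))
    mean = sum(xs) / len(xs)
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        return float("nan")
    return cov / var
```

Positive values indicate that high-degree nodes tend to attach to other high-degree nodes; a star graph, where the hub attaches only to leaves, sits at the opposite extreme.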
Time Series
Pseudoperiodic Similarity Network
The networks for the ECG examples:
Stochastic Processes
Stochastic processes may have finite or infinite Markov-Einstein (ME) coherence length:

Definition
The Markov-Einstein coherence length of a stochastic process is the shortest time interval over which the process may be considered to be a Markov process

I leave aside the problem of finding the ME timescale, and consider only stochastic processes with a finite ME timescale.
Stochastic Processes

Constructing the network:
- Determine the state space of the stochastic process, grouping states if necessary to form a discrete, finite state space. Each state will correspond to a node in the network.
- If the process is not already Markov, sample the original stochastic process on the ME timescale.
- For each pair of successive sample points of the stochastic process, join the respective nodes with a directed edge.

[Figure: a sample path plotted as State (0 to 10) against Time (0 to 30), and the resulting network on nodes 0 to 10]
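The construction amounts to a single pass over the sampled process. In this sketch the states are formed by equal-width binning and each directed edge is weighted by how often the transition occurs; both the binning rule and the count-based weights are assumptions, since the talk does not specify them.

```python
from collections import Counter


def discretise(value, low, high, n_states):
    """Map a value in [low, high] to one of n_states equal-width bins."""
    index = int((value - low) / (high - low) * n_states)
    return min(max(index, 0), n_states - 1)


def transition_network(states):
    """Count directed edges (state at step k, state at step k + 1)."""
    return Counter(zip(states, states[1:]))
```

For example, `transition_network([0, 1, 1, 2, 0, 1])` records the edge from state 0 to state 1 twice and every other observed transition once.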
Stochastic Processes
The authors convert this construction into a weighted network, and consider various statistics:

Statistic      White Noise   DAX Index   Jet Engine Turbulence
Mean weight    1.0498        1.1044      3.820
Clustering     0.001         0.013       0.038
Diameter       2             2           15

\[
\text{Mean weight} = \frac{\sum_{i<j} w_{i,j}}{\text{Number of edges with weight} > 0},
\qquad
\text{Clustering} = \frac{\sum_{j,k} w_{i,j}\, w_{j,k}\, w_{k,i}}{\sum_{j,k} w_{i,j}\, w_{k,i}}
\]
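The two formulas can be evaluated on a dictionary of edge weights. This sketch makes two assumptions that are not in the talk: weights are stored as a dict keyed by (i, j) pairs, and the restriction i < j in the mean weight is read as "count each stored edge once". The clustering sum is taken literally over all j and k for a given node i.

```python
def mean_weight(w):
    """Total weight divided by the number of edges with positive weight."""
    positive = [x for x in w.values() if x > 0]
    return sum(positive) / len(positive)


def clustering(w, i):
    """Weighted clustering of node i, following the formula term by term."""
    nodes = {n for edge in w for n in edge}
    numerator = denominator = 0.0
    for j in nodes:
        for k in nodes:
            w_ij = w.get((i, j), 0)
            w_ki = w.get((k, i), 0)
            numerator += w_ij * w.get((j, k), 0) * w_ki
            denominator += w_ij * w_ki
    return numerator / denominator if denominator else 0.0
```

Averaging `clustering(w, i)` over all nodes i would give a single network-level figure comparable to the table, although the talk does not say how the per-node values were aggregated.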
Evaluation
Recurrence Plot Method

- Allows for a straightforward classification of dynamical systems over and above that offered by a sketch of the trajectory
- Highly visual technique
- Once the trajectory has been found, the algorithm runs in polynomial time...
- ...but finding the trajectory may be highly nontrivial
- Only partitions dynamical systems into large classes
- Although the node number denotes temporal ordering, this never plays a role in calculating network statistics
- Drawing the network “throws away” a large amount of information from the original recurrence plot matrix
Evaluation
Visibility Graph Method

- Invariant under numerous transformations
- Effectively classifies different types of time series
- Makes inference about Hurst exponent – a very difficult problem!
- Computationally intensive – determining visibility requires O(n^3) computations
- Unclear exactly what “visibility” relates to in a theoretical sense
Evaluation
Pseudoperiodic Similarity Network

- Effective at distinguishing between types of time series, and identifying similar time series within a specific type
- Different choices of metric available
- Threshold parameters can be chosen according to specific needs
- Recently, Yang and Yang have proposed an algorithm which (although slow to operate) allows time series which are not pseudoperiodic to be examined
- It is nontrivial to identify a “cycle” in an unknown time series (e.g. using local minima won’t provide a meaningful answer on an ECG plot)
- Lengthy to compute due to optimization within distance calculations
- Could we not get similarly useful statistics from a spectral analysis?
Evaluation
Stochastic Processes

- Extremely simple to generate network – a single pass over the data is sufficient
- Can be used to generate synthetic replications of the stochastic process, which can be useful for predictions
- Network statistics seem to provide good indication of time series behaviour
- Does the “network” setting really provide new understanding?
- It’s difficult to choose the number of nodes correctly. Too few and little insight is gained about the process, too many and the network is too sparse to draw any statistically significant conclusions
Open Problems
1. Is there a new set of network statistics which takes node number (and thus temporal ordering) into consideration?
2. Could the dynamical systems method provide any insight into the bifurcation behaviour?
3. For the dynamical systems and pseudoperiodic time series approaches, could weighted networks provide more insight? At the moment, the methods essentially rely on thresholding.
4. What are these new methods really offering over and above existing techniques? Are we essentially abandoning years of theory and restarting with little more than empirical science?
Thanks
Mason Porter
Sam Howison
Stacy Williams, Mark McDonald and Dan Fenn
Ben Fulcher
HSBC Bank
References
[J Eckmann, S Oliffson Kamphorst and D Ruelle, “Recurrence plots of dynamical systems”, Europhysics Letters 4, (1987), 973]
[L Lacasa et al., “From time series to complex networks: the visibility graph”, Proceedings of the National Academy of Sciences, 105 (13), (2008), 4972]
[A Shirazi et al., “Mapping stochastic processes onto complex networks”, Journal of Statistical Mechanics (2009), P07046]
[M Small, J Zhang and X Xu, “Transforming time series into complex networks”, Complex Sciences (2009), pp 2078–2089]
[X Xu, J Zhang and M Small, “Superfamily phenomena and motifs of networks induced from time series”, Proceedings of the National Academy of Sciences, 105 (50), (2008), 19601]
[J Zhang et al., “Characterizing pseudoperiodic time series through complex network approach”, Physica D, 237 (22), (2008), pp 2856–2865]
[J Zhang and M Small, “Complex network from pseudoperiodic time series: topology versus dynamics”, Physical Review Letters, 96 (23) (2006), 238701]