Graph Frequency Analysis of Brain Signals - arXiv · 2016-05-04 · 1 Graph Frequency Analysis of Brain Signals Weiyu Huang, Leah Goldsberry, Nicholas F. Wymbs, Scott T. Grafton,

1

Graph Frequency Analysis of Brain SignalsWeiyu Huang, Leah Goldsberry, Nicholas F. Wymbs, Scott T. Grafton, Danielle S. Bassett and Alejandro Ribeiro

Abstract—This paper presents methods to analyze functionalbrain networks and signals from graph spectral perspectives.The notion of frequency and filters traditionally defined forsignals supported on regular domains such as discrete time andimage grids has been recently generalized to irregular graphdomains, and defines brain graph frequencies associated withdifferent levels of spatial smoothness across the brain regions.Brain network frequency also enables the decomposition of brainsignals into pieces corresponding to smooth or rapid variations.We relate graph frequency with principal component analysiswhen the networks of interest denote functional connectivity. Themethods are utilized to analyze brain networks and signals assubjects master a simple motor skill. We observe that brainsignals corresponding to different graph frequencies exhibitdifferent levels of adaptability throughout learning. Further, wenotice a strong association between graph spectral properties ofbrain networks and the level of exposure to tasks performed,and recognize the most contributing and important frequencysignatures at different task familiarity.

Index Terms—Functional brain network, network theory,graph signal processing, fMRI, motor learning, filtering

I. INTRODUCTION

The study of brain activity patterns has proven valuablein identifying neurological disease and individual behavioraltraits [1]–[4]. The use of functional brain networks describingthe tendency of different regions to act in unison has provencomplementary in the analysis of similar matters [5]–[8]. It isnot surprising that signals and networks prove useful in similarproblems since the two are closely related. In this paper weadvocate an intermediate path in which we interpret brain ac-tivity as a signal supported on the graph of brain connectivity.We show how the use of graph signal processing (GSP) toolscan be used to glean information from brain signals using thenetwork as an aid to identify patterns of interest. The benefitsof incorporating network information into signal analysis hasbeen demonstrated in multiple domains. Notable examples ofapplications include video compression [9], rating predictionsin recommendation systems [10], and breast cancer diagnostics[11], and semi-supervised learning [12].

The fundamental GSP concepts that we utilize to exploitbrain connectivity in the analysis of brain signals are thegraph Fourier transform (GFT) and the corresponding notions

Supported by NSF CCF-1217963, PHS NS44393, ARO ICB W911NF-09-0001, ARL W911NF-10-2-0022, ARO W911NF-14-1-0679, NIH R01-HD086888, NSF BCS-1441502, NSF BCS-1430087 and the John D. andCatherine T. MacArthur and Alfred P. Sloan foundations. Content does notnecessarily represent official views of any of the funding agencies. W. Huang,L. Goldsberry, D. S. Bassett, and A. Ribeiro are with the Dept. of Electricaland Systems Eng., University of Pennsylvania. N. F. Wymbs is with theDept. of Physical Medicine and Rehabilitation, Johns Hopkins University.S. T. Grafton is with the Dept. of Psychological and Brain Sciences,University of California at Santa Barbara. Email: [email protected],[email protected], [email protected], [email protected],[email protected], and [email protected].

of graph frequency components and graph filters. These con-cepts are generalizations of the Fourier transform, frequencycomponents, and filters that are used in regular domains suchas time and spatial grids [13]–[15]. As such, they permitthe decomposition of a graph signal into components thatrepresent different modes of variability. We can define lowgraph frequency components representing signals that changeslowly with respect to brain connectivity networks in a welldefined sense and high graph frequency components repre-senting signals that change fast in the same sense. This isimportant because low and high temporal variability haveproven important in the analysis of neurological disease andbehavior [16], [17]. GFT based decompositions permit asimilar analysis of variability across regions of the brain for afixed time – a sort of spatial variability measured with respectto the connectivity pattern. We demonstrate here that it isuseful in a similar sense; see e.g. Figs. 5, 6, 7, and 9.

The GSP studies in this paper are related to principalcomponent analysis (PCA), which has been used with successin the analysis of brain signals [18], [19]. The differencewith the GSP analysis we present here is that PCA implicitlyassumes the brain network to be a correlation matrix and thesignals to be drawn from a stochastic model. More importantly,whereas the GFT can be used to, e.g., decompose the signalinto low, medium, and high frequency components, PCA ismostly utilized for dimensionality reduction; which in thelanguage of this paper is tantamount to analyzing a few lowgraph frequency components. Another important difference isthat PCA focuses on identifying variability across differentrealizations of brain signals, but the GFT identifies spatialvariability of a single realization. GSP is also related to thespectral analysis of networks in general and Laplacians inparticular [20], [21]. The difference in this case is that thesespectral analyses yield properties of the networks. In GSPanalyses, the network provides an underlying structure, butthe interest is on signals expressed on this stratum.

The goal of this paper is to introduce GSP notions that canbe used to analyze brain signals and to demonstrate their valuein identifying patterns that appear when monitoring activity assubjects learn to perform a visual-motor task. Specifically, thecontributions of this paper are: (i) To explain tools from theemerging field of GSP and show how they can be applied inanalyzing brain signals. (ii) To evaluate the graph spectrumof brain functional network and to define artificial networkconstruction methods that replicate the features of the graphspectrum of functional networks. (iii) To examine the temporalvariation of brain signals corresponding to different graphfrequencies when participants perform visual-motor learningtasks. (iv) To investigate the contribution of brain signalsassociated with different graph frequencies to the learningsuccess at different stages of visual-motor learning.

arX

iv:1

512.

0003

7v2

[q-

bio.

NC

] 3

May

201

6

2

We begin the paper with the introduction of basic notionsof graphs and graph signals. Particular emphasis goes into thedefinition of the graph Fourier transform and the interpretationof graph frequency components as different modes of spatialvariability measured with respect to the brain network (SectionII-A). We also introduce the notion of graph filters anddiscuss the interpretation of a graph low-pass filter as a localaveraging operation (Section II-B). Band-pass and high-passfilters that are used in later sections to separate signals intodifferent components are also introduced. We point out that thediscussion here is more extensive than necessary for readersfamiliar with GSP. The goal is to make the paper accessibleto readers that are not necessarily familiar with the subject.

We then move on to describe two different experimentsinvolving the learning of different visual-motor tasks by dif-ferent sets of participants (Section III). The graph frequencyof functional brain networks for the participants is visualizedand analyzed (Section IV), which is found to be associatedacross scan sessions in the same dataset and across datasets.In specific, we find that graph high frequencies of functionalnetworks concentrate on visual and sensorimotor modules ofthe brain – the two brain areas well-known to be associatedwith motor learning [22], [23]. This motivates us to con-sider graph frequencies besides low frequency components,whereas the PCA-oriented approaches has been focusing onlow frequencies. We also describe the construction of a simplemodel to establish artificial networks with a few networkdescriptive parameters (Section IV-A). We observe that themodel is able to mimic the properties of actual functional brainnetworks and we use them to analyze spectral properties of thebrain networks (Section IV-B). The paper then utilizes graphfrequency decomposition to visualize and investigate brainactivities with different levels of spatial variation (Section V).It is noticed that the decomposed signals associated to differentgraph frequencies exhibit different levels of temporal variationthroughout learning (Section V-A). In particular, graph low andhigh frequency components exhibit higher temporal variationat multiple temporal scales in the two experiments considered.Because temporal variation has been shown to be associatedwith learning success [17], this implies graph low and high fre-quency components offer more contributions to learning suc-cess. Finally, we also define learning capabilities of subjects,and examine the importance of brain frequencies at differenttask familiarity by evaluating their respective correlation withlearning performance at different task familiarities (SectionVI). We find that it is good to have graph low frequencycomponents (smooth, spread, and cooperative brain signals)when we face an unfamiliar task. When we become highlyfamiliar with the task, it is better and favors further learningto have graph high frequency components (varied, spiking, andcompetitive brain signals).

II. GRAPH SIGNAL PROCESSING

The interest of this paper is to study brain signals in whichwe are given a collection of measurements xi associated witheach cortical region out of n different brain regions. Anexample signal of this type is an fMRI reading in which xi

estimates the level of activity of brain region i. The collectionof n measurements is henceforth grouped in the vector signalx = [x1, x2, . . . , xn]

T ∈ Rn. A fundamental feature ofthe signal x is the existence of an underlying pattern ofstructural or functional connectivity that couples the values ofthe signal x at different brain regions. Irrespective of whetherconnectivity is functional or structural, our goal here is todescribe tools that utilize this underlying brain network toanalyze patterns in the neurophysiological signal x.

We do so by modeling connectivity between brain regionswith a network that is connected, weighted, and symmetric.Formally, we define a network as the pair G = (V,W), whereV = {1, . . . , n} is a set of n vertices or nodes representingindividual brain regions and W ∈ Rn×n collects weights ofedges in the network with wij ≥ 0 the weight of the edge(i, j). Since the network is undirected and symmetric we havethat wij = wji for all (i, j). The weights wij = wji representthe strength of the connection between regions i and j, or,equivalently, the proximity or similarity between nodes i andj. In terms of the signal x, this means that when the weightwij is large, the signal values xi and xj tend to be related.Conversely, when the weight wij is small, the signal valuesxi and xj are not directly related except for what is impliedby their separate connections to other nodes.

We adopt the conventional definitions of the degree andLaplacian matrices [24, Chapter 1]. The degree matrix D ∈Rn×n+ is a diagonal matrix with its ith diagonal element Dii =∑j wij denoting the sum of all the weights of edges out of

node i. The Laplacian matrix is defined as the difference L :=D−W ∈ Rn×n. The components of the Laplacian matrix areexplicitly given by Lij = −wij and Lii =

∑nj=1 wij . Observe

that the Laplacian is real, symmetric, diagonal dominant, andwith strictly positive diagonal elements. As such, the matrix Lis positive semidefinite. The eigenvector decomposition of Lis utilized in the following section to define the graph Fouriertransform and the associated notion of graph frequencies.

We note that brain networks, irrespective of whether theirconnectivity is functional [25] or structural [26], tend to bestable for a window of time, entailing associations betweenbrain regions during captured time of interest. Brain activitiescan vary more frequently, forming multiple samples of brainsignals supported on a common underlying network.

A. Graph Fourier Transform and Graph Frequencies

Begin by considering the set {λk}k=0,1,...,n−1 of eigenval-ues of the Laplacian L and assume they are ordered so that0 = λ0 ≤ λ1 ≤ . . . ≤ λn−1. Define the diagonal eigenvaluematrix Λ := diag(λ0, λ1 . . . , λn−1) and the eigenvector matrix

V := [v0,v1, . . . ,vn−1]. (1)

Because the graph Laplacian L is real symmetric, it acceptsthe eigenvalue decomposition

L = VΛVH , (2)

where VH represents the Hermitian (conjugate transpose)of the matrix V. The validity of (2) follows because theeigenvectors of symmetric matrices are orthogonal so that the

3

definition in (1) implies that VHV = I. The eigenvectormatrix V is used to define the Graph Fourier Transform ofthe graph signal x as we formally state next; see, e.g., [15].

Definition 1 Given a signal x ∈ Rn and a graph LaplacianL ∈ Rn×n accepting the decomposition in (2), the GraphFourier Transform (GFT) of x with respect to L is the signal

x = VHx. (3)

The inverse (i)GFT of x with respect to L is defined as

x = Vx. (4)

We say that x and x form a GFT transform pair.

Observe that since VVH = I, the iGFT is, indeed, theinverse of the GFT. Given a signal x we can compute theGFT as per (3). Given the transform x we can recover theoriginal signal x through the iGFT transform in (4).

There are several reasons that justify the association of theGFT with the Fourier transform. Mathematically, it is just amatter of definition that if the vectors vk in (2) are of the formvk = [1, ej2πk/n, . . . , ej2πk(n−1)/n]T , the GFT and iGFT inDefinition 1 reduce to the conventional time domain Fourierand inverse Fourier transforms. More deeply, it is not difficultto see that if the graph G is a cycle, the vectors vk in (2)are of the form vk = [1, ej2πk/n, . . . , ej2πk(n−1)/n]T . Sincecycle graphs are representations of discrete periodic signals,it follows that the GFT of a time signal is equivalent to theconventional discrete Fourier transform; see, e.g., [27].

An important property of the GFT is that it encodes a notionof variability akin to the notion of variability that the Fouriertransform encodes for temporal signals. To see this, definex = [x0, . . . , xn−1]

T and expand the matrix product in (4) toexpress the original signal x as

x =

n−1∑k=0

xkvk. (5)

It follows from (5) that the iGFT allows us to write thesignal x as a sum of orthogonal components vk in whichthe contribution of vk to the signal x is the GFT compo-nent xk. In conventional Fourier analysis, the eigenvectorsvk = [1, ej2πk/n, . . . , ej2πk(n−1)/n]T carry a specific notionof variability encoded in the notion of frequency. When k isclose to zero, the corresponding complex exponential eigen-vectors are smooth. When k is close to n, the eigenfunctionsfluctuate more rapidly in the discrete temporal domain. Inthe graph setting, the graph Laplacian eigenvectors provide asimilar notion of frequency. Indeed, define the total variabilityof the graph signal x with respect to the Laplacian L as

TV(x) = xHLx =∑i 6=j

wij(xi − xj)2, (6)

where in the second equality we expanded the quadratic form.It follows that the total variation TV(x) is a measure of howmuch the signal changes with respect to the network. For theedge (i, j), when wij is large we expect the values xi and xjto be similar because a large weight wij is encoding functionalsimilarity between brain regions i and j. The contribution of

their difference (xi − xj)2 to the total variation is amplifiedby the weight wij . If the weight wij is small, activities atbrain regions i and j tend to be uncorrelated, and thereforethe difference between the signal values xi and xj makes littlecontribution to the total variation. We can then think of a signalwith small total variation as one that changes slowly over thegraph and of signals with large total variation as those thatchange rapidly over the graph.

Consider the total variation of the eigenvectors vk and usethe facts that Lvk = λkvk and vHk vk = 1 to conclude that

TV(vk) = vHk Lvk = λk. (7)

It follows from (7) and the fact that the eigenvalues areordered as 0 = λ0 ≤ . . . ≤ λn−1, that the total variationsof the eigenvectors vk follow the same order. Combining thisobservation with the discussion following (6), we concludethat when k is close to 0, the eigenvectors vk vary slowlyover the graph, whereas for k close to n the eigenvaluesvary more rapidly. Therefore, from (5) we see that the GFTand iGFT allow us to decompose the brain signal x intocomponents that characterize different levels of variability.The GFT coefficients xk for small values of k indicate howmuch these slowly varying signals contribute to the observedbrain signal x. On the other hand, the GFT coefficients xk forlarge values of k describe how much rapidly varying signalscontribute to the observed brain signal x.

B. Graph Filtering and Frequency Decompositions

Given a graph signal x with GFT x we can isolate thefrequency components corresponding to the lowest KL graphfrequencies by defining the filtered spectrum xL = HLxsatisfying xLk = xk for k < KL and xLk = 0 otherwise. Thefilter HL can be written as the diagonal matrix HL = diag(hL)where the vector hL takes value 1 for frequencies smaller thanKL and is otherwise null,

hLk = I[k < KL

]. (8)

Utilizing the definitions of the GFT in (3) and the iGFT in (4),the spectral operation xL = HLx is equivalent to performingthe following operations in the graph vertex domain

xL := VxL = VHLx = VHLV−1x := HLx. (9)

The last equality in (9) defines the matrix HL := VHLV−1 sothat the graph signal xL associated with low graph frequenciesof x can be written as the product xL = HLx. Since the signalxL contains the low graph frequency components of x, we saythe matrix HL in (9) is a graph low-pass filter.

The filter HL := VHLV−1 admits an alternative repre-sentation as the expansion HL =

∑n−1k=0 hLkL

k in terms ofLaplacian powers [27]. The coefficients hLk in this expansionare elements of the vector hL = Ψ−1hL where Ψ is theVandermonde matrix defined by the eigenvalues of L, i.e.,

Ψ =

1 λ0 · · · λn−10...

.... . .

...1 λn−1 · · · λn−1n−1

. (10)

4

Since the eigenvalues are ordered in (10), the coefficients hLktend to be concentrated in small indexes k, and the expansionHL =

∑n−1k=0 hLkL

k is therefore dominated by small powersLk. From this fact it follows that we can think of the graphlow-pass filtered signal xL as resulting from a localized aver-aging of the elements of x. To understand this interpretation,simply note that L0x = x coincides with the original signal,Lx is an average of neighboring elements, L2x is an averageof elements in nodes that interact via intermediate commonneighbors, and, in general, Lkx describes interactions betweenk-hop neighbors. The fact that xL can be considered as asignal that follows from local averaging of x implies that xLhas smaller total variation than x and is consistent with theinterpretation of low graph frequencies presented in SectionII-A. We point out that the definition hL = Ψ−1hL assumesthe inverse matrix Ψ−1 exists. This holds true if the graphLaplacian does not have duplicate eigenvalues, which is thecase for all functional brain networks examined in the paper.

Other types of graph filters can be defined analogouslyto study interactions between signal components other thanthe local interactions captured in xL. Apart from the graphlow-pass filter HL, we also consider a graph band-pass filterHM and a graph high-pass filter HH, whose graph frequencyresponses are defined as

hMk = I[KL ≤ k < KL +KM

], (11)

hHk = I[KL +KM ≤ k

]. (12)

The definitions in (8), (11), and (12) are such that the low-pass filter takes the lowest KL graph frequencies, the band-pass filter captures the middle KM graph frequencies, and thehigh-pass filter the highest n − KL − KM frequencies. Thethree filters are defined such that the graph frequencies oftheir respective interest are mutually exclusive yet collectivelyexhaustive. As a result, if we use xM := HMx and xH := HHxto respectively denote the signals filtered by the band-pass andhigh-pass filters, we have that the original signal can be writtenas the sum x = xL + xM + xH. This gives a decompositionof x into low, medium, and high frequency componentswhich respectively represent signals that have slow, medium,and high variability with respect to the connectivity networkbetween brain regions. This decomposition is utilized in thispaper to analyze brain activity patterns associated with thelearning of visual-motor tasks.

III. BRAIN SIGNALS DURING LEARNING

We considered two experiments in which subjects learneda simple motor task [28], [29]. In the experiments, fourty-seven right-handed participants (29 female, 18 male; mean age24.13 years) volunteered with informed consent in accordancewith the University of California, Santa Barbara InternalReview Board. After exclusions for task accuracy, incompletescans, and abnormal MRI, 38 participants were retained forsubsequent analysis.

Twenty individuals participated in the first experimentalframework. The experiment lasted 6 weeks, in which therewere 4 scanning sessions, roughly at the start of the ex-periment, at the end of the 2nd week, at the end of the

Session 1 Session 2 Session 3 Session 4

MIN Sequences 50 110 170 230MOD Sequences 50 200 350 500EXT Sequences 50 740 1430 2120

Fig. 1. Relationship between training duration, intensity, and depth for thefirst experimental framework. The values in the table denote the number oftrials (i.e., “depth”) of each sequence type (i.e., “intensity”) completed aftereach scanning session (i.e., “duration”) averaged over the 20 participants.

4th week, and at the end of the experiment, respectively.During each scanning session, individuals performed a discretesequence-production task in which they responded to sequen-tially presented stimuli with their dominant hand on a customresponse box. Sequences were presented using a horizontalarray of 5 square stimuli with the responses mapped fromleft to right such that the thumb corresponded to the leftmoststimulus. The next square in the sequence was highlightedimmediately following each correct key press; the sequencewas paused awaiting the depression of the appropriate keyif an incorrect key was pressed. Each participant completed 6different 10-element sequences. Each sequence consists of twosquares per key. Participants performed the same sequences athome between each two adjacent scanning sessions, however,with different levels of exposure for different sequence types.Therefore, the number of trials completed by the participantsafter the end of each scanning session depends on the sequencetype. There are 3 different sequence types (MIN, MOD, EXT)with 2 sequences per type. The number of trials of eachsequence type completed after each scanning session averagedover the 20 participants is summarized in Fig. 1. Duringscanning sessions, each scan epoch involved 60 trials, 20 trialsfor each sequence type. Each scanning session contained atotal of 300 trials (5 scan epochs) and a variable number ofbrain scans depending on how quickly the task was performedby the specific individual.

Eighteen subjects participated in the second experimentalframework. The experiment had 3 scanning sessions spanningthe three days. Each scanning session lasted roughly 2 hoursand no training was performed at home between adjacentscanning sessions. Subjects responded to a visually cuedsequence by generating responses using the four fingers oftheir nondominant hand on a custom response box. Visualcues were presented as a series of musical notes on a pseudo-musical staff with four lines such that the top line of the staffmapped to the leftmost key pressed with the pinkie finger. Each12-note sequence randomly ordered contained three notes perline. Each training epoch involved 40 trials and lasted a totalof 245 repetition times (TRs), with a TR of 2,000 ms. Eachtraining session contained 6 scan epochs (240 trials) and lasteda total of 2,070 scan TRs.

In both experiments participants were instructed to respondpromptly and accurately. Repetitions (e.g., “11”) and regular-ities such as trills (e.g., “121”) and runs (e.g., “123”) wereexcluded in all sequences. The order and number of sequencetrials were identical for all participants. Participants completedthe tasks inside the MRI scanner for scanning sessions.

Reordering with fMRI was conducted using a 3.0 T Siemens

5

Fig. 3. Absolute magnitude at each of the n cortical structures averaged across participants in the 6 week experiment and averaged across all frequencycomponents in (a) the set of low graph frequencies {vk}KL−1

k=0 , (b) the set of middle graph frequencies {vk}KL+KM−1k=KL

, and (c) the set of high graphfrequencies {vk}n−1

k=KL+KM. (d)-(f) presents the average magnitudes for the 3 day experiment. Only brain regions with absolute magnitudes higher than a

fixed threshold (0.015) are colored. Magnitudes across the datasets are highly similar in the low and high graph frequencies (correlation coefficients 0.5818and 0.6616, respectively). The brain regions with high magnitude values significantly overlap with the visual and sensorimotor modules, in which more than60% of values greater than the threshold belong to the visual and sensorimotor modules.

Trio with a 12-channel phased-array head coil. For eachfunctional run, a single-shot echo planar imaging sequencethat is sensitive to blood oxygen level dependent (BOLD)contrast was utilized to obtain 37 (the first experiment) or 33(the second experiment) slices (3mm thickness) per repetitiontime (TR), an echo time of 30 ms, a flip angle of 90◦, afield of view of 192 mm, and a 64 × 64 acquisition matrix.Image preprocessing was performed using the Oxford Centerfor Functional Magnetic Resonance Imaging of the Brain(FMRIB) Software Library (FSL), and motion correction wasperformed using FMRIB’s linear image registration tool. Thewhole brain is parcellated into a set of n = 112 regions ofinterest that correspond to the 112 cortical and subcorticalstructures anatomically identified in FSL’s Harvard-Oxfordatlas. The choice of parcellation scheme is the topic ofseveral studies in structural [30], resting-state [31], and task-based [32] network architecture. The question of the mostappropriate delineation of the brain into nodes of a network isopen and is guided by the particular question one wants to ask.We use Harvard-Oxford atlas here because it is consistent withprevious studies of task-based functional connectivity duringlearning [28], [29]. The threshold in probability cutoff settingsof Harvard Oxford atlas parcellation is 0 so that no voxels wereexcluded.

For each individual fMRI dataset, we estimate regionalmean BOLD time series by averaging voxel time series in eachof the n regions. We evaluate the magnitude squared spectral

coherence [33] between the activity of all possible pairs ofregions to construct n × n functional connectivity matricesW. Besides, for each pair of brain regions i and j, we use t-statistical testing to evaluate the probability pi,j of observingthe measurements by random chance, when the actual dataare uncorrelated [34]. In the 3 day dataset, the value of allelements with no statistical significance (pi,j > 0.05) [35] areset to zero; the values remain unchanged otherwise. In the3 day experiment, a single brain network is constructed foreach participant. Thresholding is applied because the networksare for the entire span of the experiment and many entriesin W would be close to zero without threshold correction.In the 6 week experiment, due to the long duration of theexperiment, we build a different brain network per scanningsession, per sequence type for each subject. Because eachnetwork describes the functional connectivity for one trainingsession given a subject, not many entries will be removedeven in the presence of threshold correction; consequently, nothresholding is applied for the 6 week dataset. We normalizethe regional mean BOLD observations x(t) at any sample timet and consider x(t) = x(t)/‖x(t)‖2 such that the total energyof activities at all structures is consistent at different t to avoidextreme spikes due to head motion or drift artifacts in fMRI.

IV. BRAIN NETWORK FREQUENCIES

In this section, we analyze the graph spectrum brain net-works of the dataset considered. For the brain network W

6

Eigenvalue Index

20 40 60 80 100

Tota

l V

ariation

0

10

20

30

40

(a)Eigenvalue Index

20 40 60 80 100

Weig

hte

d Z

ero

Cro

ssin

gs

0

1000

2000

3000

4000

5000

6000

7000

(b)

Eigenvalue Index

20 40 60 80 100

Tota

l V

ariation

0

5

10

15

20

(c)Eigenvalue Index

20 40 60 80 100

Weig

hte

d Z

ero

Cro

ssin

gs

0

50

100

150

200

250

300

(d)

Fig. 2. (a) Total variation TV(vk) and (b) weighted zero crossings ZC(vk)of the graph Laplacian eigenvectors for the brain networks averaged acrossparticipants in the 6 week training experiment. (c) and (d) present the valuesfor the 3 day experiment. In both cases, the Laplacian eigenvectors associatedwith larger indexes vary more on the network and cross zero relatively moreoften, confirming the interpretation of the Laplacian eigenvalues as notionsof frequencies. Besides, note that total variation increases relatively linearlywith indexes.

of each subject, we construct its Laplacian L = D −W,and evaluate the total variation TV(vk) [cf. (7)] for eacheigenvector vk. Fig. 2 (a) and (c) plot the total variation of allgraph eigenvectors averaged across participants of the 6 weektraining experiment and 3 day experiment, respectively. In bothexperiments, the Laplacian eigenvectors associated with largerindexes fluctuate more on the network. Another observation isthat the total variation increases in linear scale with indexes.

Besides total variation, the number of zero crossings is usedas a measure of the smoothness of signals with respect to anunderlying network [15]. Since brain networks are weighted,we adapt a slightly modified version – weighted zero crossings– to investigate the given graph eigenvector vk

ZC(vk) =1

2

∑i 6=j

wijI {vk(i)vk(j) < 0} . (13)

In words, weighted zero crossings evaluate the weighted sumof the set of edges connecting a vertex with a positivesignal to a vertex with a negative signal. Fig. 2 (b) and(d) demonstrate the weighted zero crossings of all grapheigenvectors averaged across subjects of the 6 week and 3day experiments, respectively. The weighted zero crossingsincrease proportionally with graph frequency index k for0 ≤ k ≤ 100, as expected. However, eigenvectors associatedwith higher graph frequencies exhibit lower weighted zerocrossings when k is greater than 100.

It would be interesting to examine where the associatedeigenvectors lie anatomically, and the relative strength of theirvalues. To facilitate the presentation, we consider three sets ofeigenvectors, {vk}KL−1

k=0 , {vk}KL+KM−1k=KL

and {vk}n−1k=KL+KM,

and compute the absolute magnitude at each of the n corticaland subcortical regions averaged across participants and across

all graph frequencies belonging to each of the three sets. Fig.3 presents the average magnitudes for the two experimentalframeworks considered in the paper using BrainNet [36],where brain regions with absolute magnitudes lower than afixed threshold are not colored. Throughout the paper, theparameter KL is set as 40 and KM is set as 32. We chosethis combination because it yields three roughly equally-sizedcomponents with one piece corresponding to the 40 lowestgraph frequencies and another piece corresponding to the40 highest frequencies. The results presented in the paperare robust with the choice of parameters: we examined theresults for KL and KM in the range of 32 to 42 and foundsimilar observations as the ones presented. To demonstratethat, we also consider the parameters KL = 38, KM = 37and KL = 35, KM = 42. The average correlation coefficientbetween the absolute magnitude across brain regions of theparameters selected for this paper and the two chosen forrobustness is 0.8659. See Fig. 8 for another quantificationon the robustness of parameters. In this paper, we reportcorrelation coefficient as a quantification of similarity measurewhen we examine the similarity between two vectors. We alsoinvestigate cosine similarity and find high similarities as well.

A. Artificial Functional Brain Networks

An approach to analyze the complex networks is to de-fine a model to generate artificial networks [37]. The mainmotivation of an artificial network model is to use them toanalyze complex brain networks. Examples of such modelsinclude Erdos-Renyi model [38] for unweighted networks,Barabasi-Albert model for scale free network [39], and recentdevelopment and insights on weighted network models [37].Here we present a framework to construct artificial networksthat can be used to mimic the functional brain networks withonly a few parameter inputs. The model is related to weightedblock stochastic model [40], but involves more aspects likeindividual variance and links independent of their connectingregions. The output of the method would be a symmetricnetwork with edge weights between 0 and 1 without self-loops.

To begin, suppose the desired network has two clusters ofnodes V1 and Vo. The algorithm requires the average edgeweight µ1 for connections between nodes of the first clusterV1, average edge weight µo for links between nodes of theother cluster Vo, and average edge weight µ1o for inter-cluster connections. To reflect the fact that the edge weightson some links are independent of their joining vertices, foreach edge within V1, with probability pε < 1, its weightis randomly generated with respect to uniform distributionU [0, 1] between 0 and 1, and with probability 1−pε, its weightis randomly generated with respect to uniform distributionU [µ1−δ, µ1+δ]. The parameter pε determines the percentageof edges whose weights are selected irrespective of their actuallocations To further simulate the observation that differentparticipants may possess distinctive brain networks, if theedge weight is randomly generated from a uniform distributionU [µ1−δ, µ1+δ], it is then perturbed by wu ∼ U [−uε/2, uε/2]where uε controls the level of perturbation. The edge weightsfor connections within cluster Uo are generated similarly:

7

with probability 1 − pε, the edge weight is randomly chosenfrom the uniform distribution U [µo − δ, µo + δ] before beingcontaminated by wu ∼ U [−uε/2, uε/2]. The edge weightsfor connections between clusters U1 and Uo are formedanalogously using µ1o. The method presented here can beeasily generalized to analyze brain networks with more regionsof interest, i.e. by specifying sets of regions of interest andby detailing the expected correlation values on each type ofconnection between different regions.

Remark 1 At one extreme we can make each node i belong-ing to a different set Vi = {i}. Then the method requiresthe inputs of expected weights for all nodes, or alternativelyspeaking, the expected network. At the other extreme, thereis only one set of nodes V , and then the method is highlyakin to a network with edge weights completely randomlygenerated. Any construction of interest would have some priorknowledge regarding the community structure. Therefore, themethod proposed here can be used to see if the networkconstructed with the specific choice of community structurehighly simulate the key properties of the actual network, andcan be used to examine the evolution of community structurein the brain throughout the process to master a particular task.

B. Spectral Properties of Brain Networks

In this section, we analyze graph spectral properties of brainnetworks. Given the graph Laplacian, we examine the fluctu-ation of its eigenvectors on different types of connections inthe brain network [23]. More specifically, given an eigenvectorvk, its variation on the visual module is defined as

TVvisual(vk) =

∑i,j∈Vv,i6=j wij(vk(i)− vk(j))

2∑i,j∈Vv,i6=j wij

, (14)

where Vv denotes the set of nodes belonging to the visualmodule. The measure TVvisual(vk) computes the difference forsignals on the visual module for each unit of edge weight.To facilitate interpretation, we only consider three sets ofeigenvectors {vk}KL−1

k=0 , {vk}KL+KM−1k=KL

, and {vk}n−1k=KL+KM.

We then compute the visual module total variation TVLvisual

averaged over eigenvectors {vk}KL−1k=0 , and TVM

visual as wellas TVH

visual similarly. Besides TVvisual, we also examine thelevel of fluctuation of eigenvectors on edges within the motormodule, denoted as TVmotor, and on connections belonging tobrain modules other than the visual and motor module TVothers.Further, there are links between two separate brain modules,and to assess the variation of eigenvectors on those links, wedefine total variations between the visual and motor modules

TVvisual-motor(vk) =

∑i∈Vv,j∈Vm

wij(vk(i)− vk(j))2∑i∈Vv,j∈Vm

wij, (15)

where Vm denotes the set of nodes belonging to the motormodule. Total variations TVvisual-others between the visual andother modules, and total variations TVmotor-others between themotor and other modules are defined analogously. We choseto study visual and motor modules separately from other brainmodules because of their well-known associations with motorlearning [22], [23].

Fig. 4 (a) presents boxplots of the variation for eigenvectorsof different graph frequencies measured over different typesof connections across participants, at the start of the six weektraining. Despite that total variation of eigenvectors shouldincrease with their frequencies, the variation on the other mod-ule TVL

others of eigenvectors associated with low frequenciesare higher than TVH

others (pass t-test with p < 0.0001). Thisobservation is discussed in detail in Section IV-C.

Next we study how the graph spectral properties of brainnetworks evolve as participants become more familiar withthe tasks. Fig. 4 (b) illustrates the median of the variationfor eigenvectors of different graph frequencies measured overdifferent types of connections across subjects, at 10 differentlevels of exposure in the six week training. As participantsbecome more acquainted with the assignment, their brainnetworks display lower variation in the visual and motormodules and higher variation in the other modules for lowand middle graph frequencies, and the exact opposite is truefor high graph frequencies. The association with training inten-sity is statistically significant (average correlation coefficientr = 0.8164).

C. DiscussionFirstly, we examine why we see a decrease in zero crossings

of graph frequencies when k is greater than 100 in Fig. 2. Adetailed analysis shows this is because the functional brainnetworks are highly connected with nearly homogeneous de-gree distribution, and consequently each high graph frequencytends to have a value with high magnitude at one vertex ofhigh degree and similar values at other nodes, resulting in asmaller global zero crossings for eigenvectors associated withvery high frequencies.

Secondly, in terms of the visualization of graph frequenciesin Fig. 3, the most interesting finding relates to the eigenvec-tors associated with high graph frequencies. The magnitudesat different brain regions for high frequencies are significantlysimilar across the two datasets investigated (correlation coef-ficient 0.6616), and brain regions with high magnitude valuesare highly alike (greater than 60% overlap) to the visual andsensorimotor cortices [41]. This is likely to be a consequenceof the fact that visual and motor regions are more stronglyconnected with other structures, and hence an eigenvector witha high magnitude on visual or motor structures would resultin high global spatial variation. The eigenvectors of low graphfrequencies are more spread across the networks, resultingin low global variations. The middle graph frequencies areless interesting – the magnitudes at most regions (greaterthan 90%) do not pass the threshold, and little associations(correlation coefficient 0.3529) can be found between theeigenvectors of the 6 week and 3 day experiments.

Thirdly, to better interpret the meaning of variations forspecific types of connections, we construct artificial networksas described in Section IV-A with visual and motor modulesas regions of interest, and consider other modules to be brainregions other than visual and motor modules. We observe thatthere are three contributing factors that cause the variationwithin a specific module to become higher for higher eigenvec-tors and to become lower for lower eigenvectors: (i) Increases

8

Region

V M VM O VO MO0

0.01

0.02

0.03

To

tal V

aria

tio

n M

ag

nitu

de

Region

V M VM O VO MO0

0.01

0.02

0.03

Region

V M VM O VO MO0

0.01

0.02

0.03

0 5 10

To

tal V

aria

tio

n M

ag

nitu

de

0

0.01

0.02

0.03

Trial

Visual Motor Visual-Motor Other Visual - Other Motor - Other

0 5 100

0.01

0.02

0.03

Trial

0 5 100

0.01

0.02

0.03

Trial

Trial

0 5 100

0.01

0.02

0.03

To

tal V

aria

tio

n M

ag

nitu

de

Trial

0 5 100

0.01

0.02

0.03

Trial

0 5 100

0.01

0.02

0.03

Fig. 4. Spectral property of brain networks in the 6 week experiment. (a) Left: Averaged total variation of eigenvectors vk for 6 different types of connectionsof the brain averaged across all eigenvectors associated with low graph frequencies vk ∈ {vk}KL−1

k=0 , across all participants and scan sessions. Middle: Acrossall eigenvectors with middle graph frequencies vk ∈ {vk}KL+KM−1

k=KL. Right: Across all eigenvectors with high graph frequencies vk ∈ {vk}n−1

k=KL+KM. (b)

Median total variations of brain networks across participants of different scanning sessions and different sequence types with respect to the level of exposureof participants to the sequence type at the scanning session. Relationship between training duration, intensity, and depth is summarized in Fig. 1. Value of1 on the x-axis in the figures refers to minimum exposure to sequences (all 3 sequence types of the first session), and value of 10 on the x-axis denotesthe maximum exposure to sequences (EXT sequence types of the fourth session). An association between spectral property of brain networks and the levelof exposure is clearly observed (average correlation coefficient 0.8164). (c) Median total variations evaluated upon artificial networks. Spectral propertiesof actual brain networks can be closely simulated using a few parameters. The main text gives all correlation values for similarity between variance amongsubjects and between correlations of training intensity.

in the average edge weight for connections within the module,(ii) Increments in the average edge weight for links betweenthis module to other module, and (iii) Escalation in the averageedge weight for associations within the other module. Thiscan also be observed by analyzing closely the definition oftotal variation. If a module is highly connected, in order forthe eigenvector associated with a low graph frequency to besmooth on the entire network, it has to be smooth on thespecific module, resulting in a low value in the variation ofan eigenvector associated with a low graph frequency withrespect to the module of interest. Similarly, the increase in the

variation of connections between two modules, e.g. betweenvisual and other modules are resulted from: (i) The growth inthe average edge weight for connections between visual andother modules, or (ii) The augmentation of average weight forlinks within the other module. The graph spectral properties asin Fig. 4 (a) are observed because (i) visual and motor modulesare themselves highly connected, and (ii) visual module is alsostrongly linked with motor module.

Finally, in analyzing the evolution of graph spectral proper-ties as participants become more familiar with the tasks, fol-lowing the interpretations based on artificial network analysis,

9

Fig. 5. Distribution of decomposed signals for the 6 week experiment. (a) Absolute magnitudes for all brain regions with respect to xL – brain signalsvaring smoothly across the network – averaged across all sample points for each individual and across all participants at the first scan session of the 6 weekdataset. (b) With respect to xM and (c) with respect to xH – signals rapidly fluctuating across the brain. (d), (e), and (f) are averaged xL,xM and xH at thelast scan session of the 6 week dataset, respectively. Only regions with absolute magnitudes higher than a fixed threshold are colored.

Fig. 6. Distribution of decomposed signals for the 3 day experiment. (a), (b), and (c) are xL,xM and xH averaged across all sample points for each subjectand across participants in the 3 day experiment, respectively. Regions with absolute value less than a threshold are not colored.

this evolution in graph spectral properties of brain networksis mainly caused by the decrease in values of connectionswithin visual and motor modules and between the visual andmotor modules. An interesting observation is that the values inthe variation of eigenvectors associated with high frequenciesdecline with respect to the visual module much faster thanthat of motor module, even though the visual module is morestrongly connected throughout training compared to the motormodule. A deep analysis using artificial networks shows thatthis results from the following three factors: (i) Though morestrongly connected compared to the motor module, connec-tions within the visual module weaken very quickly, (ii) The

motor module is more closely connected with the other modulethan the link between the visual module to the other module,and (iii) Association levels within the other module stayrelatively constant. Therefore, as participants become moreexposed to the tasks, compared to the visual module, the motormodule becomes more strongly connected. The graph spectralproperties of actual brain networks and their evolution can beclosely imitated using artificial networks as plotted in Fig. 4(c). The artificial network created for our analysis best imitatedthe real brain networks with parameters pε of 0.10, uε of 0.10,and δ of 0.01. The average edge weights µ for visual (v), motor(m), other (o), and inter-connecting regions are µv = 0.6028,

10

Fig. 7. Temporal adaptations of spatial variations. Boxplots showing differences in temporal adaptabilities between brain activities with smooth (pink),moderate (red) and rapid (maroon) spatial variations, measured over the complete experiment (Left) and individual training sessions (Right) for 6 weekexperiment (Top) and 3 day experiment (Bottom). We measured the temporal adaptations using the variance of the averaged activities over the completeexperiment or with individual training sessions. Compared to activities with moderate spatial variations, smooth (95% sessions pass t-test with p < 0.01) andrapid (65% sessions pass t-test with p < 0.005) spatial variations have significantly higher temporal adaptations.

µm = 0.4902, µo = 0.3098, µvm = 0.3985, µvo = 0.3181,and µmo = 0.3271. The correlation coefficients of associationwith training intensity between real and artificial networksfor low, medium, and high graph frequencies are 0.6436,0.7187 and 0.8457, respectively. Additionally, the variationamong participants in real dataset can be highly mimickedusing artificial network model we proposed, with correlationcoefficients 0.9338, 0.9660, and 0.9486 for low, medium, andhigh graph frequencies, respectively. The analysis for the threeday training dataset is highly similar (correlation coefficients0.9834, 0.9186, and 0.9674 for low, medium, and high graphfrequencies, respectively) and for this reason we do not presentand analyze it separately here.

V. FREQUENCY DECOMPOSITION OF BRAIN SIGNALS

The previous sections focus on the study of brain networksand their graph spectral properties. In this section, we in-vestigate brain signals from a GSP perspective, and analyzethe brain signals by examining the decomposed graph signalsxL,xM, and xH with respect to the underlying brain networks.We compute the absolute magnitude of the decomposed signalxL for each brain region averaged across all sample signals foreach individual during a scan session and then averaged acrossall participants. Similar aggregation is applied for xM and xH.

Fig. 5 presents the distribution of the decomposed signalscorresponding to different levels of spatial variations for thefirst scan session (top row) and the last scan session (bottomrow) in the 6 week experiment. Fig. 6 exhibits how thedecomposed signals are distributed across brain regions inthe 3 day experiment. Brain regions with absolute magnitudeslower than a fixed threshold are not colored.

A. Temporal Variation of Graph Frequency Components

We analyze temporal variation of decomposed signals withrespect to different levels of spatial variations. To that end, weevaluate the variance of the decomposed signals over multipletemporal scales – over days and minutes [42] – for the twoexperiments. We describe the method specifically for xL forsimplicity and similar computations were conducted for xMand xH. At the macro timescale, we average the decomposedsignals xL for all sample points within each scanning sessionwith different sequence type, and evaluate the variance ofthe magnitudes of the signals [16] across all the scanningsessions and sequence types. For the 6 week experiment,there are 4 scanning sessions and 3 different sequence types,so the variance is with respect to 12 points. For the 3 dayexperiment, there are 3 scanning sessions and only 1 sequencetype, and therefore the variance is for 3 points. As for themicro or minute-scale, we average the decomposed signalsxL for all sample points within each minute, and evaluatethe variance of the magnitudes of the averaged signals acrossall minute windows for each scanning session with differentsequence types. The evaluated variance is then averaged acrossall participants of the experiment of interest.

Fig. 7 displays the variance of the decomposed signalsxL,xM and xH at two different temporal scales of the twoexperiments. For the 6 week dataset, 3 session-sequencecombinations, with the number proportional to the level ofexposure of participants to the sequence (1-MIN refers to MINsequence at session 1, 5 denotes MIN sequence at session 4, 9entails EXT sequence at session 3) are selected out of the 12combinations in total for a cleaner illustration, but all the othersession-sequence combinations exhibit similar properties.

11

B. Discussion

A deep analysis of Figs. 5 and 6 yields many interestingaspects of graph frequency decomposition. First, for xL, themagnitudes on adjacent brain regions tend to possess highlysimilar values, resulting in a more evenly spread brain signaldistribution, where as for xH, neighboring signals can exhibithighly dissimilar values; this corroborates the motivation touse graph frequency decomposition to segment brain signalsinto pieces corresponding to different levels of spatial fluc-tuations. Second, decomposed signal of a specific level ofvariations, e.g. xL and xH, are highly similar with respectto different scan sessions in an experiment as well as withrespect to two experiments with different sets of participants.The correlation coefficient between datasets for low, mediumand high graph frequencies are 0.5841, 0.2852, and 0.6469, re-spectively. This reflects the fact that frequency decompositionis formed by applying graph filters with different pass bandsupon signals and therefore should express some consistentaspects of brain signals. Third, recall that we normalize thebrain signals at every sample point for all subjects, and for thisreason signals xL,xM and xH would be similarly distributedacross the brain if nothing interesting happens at the decom-position. However, in both Fig.s 5 and 6, it is observed thatmany brain regions possess magnitudes higher than a thresholdin xL (∼ 60% pass) and xH (∼ 20% pass) while not manybrain regions pass the thresholding with respect to xM (∼ 3%pass). It has long been understood that the brain combinessome degree of disorganized behavior with some degree ofregularity and that the complexity of a system is high whenorder and disorder coexist [43]. xL varies smoothly acrossthe brain network and therefore can be regarded as regularity(order), whereas xH fluctuates rapidly and consequently can beconsidered as randomness (disorder). This evokes the intuitionthat graph frequency decomposition segments a brain signal xinto pieces xL and xH, which reflect order and disorder (andare therefore more interesting), as well as the remaining xM.

For the variance analysis, it is expected for the low graphfrequency components (smooth spatial variation) to exhibitthe smallest temporal variations, exceeded by medium andthen high counterparts. Nonetheless, it is observed that brainactivities with smooth spatial variations exhibit the mostrapid temporal variation. Because it has been shown thattemporal variation of observed brain activities is associatedwith better performance in tasks [16], this indicates a strongercontribution of low graph frequency components during thelearning process. Furthermore, since the measurements werenormalized such that the total energy of overall brain activitiesstayed constant at different sampling points, the rapid temporalchanges of low graph frequency components should be accom-panied by fast temporal variation of some other components,which are found to be high frequency components in all cases.Because these results were consistent for all of the temporalscales and datasets that we examined, and the associationbetween temporal variability and positive performance hasbeen established [17], we concluded that brain activities withsmooth or rapid spatial variations offer greater contributionsduring learning. The graph frequency signatures at different

‖xL‖2 ‖xM‖2 ‖xH‖26 week experiment (linear scale) −0.3155 0.0897 0.4125

6 week experiment (logarithm scale) −0.5409 0.3992 0.35653 day experiment −0.9873 0.8443 0.9605

Fig. 8. Pearson correlation coefficients between the number of trials (levelof task familiarity) and R values, defined as correlations between learningrate parameters and the norm of the decomposed signal of interest. Moreobvious adaptability between decomposed signals and learning across trainingis observed for xL and xH, with decreasing association with exposure to tasksfor the former and increasing importance for the latter.

stages of learning is analyzed in the next section.

VI. FREQUENCY SIGNATURES OF TASK FAMILIARITY

Given that the decomposed signals exhibit interesting per-spectives, it is natural to probe whether the signals corre-sponding to different levels of spatial variations associate withlearning. To that end, we first describe how learning rate isevaluated. Given a participant, for each sequence completed,we defined the movement time M as the difference betweenthe time of the first button press and the time of the lastbutton press during a single sequence. We then estimate theparticipant’s learning rate by fitting an exponential function(plus a constant) using the robust outlier correction [44] tothe sequence of movement times M

M = c1et/κ + c2. (16)

where t is a sequence representing the time index, κ is theexponential drop-off parameter (which we call the “learningrate parameter”) used to describe the early and fast rate ofimprovement, and c1 and c2 are nonnegative constants. Theirsum c1 + c2 is an estimation of the starting speed of theparticipant of interest prior to training, while the parameterc2 entails the fastest speed to complete the sequence attainedby that participant after extended training. A negative valueof κ indicates a decrease in movement time M(t), whichis thought to indicate that learning is occurring [45]. Wechose exponential because it is viewed as the most statisticallyrobust choice [46]. Further, the approach that we used has theadvantage of estimating the rate of learning independent ofinitial performance or performance ceiling.

We evaluate the learning rate for all participants at eachscanning session, and then compute the correlation betweenthe norm ‖xL‖2 of the decomposed signal corresponding tolow spatial variation and the learning rates across subjects.The correlation (R value) between the norms ‖xM‖2 as wellas ‖xH‖2 and learning rates are also calculated. Fig. 9 plotsthe Pearson correlation coefficients at all scanning sessions ofthe two experiments considered. The horizontal axis denotesthe level of exposure of participants to the sequence – whichday in the 3 day experiment and how many number of trialsparticipants have completed at the end of the scanning sessionin the 6 week experiment. Points are densely distributed forsmall number of trials in the 6 week experiment, so to mitigatethis effect, we also plot the points by taking the logarithmof numbers of trials completed. We emphasize that due tonormalization at each sampling point, the correlation values

12

Number of Trials0 500 1000 1500 2000 2500

-0.5

0

0.5R

Val

ue

Number of Trials0 500 1000 1500 2000 2500

-0.5

0

0.5

Number of Trials0 500 1000 1500 2000 2500

-0.5

0

0.5

25 100 1000

-0.5

0

0.5

Number of Trials

R V

alue

25 100 1000

-0.5

0

0.5

Number of Trials25 100 1000

-0.5

0

0.5

Number of Trials

1 2 3

-0.5

0

0.5

Day

R V

alue

1 2 3

-0.5

0

0.5

Day1 2 3

-0.5

0

0.5

Day

Fig. 9. Scatter plots depicting the number of trials (level of task familiarity) and R values, defined here as correlations between learning rate parametersand the norm of the decomposed signal of interest (Pink points in the Left: xL, Red points in the Middle: xM, and Maroon points in the Right: xH). Toprow: 6 week experiment with number of trials described in linear scale. Middle row: 6 week experiment withe number of trials evaluated in logarithm scale.We examine 6 week experiment by ordering the number of trials in both linear and logarithm scales to alleviate the fact that number of trials are denselydistributed towards small values. Bottom row: 3 day experiment with scanning session ordered by the number of days in the experiment.

would all be 0 if graph frequency decomposition segmentsbrain signals into three equivalent pieces. There are scansessions where the correlation is of particular interest, howeverthe most noteworthy observation is the change of correlationvalues with the level of exposure of participants. For xLcorresponding to smooth spatial variation, its correlation withlearning is above zero (≈ 0.25) at the start of the trainingwhen participants perform the task for the first time. Thecorrelation gradually decreases with heavier training intensityuntil below zero (≈ −0.25) at the end of the experimentwhen individuals are highly familiar with the sequence. For xHcorresponding to vibrant spatial variation, its correlation withlearning is below zero (≈ −0.2) at the start of the training, andgradually increases throughout training until it is above zero(≈ 0.25) at the end of the experiment – the exact opposite ofxL. For xM, correlation between its norm ‖xM‖2 with learningrate generally increases with the intensity of training, howeverthis trend is not as obvious compared to other decompositioncounterparts. The correlation between the number of trials andR value is summarized in Fig. 8. For robustness testing, weconduct similar analysis using the other two sets of parametersdescribed in Section IV. The average absolute difference forentries in Fig. 8 between the original parameters and the othertwo parameter choices is 0.0377. Again, similar observationsare found in different experiments involving different learningtasks and different sets of participants.

A. Discussion

This result further implies that the most association betweenlearning or adaptability during the training process comes fromthe brain signals that either vary smoothly (xL, regularity) orrapidly (xH, randomness) with respect to the brain network.Therefore, the graph frequency decomposition could be usedto capture more informative brain signals by filtering out non-informative counterparts, most likely associated with middlegraph frequencies. Besides, the positive association between‖xL‖2 and learning rates as well as the negative associationbetween ‖xH‖2 and learning rates at the start of trainingindicates that it favors learning to have more smooth, spread,and cooperative brain signals when we face an unfamiliartask. As we gradually become familiar with the task, thesmooth and cooperative signal distribution becomes less andless important, and there is a level of exposure when suchsignal distribution becomes destructive instead of constructive.We note that the task in the 3 day experiment is more difficultcompared to that of the 6 week experiment, and therefore thetime when the cooperative signal distribution starts to becomedetrimental (the point where the regression line intercepts thehorizontal line of R value equaling 0) is also comparable inthe two experiments, describing a certain level of familiarityto the task. When we become highly familiar with the task,it is better and favors further learning to have varied, spiking,and competitive brain signals.

In the dataset evaluated here, we utilize the average co-

13

herence between time series at pairs of brain cortical andsubcortical regions during the training as the network. Hence,a concentration of brain activities towards low graph fre-quencies would imply that activities on brain regions thatare generally cooperative are indeed similar. Simultaneously,the interpretation of concentration of brain activities towardshigh graph frequencies is that brain activities on brain regionsthat are generally cooperative are in fact dissimilar. In termsof learning, one possible explanation is that there are twodifferent stages in learning: we start by grasping the big pictureof the task to perform relatively well, and then we refine thedetails to perform better and to approach our limits.

Because the graph frequency analysis method presented inthis paper applies to any setting where signals are defined ontop of a network structure representing proximities betweennodes, it would be interesting in future to use this method toinvestigate other types of signals and networks in neuroscienceproblems. As an example, in situations given fMRI measure-ments on structural networks, concentration of signals in lowgraph frequency components would imply functional activitiesdo behave according to the structural networks.

Besides, it has been understood that learning is differentwhen one is unfamiliar or familiar with a particular task – itis easy to improve performance at first exposure due to the factthat one is far from their performance ceiling. It would there-fore be interesting to utilize graph frequency decompositionto further analyze the difference between learning scenarios atdifferent stages of familiarity, e.g. adaptability at first exposureand creativity when one fully understands the components ofthe specific tasks.

VII. CONCLUSION

We used graph spectrum methods to analyze functionalbrain networks and signals during simple motor learning tasks,and established connections between graph frequency withprincipal component analysis when the networks of interestdenote functional connectivity. We discerned that brain ac-tivities corresponding to different graph frequencies exhibitdifferent levels of adaptability during learning. Further, thestrong correlation between graph spectral property of brainnetworks with the level of familiarity of tasks was observed,and the most contributing frequency signatures at different taskfamiliarity was recognized.

REFERENCES

[1] H. Haken, Principles of brain functioning: a synergetic approach tobrain activity, behavior and cognition. Springer Science & BusinessMedia, 2013, vol. 67.

[2] M. D. Fox and M. E. Raichle, “Spontaneous fluctuations in brain activityobserved with functional magnetic resonance imaging,” Nature ReviewsNeuroscience, vol. 8, no. 9, pp. 700–711, 2007.

[3] J. Britz, D. Van De Ville, and C. M. Michel, “Bold correlates of eegtopography reveal rapid resting-state network dynamics,” Neuroimage,vol. 52, no. 4, pp. 1162–1170, 2010.

[4] L. J. Cole, M. J. Farrell, E. P. Duff, J. B. Barber, G. F. Egan, andS. J. Gibson, “Pain sensitivity and fmri pain-related brain activity inalzheimer’s disease,” Brain, vol. 129, no. 11, pp. 2957–2965, 2006.

[5] J. D. Medaglia, W. Huang, S. Segarra, C. Olm, J. Gee, M. Grossman,A. Ribeiro, C. T. McMillan, and D. S. Bassett, “Brain network effi-ciency is influenced by pathological source of corticobasal syndrome,”Neurology, vol. (submitted), 2016.

[6] S. Achard, R. Salvador, B. Whitcher, J. Suckling, and E. Bullmore, “Aresilient, low-frequency, small-world human brain functional networkwith highly connected association cortical hubs,” The Journal of neuro-science, vol. 26, no. 1, pp. 63–72, 2006.

[7] E. Bullmore and O. Sporns, “The economy of brain network organiza-tion,” Nature Reviews Neuroscience, vol. 13, no. 5, pp. 336–349, 2012.

[8] J. Richiardi, S. Achard, H. Bunke, and D. Van De Ville, “Machine learn-ing with brain graphs: predictive modeling approaches for functionalimaging in systems neuroscience,” Signal Processing Magazine, IEEE,vol. 30, no. 3, pp. 58–70, 2013.

[9] H. Q. Nguyen, P. Chou, Y. Chen et al., “Compression of human bodysequences using graph wavelet filter banks,” in Acoustics, Speech andSignal Processing (ICASSP), 2014 IEEE International Conference on.IEEE, 2014, pp. 6152–6156.

[10] J. Ma, W. Huang, S. Segarra, and A. Ribeiro, “Diffusion filtering forgraph signals and its use in recommendation systems,” in Proc. Int. Conf.Acoustics Speech Signal Process, vol. (submitted), Shanghai, China,March 20-25 2016.

[11] S. Segarra, W. Huang, and A. Ribeiro, “Diffusion and superpositiondistances for signals supported on networks,” IEEE Trans. Signal Inform.Process. Networks, vol. 1, no. 1, pp. 20–32, March 2015.

[12] A. Gadde, A. Anis, and A. Ortega, “Active semi-supervised learningusing sampling theory for graph signals,” in Proceedings of the 20thACM SIGKDD international conference on Knowledge discovery anddata mining. ACM, 2014, pp. 492–501.

[13] A. Sandryhaila and J. M. Moura, “Discrete signal processing on graphs,”IEEE Trans Signal Process, vol. 61, no. 7, pp. 1644–1656, 2013.

[14] ——, “Discrete signal processing on graphs: Frequency analysis,” IEEETrans Signal Process, vol. 62, no. 12, pp. 3042–3054, 2014.

[15] D. Shuman, S. K. Narang, P. Frossard, A. Ortega, P. Vandergheynstet al., “The emerging field of signal processing on graphs: Extendinghigh-dimensional data analysis to networks and other irregular domains,”IEEE Signal Process Mag, vol. 30, no. 3, pp. 83–98, 2013.

[16] D. D. Garrett, N. Kovacevic, A. R. McIntosh, and C. L. Grady, “Themodulation of bold variability between cognitive states varies by ageand processing speed,” Cerebral Cortex, p. bhs055, 2012.

[17] J. J. Heisz, J. M. Shedden, and A. R. McIntosh, “Relating brain signalvariability to knowledge representation,” Neuroimage, vol. 63, no. 3, pp.1384–1392, 2012.

[18] N. Leonardi, J. Richiardi, M. Gschwind, S. Simioni, J.-M. Annoni,M. Schluep, P. Vuilleumier, and D. Van De Ville, “Principal componentsof functional connectivity: a new approach to study dynamic brainconnectivity during rest,” NeuroImage, vol. 83, pp. 937–950, 2013.

[19] R. Viviani, G. Gron, and M. Spitzer, “Functional principal componentanalysis of fmri data,” Human brain mapping, vol. 24, no. 2, pp. 109–129, 2005.

[20] L. M. Harrison, W. Penny, J. Daunizeau, and K. J. Friston, “Diffusion-based spatial priors for functional magnetic resonance images,” Neu-roimage, vol. 41, no. 2, pp. 408–423, 2008.

[21] J. Shi and J. Malik, “Normalized cuts and image segmentation,” PatternAnalysis and Machine Intelligence, IEEE Transactions on, vol. 22, no. 8,pp. 888–905, 2000.

[22] J. A. Kleim, S. Barbay, N. R. Cooper, T. M. Hogg, C. N. Reidel, M. S.Remple, and R. J. Nudo, “Motor learning-dependent synaptogenesis islocalized to functionally reorganized motor cortex,” Neurobiology oflearning and memory, vol. 77, no. 1, pp. 63–77, 2002.

[23] D. S. Bassett, M. Yang, N. F. Wymbs, and S. T. Grafton, “Learning-induced autonomy of sensorimotor systems,” Nature neuroscience,vol. 18, no. 5, pp. 744–751, 2015.

[24] F. Chung, Spectral graph theory. American Mathematical Soc., 1997,vol. 92.

[25] W.-T. Zhang, Z. Jin, G.-H. Cui, K.-L. Zhang, L. Zhang, Y.-W. Zeng,F. Luo, A. C. Chen, and J.-S. Han, “Relations between brain networkactivation and analgesic effect induced by low vs. high frequency elec-trical acupoint stimulation in different subjects: a functional magneticresonance imaging study,” Brain research, vol. 982, no. 2, pp. 168–178,2003.

[26] J. Grutzendler, N. Kasthuri, and W.-B. Gan, “Long-term dendritic spinestability in the adult cortex,” Nature, vol. 420, no. 6917, pp. 812–816,2002.

[27] S. Segarra, A. G. Marques, G. Leus, and A. Ribeiro, “Reconstructionof graph signals through percolation from seeding nodes,” Trans.Signal. Process., vol. (submitted), 2015. [Online]. Available: http://arxiv.org/abs/1507.08364

[28] D. S. Bassett, N. F. Wymbs, M. P. Rombach, M. A. Porter, P. J. Mucha,and S. T. Grafton, “Task-based core-periphery organization of humanbrain dynamics,” PLoS Comput. Biol, vol. 9, no. 9, p. e1003171, 2013.

http://arxiv.org/abs/1507.08364

http://arxiv.org/abs/1507.08364

14

[29] D. S. Bassett, N. F. Wymbs, M. A. Porter, P. J. Mucha, J. M. Carlson,and S. T. Grafton, “Dynamic reconfiguration of human brain networksduring learning,” Proc Amer Philos Soc Natl Acad Sci USA, vol. 108,no. 18, pp. 7641–7646, 2011.

[30] A. Zalesky, A. Fornito, I. H. Harding, L. Cocchi, M. Yucel, C. Pantelis,and E. T. Bullmore, “Whole-brain anatomical networks: does the choiceof nodes matter?” Neuroimage, vol. 50, no. 3, pp. 970–983, 2010.

[31] J. Wang, L. Wang, Y. Zang, H. Yang, H. Tang, Q. Gong, Z. Chen,C. Zhu, and Y. He, “Parcellation-dependent small-world brain functionalnetworks: A resting-state fmri study,” Hum Brain Mapp, vol. 30, no. 5,pp. 1511–1523, 2009.

[32] J. D. Power, A. L. Cohen, S. M. Nelson, G. S. Wig, K. A. Barnes, J. A.Church, A. C. Vogel, T. O. Laumann, F. M. Miezin, B. L. Schlaggaret al., “Functional network organization of the human brain,” Neuron,vol. 72, no. 4, pp. 665–678, 2011.

[33] F. T. Sun, L. M. Miller, and M. D’Esposito, “Measuring interregionalfunctional connectivity using coherence and partial coherence analysesof fmri data,” Neuroimage, vol. 21, no. 2, pp. 647–658, 2004.

[34] Y. He, Z. J. Chen, and A. C. Evans, “Small-world anatomical networks inthe human brain revealed by cortical thickness from mri,” Cereb cortex,vol. 17, no. 10, pp. 2407–2419, 2007.

[35] C. R. Genovese, N. A. Lazar, and T. Nichols, “Thresholding of statis-tical maps in functional neuroimaging using the false discovery rate,”Neuroimage, vol. 15, no. 4, pp. 870 – 878, 2002.

[36] M. Xia, J. Wang, Y. He et al., “Brainnet viewer: a network visualizationtool for human brain connectomics,” PloS one, vol. 8, no. 7, p. e68910,2013.

[37] C. Lohse, D. S. Bassett, K. O. Lim, and J. M. Carlson, “Resolvinganatomical and functional structure in human brain organization: Iden-tifying mesoscale organization in weighted network representations,”2014.

[38] P. ERDdS and A. R&WI, “On random graphs i,” Publ. Math. Debrecen,vol. 6, pp. 290–297, 1959.

[39] R. Albert and A.-L. Barabasi, “Statistical mechanics of complex net-works,” Reviews of modern physics, vol. 74, no. 1, p. 47, 2002.

[40] C. Aicher, A. Z. Jacobs, and A. Clauset, “Learning latent block structurein weighted networks,” Journal of Complex Networks, p. cnu026, 2014.

[41] V. Braitenberg and A. Schuz, Cortex: statistics and geometry of neuronalconnectivity. Springer Science & Business Media, 2013.

[42] K. M. Newell, G. Mayer-Kress, S. L. Hong, and Y.-T. Liu, “Adaptationand learning: Characteristic time scales of performance dynamics,” HumMov Sci, vol. 28, no. 6, pp. 655–687, 2009.

[43] O. Sporns, Networks of the Brain. MIT press, 2011.[44] D. A. Rosenbaum, Human motor control. Academic press, 2009.[45] E. Dayan and L. G. Cohen, “Neuroplasticity subserving motor skill

learning,” Neuron, vol. 72, no. 3, pp. 443–454, 2011.[46] A. Heathcote, S. Brown, and D. Mewhort, “The power law repealed:

The case for an exponential law of practice,” Psychonomic bulletin &review, vol. 7, no. 2, pp. 185–207, 2000.

Documents

Graph Frequency Analysis of Brain Signals - arXiv · 2016-05-04 · 1 Graph Frequency Analysis of Brain Signals Weiyu Huang, Leah Goldsberry, Nicholas F. Wymbs, Scott T. Grafton,