
    U.S. stock market interaction network as learned by the Boltzmann Machine

    Stanislav S. Borysov,1, 2, ∗ Yasser Roudi,1, 3 and Alexander V. Balatsky1, 4

    1Nordita, KTH Royal Institute of Technology and Stockholm University, Roslagstullsbacken 23, SE-106 91 Stockholm, Sweden

    2Nanostructure Physics, KTH Royal Institute of Technology, Roslagstullsbacken 21, SE-106 91 Stockholm, Sweden

    3The Kavli Institute for Systems Neuroscience, NTNU, 7030 Trondheim, Norway
    4Institute for Materials Science, Los Alamos National Laboratory, Los Alamos, NM 87545, USA

    (Dated: September 10, 2015)

    We study the historical dynamics of the joint equilibrium distribution of stock returns in the U.S. stock market using the Boltzmann distribution model, parametrized by external fields and pairwise couplings. Within the Boltzmann learning framework for statistical inference, we analyze the historical behavior of the parameters inferred using exact and approximate learning algorithms. Since the model and inference methods require the use of binary variables, the effect of this mapping of continuous returns to the discrete domain is studied. The presented analysis shows that binarization preserves the market correlation structure. Properties of the distributions of external fields and couplings, as well as the industry sector clustering structure, are studied for different historical dates and moving window sizes. We find that a heavy positive tail in the distribution of couplings is responsible for the sparse market clustering structure. We also show that discrepancies between the model parameters might be used as a precursor of financial instabilities.

    PACS numbers: 89.65.Gh, 02.50.Tt, 64.70.kj

    I. INTRODUCTION

    Price formation on a financial market is a complex problem: It reflects the opinion of investors about the true value of the asset in question, the policies of producers, external regulation and many other factors. Given the large number of factors influencing price, many of which are unknown to us, describing price formation essentially requires probabilistic approaches. In the last decades, a synergy of methods from various scientific areas has opened new horizons in understanding the mechanisms that underlie related problems. One popular approach is to consider a financial market as a complex system, where not only a great number of constituents plays a crucial role but also their non-trivial interactions [1, 2]. For example, related interdisciplinary studies of complex financial systems have revealed their enhanced sensitivity to fluctuations and external factors near critical events [3–6], accompanied by an overall change of internal structure [7, 8]. This can be complemented by the research devoted to equilibrium [9–12] and non-equilibrium [13, 14] phase transitions.

    In general, statistical modeling of the state space of a complex system requires writing down the probability distribution over this space using real data. In a simple version of modeling, the probability of an observable configuration (state of a system) described by a vector of variables s can be given in the exponential form

    p(s) = Z−1 exp {−βH(s)} , (1)

    * [email protected]; Present address: Singapore-MIT Alliance for Research and Technology, 1 CREATE Way, #09-02, Create Tower, 138602 Singapore

    where H is the Hamiltonian of the system, β is the inverse temperature (β ≡ 1 is assumed in what follows) and Z is the partition function (statistical sum). The physical meaning of the model's components depends on the context; for instance, in the case of financial systems, s can represent a vector of stock returns and H can be interpreted as the inverse utility function [15]. Generally, H has parameters defined by its series expansion in s. Based on the maximum entropy principle [16, 17], the expansion up to quadratic terms is usually used, leading to pairwise interaction models. In the equilibrium case, the Hamiltonian has the form

    H(s) = −hᵀs− sᵀJs, (2)

    where h is a vector of size N of external fields and J is a symmetric N × N matrix of couplings (ᵀ denotes transpose). The model may also involve hidden states (nodes), which are not directly observable, but here we restrict ourselves to considering only the visible-node case.

    The energy-based models represented by Eq. (1) play an essential role not only in statistical physics but also in neuroscience (models of neural networks [18, 19]) and machine learning (where they are known as Boltzmann machines [20]). Recently, applications of pairwise interaction models to financial markets have also been explored [15, 21–23]. Given the topological similarities between neural and financial networks [24], these systems can be considered as examples of complex adaptive systems [25], which are characterized by the ability to adapt to a changing environment, trying to stay in equilibrium with it. From this point of view, market structural properties, e.g. clustering and networks [26–28], play an important role in the modeling of stock prices. Adaptation (or learning) in these systems implies a change of the parameters of H as financial and economic systems evolve.


    Using statistical inference for the model's parameters, the main goal is to have a model capable of reproducing statistical observables based on time series for a particular historical period. In the pairwise case, the objective is

    〈si〉data = 〈si〉model,
    〈sisj〉data = 〈sisj〉model,  (3)

    where i = 1, . . . , N and angular brackets denote statistical averaging over time.

    Having specified the general mathematical model, one can also discuss similarities between financial and infinite-range magnetic systems in terms of related phenomena, e.g. extensivity, order parameters, phase transitions, etc. These features can be captured even in the simplified case when si is a binary variable taking only two discrete values. The effect of the mapping to a binarized system, where the values si = +1 and si = −1 correspond to profit and loss respectively, is also studied in this paper. In this case, the diagonal elements of the coupling matrix, Jii, are zero because the si² = 1 terms do not contribute to the Hamiltonian. It is worth stressing that the current investigation develops ideas outlined in previous studies [15, 21–23] in that the effect of binarization, the comparison of learning algorithms, and the evolution and scaling properties of the parameter distributions are studied in a more systematic fashion.

    The paper is organized as follows. In Section II, basic statistical definitions and inference methods for h and J are presented. In Section III, the effect of binarization of stock returns and the historical evolution of the model parameters are discussed. Finally, the main findings of our investigation are summarized in Section IV.

    II. DATA AND METHODS

    We study the historical dynamics of the U.S. stock market using N = 71 stock price time series [29] from the Standard & Poor's 500 index (hereafter S&P 500) listed in Table I. We analyze discrete daily closing prices, Si(t), from 1990 to 2013 (5828 trading days), which are converted to logarithmic returns, s_i^raw(t) = ln[Si(t)/Si(t − 1)].
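    For concreteness, a minimal sketch of this preprocessing step follows (the array layout and function name are our own illustration, not part of the original analysis):

        import numpy as np

        def log_returns(prices):
            """Convert a (days, N) array of daily closing prices S_i(t) into
            logarithmic returns s_i^raw(t) = ln[S_i(t) / S_i(t-1)]."""
            prices = np.asarray(prices, dtype=float)
            return np.log(prices[1:] / prices[:-1])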

    A. Basic statistical analysis of financial time series

    FIG. 1. Historical dynamics of the mean raw and binary returns (a) and the amplitudes of their Discrete Fourier transforms (b). The distance between two labeled dates is 1000 trading days. A few major financial crises are highlighted with the light green background: (1) Asian and Russian crisis of 1997–1998, (2) dot-com bubble, (3) U.S. stock market downturn of 2002, (4) U.S. housing bubble, (5) bankruptcy of Lehman Brothers followed by the global financial crisis, (6) European sovereign debt crisis. The number in each panel shows the overall historical correlation between the corresponding series. Historical values of the average return, economic cycles and the frequency of crashes are preserved for the binary returns despite the maximum magnitude being bounded.

    The top panel in Fig. 1(a) shows historical data for the average stock return. In order to extract long-term trends from the time series, we employ a simple moving window (or simple moving average, SMA) approach. It acts like a low-pass filter, averaging out high-frequency components, which are usually related to noise. Within the SMA approach, the data is divided into chunks (or windows) of size T, assuming the time series to be stationary on this scale. In this case, the tth chunk corresponds to the T equally weighted previous values of daily log-returns (including the current one). For each set of chunks, one can calculate different statistical characteristics, such as the average

    〈si〉 = (1/T) Σ_{t′=0}^{T−1} si(t′),  (4)

    and the covariance matrix, C (N × N), with the elements

    Cij = 〈sisj〉 − 〈si〉〈sj〉,  (5)

    where T ≥ N is assumed for the covariance matrix to be positive definite. Hereinafter, angular brackets 〈 〉 denote averaging over time (historical values), while a bar ¯ denotes averaging over an index (vector or matrix elements). It is also possible to investigate nonlinear dependence between time series using more sophisticated statistical concepts [30] or nonlinear data transformations; however, only the simplest linear case is considered in the current paper. The series variance is the autocovariance, σi² ≡ Cii, where σ denotes the standard deviation, or volatility in finance. Usually, the SMA volatility serves as the simplest risk measure quantifying the stability of returns.


    In order to quantify deviation from the normal distribution, it is also useful to define higher-order moments, such as the skewness

    〈Skew(si)〉 = 〈((si − 〈si〉)/σi)³〉  (6)

    and the kurtosis (also known as the excess kurtosis)

    〈Kurt(si)〉 = 〈(si − 〈si〉)⁴〉/σi⁴ − 3,  (7)

    which both equal zero in the Gaussian case. Indeed, the SMA filter allows one to extract long-term trends in the market and, for a large value of N, various moments of the distribution of returns can be used to identify market crashes (Figs. 2 and 3). Henceforth, we will call the considered portfolio the "market", as it represents a large number of the top companies from the S&P 500 index. We also provide corresponding confidence intervals for these moments using a basic bootstrapping algorithm (sampling with replacement), depicted in Figs. 2 and 3. However, a more comprehensive discussion of moment estimates for non-stationary time series, besides moving window and correlated variable effects [31], would also require considering the nontrivial long-term memory effects and intraday correlations [32, 33] inherent to financial time series, which goes beyond the scope of the current investigation.
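    A possible implementation of the moving-window moments with a basic bootstrap confidence band is sketched below, assuming a (days, N) array of returns; the window handling and the number of resamples are illustrative choices, not taken from the paper:

        import numpy as np
        from scipy import stats

        def sma_moments(returns, T=250, n_boot=1000, seed=0):
            """First four moments of the mean market return inside each moving
            window of length T, plus a bootstrap 95% band for the window mean
            (sampling days with replacement within the window)."""
            rng = np.random.default_rng(seed)
            market = returns.mean(axis=1)          # average return over the N stocks
            moments, ci = [], []
            for t in range(T - 1, len(market)):
                w = market[t - T + 1 : t + 1]
                moments.append([w.mean(), w.std(ddof=1),
                                stats.skew(w), stats.kurtosis(w)])  # excess kurtosis
                boot_means = rng.choice(w, size=(n_boot, T)).mean(axis=1)
                ci.append(np.percentile(boot_means, [2.5, 97.5]))
            return np.array(moments), np.array(ci)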

    As mentioned in the Introduction, to reduce the amount of data required for our inference, we focus on a binarized version of the returns rather than on the raw data. We thus define the binarized returns as

    s_i^bin = sign(s_i^raw).  (8)

    This procedure also mitigates the scaling problem of multiple series, since it assigns the same value of stock return independently of its absolute value. Another common technique to deal with this problem is standardization,

    s_i^std = (s_i^raw − 〈s_i^raw〉)/σi.  (9)

    In this case, the correlation matrix, Q (N × N), is a normalized covariance matrix with the elements

    Qij = Cij/(σiσj).  (10)

    The standardization procedure is known to preserve market behavior and is widely used for the statistical analysis of financial time series [34]. The effects of the transformations defined by Eqs. (8)–(10) are shown in Figs. 1–3 and will be discussed further in Section III, where some simple statistical properties of the binarized returns are compared against the raw and standardized returns. Before this, however, we briefly describe the inference procedure.
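    A short sketch of the transformations of Eqs. (8)–(10), assuming a (days, N) NumPy array of raw log-returns (the treatment of exactly zero returns in the sign mapping is our own choice):

        import numpy as np

        def binarize(raw):
            """s_i^bin = sign(s_i^raw); days with exactly zero return are mapped to +1 here."""
            return np.where(raw >= 0.0, 1.0, -1.0)

        def standardize(raw):
            """s_i^std = (s_i^raw - <s_i^raw>) / sigma_i, column by column, Eq. (9)."""
            return (raw - raw.mean(axis=0)) / raw.std(axis=0)

        def covariance_and_correlation(x):
            """C_ij = <s_i s_j> - <s_i><s_j> and Q_ij = C_ij / (sigma_i sigma_j), Eqs. (5), (10)."""
            C = np.cov(x, rowvar=False, bias=True)
            sigma = np.sqrt(np.diag(C))
            return C, C / np.outer(sigma, sigma)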

    B. Equilibrium Boltzmann learning methods

    We harness inference methods based on maximizing the model's likelihood L(h, J | sdata). In the equilibrium case, exact learning of the parameters of the Hamiltonian [Eq. (2)] implies solving Eq. (3) in a self-consistent way, where the corrections δhi and δJij on each learning step can be calculated as

    δhi = ηh (〈si〉data − 〈si〉model),
    δJij = ηJ (〈sisj〉data − 〈sisj〉model).  (11)

    Here, ηh and ηJ are learning rates, 〈·〉data are the empirical (observed) moments and 〈·〉model are the moments sampled from the model using Monte Carlo (MC) methods. The exact learning algorithm always yields optimal values for h and J if there are no hidden nodes in the system [35].
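    The learning rule of Eq. (11) can be sketched as follows; the Metropolis sampler, the learning rates and the iteration counts are illustrative, and a production run would need far more samples and some convergence monitoring:

        import numpy as np

        def model_moments(h, J, n_samples=20000, n_burn=2000, seed=0):
            """Estimate <s_i>_model and <s_i s_j>_model for p(s) ~ exp(h.s + s.J.s)
            with single-spin Metropolis updates (s_i = +/-1, J symmetric, J_ii = 0)."""
            rng = np.random.default_rng(seed)
            N = len(h)
            s = rng.choice([-1.0, 1.0], size=N)
            m, c = np.zeros(N), np.zeros((N, N))
            for t in range(n_burn + n_samples):
                i = rng.integers(N)
                dE = 2.0 * s[i] * (h[i] + 2.0 * J[i] @ s)   # energy change of flipping s_i
                if dE <= 0.0 or rng.random() < np.exp(-dE):
                    s[i] = -s[i]
                if t >= n_burn:
                    m += s
                    c += np.outer(s, s)
            return m / n_samples, c / n_samples

        def boltzmann_learning(m_data, c_data, n_iter=100, eta_h=0.1, eta_j=0.05):
            """Exact (Boltzmann) learning, Eq. (11): move h and J along the moment mismatch."""
            N = len(m_data)
            h, J = np.zeros(N), np.zeros((N, N))
            for _ in range(n_iter):
                m_mod, c_mod = model_moments(h, J)
                h += eta_h * (m_data - m_mod)
                J += eta_j * (c_data - c_mod)
                np.fill_diagonal(J, 0.0)            # binary variables: J_ii plays no role
                J = 0.5 * (J + J.T)                 # keep the couplings symmetric
            return h, J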

    Being in general slow, the exact learning algorithm might be substituted by approximate inference methods [36], which are based on an expansion of the free energy of the system for small fluctuations around its mean value. The first-order (naïve) approximation within mean field theory (nMF) gives

    J^nMF = A⁻¹ − C⁻¹,
    h_i^nMF = tanh⁻¹〈si〉 − Σ_{j=1}^{N} J_ij^nMF 〈sj〉,  (12)

    where Aij = (1 − 〈si〉²)δij and δij is the Kronecker delta. Here, taking into account the diagonal element Jii (which is usually discarded) in the calculation of the corresponding hi improves the accuracy of the approximation; this is known as the diagonal-weight trick [36]. The second-order correction to the nMF approximation requires solving the Thouless-Anderson-Palmer (TAP) equations

    (C⁻¹)ij = −J_ij^TAP − 2(J_ij^TAP)² 〈si〉〈sj〉,
    h_i^TAP = h_i^nMF − 〈si〉 Σ_{j=1}^{N} (J_ij^TAP)² (1 − 〈sj〉²),  (13)

    where Eq. (12) should be used instead for the calculation of the external fields if the diagonal-weight trick is used.
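    A compact sketch of the nMF and TAP inversions of Eqs. (12)–(13), taking the magnetizations m = 〈si〉 and the covariance matrix C as input; the handling of near-zero magnetization products and of the diagonal is our own choice, and the formulas assume |m_i| < 1 within the window:

        import numpy as np

        def infer_nmf_tap(m, C):
            """nMF and TAP couplings/fields from data moments (a sketch of Eqs. (12)-(13)).
            The J^nMF diagonal is kept when computing the fields (diagonal-weight trick)
            and zeroed in the returned coupling matrices; per the text, the same field
            expression is used for TAP when the trick is applied."""
            C_inv = np.linalg.inv(C)
            J_nmf_full = np.diag(1.0 / (1.0 - m ** 2)) - C_inv        # A^-1 - C^-1
            h_nmf = np.arctanh(m) - J_nmf_full @ m                    # Eq. (12) with the trick

            # TAP couplings: solve 2 m_i m_j J^2 + J + (C^-1)_ij = 0 per pair and keep
            # the root that reduces to the nMF value -(C^-1)_ij for vanishing magnetizations.
            mm = np.outer(m, m)
            with np.errstate(invalid="ignore", divide="ignore"):
                J_tap = (np.sqrt(1.0 - 8.0 * mm * C_inv) - 1.0) / (4.0 * mm)
            J_tap = np.where(np.abs(mm) < 1e-12, -C_inv, J_tap)

            J_nmf = J_nmf_full.copy()
            np.fill_diagonal(J_nmf, 0.0)
            np.fill_diagonal(J_tap, 0.0)
            return h_nmf, J_nmf, J_tap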

    Another class of approximations can be derived using an expansion of the free energy in pairwise correlations. The simplest independent-pair (IP) approximation assumes independence of every stock pair from the rest of the system. In this case, the couplings and external fields can be found as [19]

    J_ij^pair = (1/4) ln[ (1 + mi + mj + C*ij)(1 − mi − mj + C*ij) / ((1 − mi + mj − C*ij)(1 + mi − mj − C*ij)) ],
    h_i^pair = (1/2) ln[(1 + mi)/(1 − mi)] − Σ_j^N J_ij^pair mj + O(β²),  (14)

    where C*ij = Cij + mimj.


    FIG. 2. Historical dynamics of the first four temporal moments of the distribution of the mean market return (a). Top to bottom: temporal mean, standard deviation, skewness and kurtosis calculated using an SMA window of 250 days (approximately one trading year) for the raw (s^raw, blue), standardized (s^std, green) and binary (s^bin, red) returns of 71 U.S. stocks (Table I). 95% confidence intervals for the moments of the distribution of s^bin are calculated using a bootstrapping algorithm and denoted with the gray dashed lines. The number in each panel corresponds to the overall historical correlation between the time series for s^std and s^bin. Dependence of these overall correlations on the moving window size is shown in (b). Binary returns behave similarly to raw and standardized returns; however, information about the kurtosis is completely lost.

    FIG. 3. Historical dynamics of the first four moments of the distribution of the off-diagonal elements of the covariance (C^raw, blue) and correlation (Q^raw, green) matrices of raw returns, and of the covariance matrix of binary returns (C^bin, red), calculated using an SMA window of 250 days (a). Top to bottom: mean, standard deviation, skewness and kurtosis of the off-diagonal elements of the matrices. 95% confidence intervals for the moments of the distribution of C^bin_ij are calculated using a bootstrapping algorithm and denoted with the gray dashed lines. The number in each panel corresponds to the overall historical correlation between the time series for Q^raw and C^bin. Dependence of these overall correlations on the moving window size is shown in (b). Binarization makes the covariance matrix similar to the correlation matrix of raw returns.

    Sessak and Monasson (SM) derived higher-order corrections to the IP approximation in a closed form, using other terms in the perturbative correlation expansion [37]:

    J_ij^SM = J_ij^nMF + J_ij^pair − Cij / [(1 − mi²)(1 − mj²) − (Cij)²],
    h_i^SM = h_i^pair.  (15)

    It is also worth noting that a few other approximate inference schemes tailored for different system regimes have been developed, such as pseudo-maximum likelihood inference using all data [38], minimum probability flow [39] or mean field approximations for low temperatures [40]. For example, pseudo-maximum likelihood inference [38], being more suitable for sparse connections, can be studied in the context of the correlation clustering structures described in Subsection III D.
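    Similarly, the independent-pair and Sessak-Monasson expressions of Eqs. (14)–(15) can be sketched as follows (the suppression of numerical warnings on the diagonal is our own choice):

        import numpy as np

        def infer_ip_sm(m, C, J_nmf):
            """Independent-pair (IP) couplings/fields, Eq. (14), and the Sessak-Monasson
            corrected couplings, Eq. (15); h^SM equals h^pair."""
            mi, mj = np.meshgrid(m, m, indexing="ij")     # mi[i, j] = m_i, mj[i, j] = m_j
            c = C + mi * mj                               # C*_ij = C_ij + m_i m_j
            with np.errstate(divide="ignore", invalid="ignore"):
                J_pair = 0.25 * np.log(((1 + mi + mj + c) * (1 - mi - mj + c))
                                       / ((1 - mi + mj - c) * (1 + mi - mj - c)))
                J_sm = J_nmf + J_pair - C / ((1 - mi ** 2) * (1 - mj ** 2) - C ** 2)
            np.fill_diagonal(J_pair, 0.0)
            np.fill_diagonal(J_sm, 0.0)
            h_pair = 0.5 * np.log((1 + m) / (1 - m)) - J_pair @ m
            return h_pair, J_pair, J_sm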


    III. RESULTS AND DISCUSSION

    A. Effect of binarization

    The mapping defined by Eq. (8) obviously affects the information contained in the time series. In order to estimate this effect, we first compare the time series for the average raw and binary returns. As Fig. 1 shows, the correlation coefficient between historical values of s̄^raw and s̄^bin is very high (0.91). Moreover, the correlation between the amplitudes of their Fourier transforms is also high (0.80), suggesting that signatures of economic cycles and the frequency of market crashes are preserved in the binarized time series. However, the maximum magnitude of binary returns is obviously bounded by definition.

    Comparison of the filtered time series using the SMA also suggests that the information about market trends is preserved in the binarized time series. With this aim, we compare the historical evolution of the first four moments of the distribution of average binarized versus raw and standardized returns. The results presented in Fig. 2 indicate that the binary returns behave similarly to the raw and standardized returns, preserving the dynamics of the first two moments and, to a lesser degree, of the third moment, while information about the kurtosis is lost for all T [Fig. 2(b)]. Regarding the pairwise correlations, Figure 3 shows that the covariance matrix of the binarized returns becomes similar to the correlation matrix of the raw returns. Indeed, their off-diagonal elements follow similar distributions with a very high correlation between the means. For higher-order moments, the correlation decreases with T [Fig. 3(b)].

    Another way to explore the effect of the binarization procedure is to study the eigenvalue distribution of the correlation matrices for the original and transformed time series, since it is known to differ from the distribution of a random correlation matrix [41]. With this aim, we compare the historical dynamics of the four biggest eigenvalues, which do not match the Wigner law. As Fig. 4 shows, all four values are well preserved, and the biggest one, corresponding to the "market mode", is in remarkable agreement after binarization, independently of the moving window size.
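    For this comparison a one-liner suffices (shown here for completeness; the matrices would be the windowed correlation matrix of raw returns and the covariance matrix of binary returns):

        import numpy as np

        def top_eigenvalues(matrix, k=4):
            """Return the k largest eigenvalues of a symmetric matrix in decreasing order."""
            return np.sort(np.linalg.eigvalsh(matrix))[::-1][:k]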

    Having validated that the binary returns indeed capture the market's historical behavior similarly to the standardized returns, we proceed to the inference of the Boltzmann model and the evolution of its parameters.

    B. Comparison of approximate and exact learning methods

    Following the SMA approach with T = 250 trading days (approximately one trading year), we study the historical evolution of the external fields and couplings inferred for each chunk of the binarized time series. We employ the four approximate (nMF, TAP, IP and SM) learning methods described in the previous section and compare them with exact learning, where MC sampling is used for the latter.

    Figure 5 shows a comparison of the inferred parameters for four different historical dates. Without use of the diagonal-weight trick, inference of the external fields in the nMF case works better when dates far from market crashes are considered (top row in the two leftmost columns, corresponding to 22 Dec 1997 and 09 Jan 2002). The TAP correction slightly improves the inference accuracy, however producing overestimated values in comparison with the exact learning algorithm (second row). Both MF approximations perform worse for the data near market crashes (two top rows in the two rightmost columns, corresponding to 09 Oct 2008 and 13 March 2012), while the diagonal-weight trick allows one to achieve almost perfect accuracy in both cases (red triangles in the first two rows), justifying the need for higher-order corrections. The IP algorithm (third row) shows much worse performance for all dates except 09 Jan 2002. Although the higher-order SM corrections (third row) considerably improve the accuracy of the IP external fields, it is still lower than in the MF cases. Couplings inferred using both MF approximations show the same trend, being more/less accurate far from/near crashes (fourth row). Although the bulk of the MF couplings is in excellent agreement with the exact couplings even near crashes, positive outliers are overestimated. The couplings estimated using the IP algorithm are in very poor agreement with the exact couplings for all the dates considered (bottom row), while the use of the SM correction yields considerable improvement (bottom row). Moreover, it correctly captures the biggest outliers, outperforming the MF algorithms.

    Historical dynamics of the inference quality for the approximate methods is depicted in Fig. 6, where the normalized root mean square error,

    NRMSE(x, y) = √( mean[(xi − yi)²] / mean[(yi − ȳ)²] ),  (16)

    with mean[·] denoting averaging over the index i, and the correlation between the model parameters inferred using different methods are presented. It confirms that the diagonal-weight trick considerably improves the inference quality of the MF external fields, while the couplings are systematically overestimated. Within the small-correlation approximation, the IP algorithm does not perform well for any historical date, while the quality of the SM couplings is comparable to the MF cases (although still lower in general) and decreases for the periods where higher correlations are observed. Thus, the observed behavior indeed justifies the use of TAP as a reliable approximation of the true couplings and external fields (making use of the diagonal trick), which has been extensively used in the previous studies [15, 21–23]. However, special care should be taken about the overestimated positive outliers, where the SM algorithm can be helpful.
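    Eq. (16) translates directly into code; here the bar-average is written as a plain mean over the parameter index:

        import numpy as np

        def nrmse(x, y):
            """Normalized root mean square error between two parameter vectors, Eq. (16)."""
            x, y = np.ravel(x), np.ravel(y)
            return np.sqrt(np.mean((x - y) ** 2) / np.mean((y - y.mean()) ** 2))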

    FIG. 4. Historical dynamics of the four biggest eigenvalues of the correlation matrix of raw returns (green) and the covariance matrix of binary returns (red) calculated using an SMA window of 250 days (a). The number in each panel corresponds to the overall historical correlation between the time series. Dependence of these overall correlations on the moving window size is shown in (b). Binarization preserves the biggest eigenvalues of the correlation matrix.

    To validate the results obtained by the exact learning algorithm, we sampled stock returns from the model using MC sampling. The single and pairwise empirical moments are nicely recovered from the exact model (figures are not shown), while the third-order covariances, 〈(si − 〈si〉)(sj − 〈sj〉)(sk − 〈sk〉)〉, are almost not captured by the model (Fig. 7). Although the inability to capture third-order correlations could be due to a lack of data for estimating the real third-order covariances, this is unlikely to be the case, as the degree of mismatch between the third-order covariances of the Ising model and the data remains the same even when somewhat bigger amounts of data are used for their estimation. Nevertheless, in spite of the fact that individual third-order covariances are not captured well, the historical dynamics of their mean can be recovered from the model (Fig. 8).

    C. Distribution of inferred parameters

    Figure 9 shows histograms of the external fields and couplings inferred using the exact learning algorithm for three different window sizes: T = 250, 1000 and 5000 trading days. The distribution of external fields is close to Gaussian and does not possess outliers, independently of T. The bulk of the couplings is also distributed normally; however, a heavy positive tail is present. For bigger window sizes, the Gaussian bulk component of the inferred couplings becomes less prominent and the tail starts to dominate. Although this behavior has been previously reported in Ref. [22], the role of the tail has remained unclear. We will address this question in the next subsection.

    In fact, as shown in Fig. 10, all moments of the distribution of Jij scale with T. On the contrary, there is no obvious dependence for hi, whose behavior is close to that of the external fields inferred on randomly shuffled time series (where each element is swapped with another randomly picked element, drawn uniformly within the moving window chunk). The higher value of h̄ inferred on randomly shuffled time series can be explained by the fact that it compensates the positive (ferromagnetic) contribution to the mean return stemming from the positive mean of the inferred couplings: In the shuffled case, J̄ becomes zero while all 〈s_i^bin〉 remain unaffected, therefore this effect has to be compensated by the external fields.

    Historical dynamics of the moments of the distributions shown in Fig. 11 indicates that the average external field is strongly correlated with the mean return (0.90), while the higher moments, being almost stable over the historical period considered, do not seem to convey any particular information about market behavior. Without use of the diagonal-weight trick, h̄^nMF is completely inconsistent with h̄^Exact [Fig. 11(a)]. Although the h_i^TAP behave more similarly to h_i^Exact, they are significantly overestimated. External fields inferred using the IP and SM algorithms show dynamics similar to h_i^nMF and h_i^TAP respectively, however being even more strongly overestimated (figures are not shown). At the same time, the diagonal-weight trick allows one to achieve almost perfect accuracy in both MF cases [Fig. 11(b)]. The distribution of couplings has a stable small positive mean corresponding to ferromagnetic interaction; however, with almost half of the couplings being negative, the system is likely to exhibit frustrated configurations. The SM algorithm in general performs better than TAP for coupling estimation, except for the mean value, J̄^SM, which becomes inconsistent with J̄^Exact for the historical periods with a highly correlated market [Fig. 11(c), top panel].



    FIG. 5. Comparison of external fields (three top rows) and couplings (two bottom rows) inferred using the exact and approximate (nMF, TAP, IP and SM) learning algorithms for four different historical dates and an SMA window of 250 trading days. The diagonal-weight trick is essential for correct inference of external fields in the mean field (nMF and TAP) cases. Both mean field approximations overestimate couplings with big absolute values. Accuracy of the IP algorithm is very low for both external fields (third row) and couplings (bottom row). The SM corrections significantly improve the IP results and outperform the TAP approximation for the positive outliers in the distribution of couplings.

    FIG. 6. Comparison (normalized root mean square error and correlation coefficient) of the external fields and couplings inferred using different methods: (a) nMF versus exact (blue), TAP versus exact (green) and nMF versus TAP (red); (b) IP versus exact (blue), SM versus exact (green) and IP versus SM (red). Without use of the diagonal-weight trick, the MF approximation quality of the external fields is very low and drops significantly during financial crashes [(a), two top panels], while making use of the trick improves it considerably [(a), two middle panels]. The SM algorithm considerably improves the quality of the IP external fields, which however remains lower than MF with the diagonal trick [(b), two top panels]. Couplings inferred using both MF methods are almost the same, with the approximation quality being lower during the periods of financial crises [(a), two bottom panels]. The same behavior is observed for the SM couplings [(b), two bottom panels].

    It is also interesting to note that the standard deviation of couplings far from the crashes increases almost linearly over the whole period considered, from 0.14 in 1996 to 0.2 in 2013 [the second panel in Fig. 11(c)].


    FIG. 7. Comparison of the third-order covariances calculated using the real data and 50000 MC samples from the exact Ising model for two different moving window sizes. The Ising model is not capable of recovering the observed individual third-order covariances, independently of moving window size.


    FIG. 8. Historical evolution of the first four moments of the distribution of the third-order covariances calculated using real data (blue), randomly shuffled data (gray) and 5000 MC samples (for each historical date) from the exact Ising model (red). The exact Ising model is only capable of capturing the historical dynamics of the mean value of the empirical third-order covariances. However, all higher moments for the real data behave similarly to the ones calculated on randomly shuffled time series.

    This observation gains more meaning when we note that the standard deviation of the couplings in 1996 is approximately equal to that of the couplings inferred on randomly shuffled returns, while by 2013 it is almost twice as big as the shuffled-series value. Also, during the biggest market crashes, the standard deviation of couplings exhibits jumps, because interconnections in the market become tighter as a result of herding behavior during financial turmoil: Until the system has adapted to a new economic reality, prices tend to move collectively, with the overall market performance as a benchmark [42]. The heavy positive tail, which can be characterized by higher-order moments, increases for some historical periods. The reason for this is unknown at the moment.



    FIG. 9. Histograms of the external fields and couplings inferred using the exact learning algorithm for 27 Jan 2010 for three different moving window sizes. The red curve denotes the Gaussian fit. For small moving window sizes, the bulk of the couplings is distributed normally, while a heavy positive tail dominates for bigger moving window sizes.

    D. Industry-related clustering structure

    It is well known that a stock market possesses a hierarchical clustering structure, which can be detected using correlations [26, 43] or couplings between stock prices [15] or trading volumes [23]. One of the most popular techniques employed to find the related structures is based on the minimum spanning tree (MST) algorithm. For instance, Prim's algorithm, a basic MST construction algorithm, consists of the following three steps (a minimal code sketch is given after the list):

    1. Initialize a tree with a single stock, chosen arbitrarily from all stocks.

    2. Find the stock not yet in the tree which has the strongest coupling (biggest value of Cij or Jij) to a stock in the tree and add it to the tree.

    3. Repeat step 2 until all stocks are in the tree.
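    A plain-Python sketch of this construction for a dense similarity matrix (couplings or covariances) is given below; as described, the stock with the biggest Cij or Jij is attached at each step, i.e. a maximum spanning tree on the similarity, which is equivalent to the MST on the corresponding distance. For N = 71 the naive O(N³) scan is more than fast enough:

        import numpy as np

        def prim_tree(W):
            """Prim's construction as described above: start from one stock and keep
            attaching the outside stock with the strongest coupling (largest W_ij,
            e.g. C_ij or J_ij) to any stock already in the tree.  Returns N-1 edges."""
            N = W.shape[0]
            in_tree = {0}                      # arbitrary starting stock
            edges = []
            while len(in_tree) < N:
                best = None
                for i in in_tree:
                    for j in range(N):
                        if j not in in_tree and (best is None or W[i, j] > W[best]):
                            best = (i, j)
                edges.append(best)
                in_tree.add(best[1])
            return edges

    An equivalent tree (up to ties) can also be obtained with networkx.maximum_spanning_tree on a weighted graph built from W.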

    Figure 12 shows examples of the MST constructed using the covariance and coupling matrices of binary stock returns, which are similar to those previously reported in Refs. [15] and [23]. A remarkable feature of these structures is the manifestation of industry sector clustering. Here, we define an industry sector cluster as a connected subset of the tree where all stocks belong to the same industry sector, i.e. each cluster member interacts with the other members only through stocks from the same sector. Further, we denote the cluster size as Nm,k, where m = 1, . . . , M is a sector index (M = 9 is the total number of sectors listed in Table I) and k = 1, . . . , Km is an index of the cluster (Km is the number of such clusters for the sector m).

    As mentioned above, there can be many clusters for each industry sector. In order to estimate the overall industry clustering degree of the market, let us introduce a simple metric based on finding the clusters of maximum size,

    Qmst = (1/N) Σ_{m=1}^{M} max_k Nm,k.  (17)

    Intuitively speaking, a small value of Qmst corresponds to the case where stocks do not group with each other based on which industry sector they belong to. Its minimum value, M/N, corresponds to the situation where the biggest cluster for each sector has only one stock, i.e. Km equals the total number of stocks in the sector m. The maximum value of Qmst is 1, which corresponds to the perfect industry clustering structure where there is only one cluster for each industry sector (Km = 1). This clustering measure shows an interesting dependence on the size of the moving window (Fig. 13), suggesting an increasing degree of sectoral connectedness of the stock market for bigger time windows as inferred by the Ising model. Also, the degree of connectedness increases with deviation of the distribution from the Gaussian, as measured by skewness and kurtosis.
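    Given the tree edges and a sector label per stock, Eq. (17) can be evaluated by restricting the tree to same-sector edges and measuring the largest connected component per sector (function and variable names below are ours):

        def q_mst(edges, sectors):
            """Industry clustering degree Q_mst of Eq. (17): the sum over sectors of the
            largest same-sector connected cluster in the tree, divided by N."""
            # adjacency restricted to edges whose endpoints share a sector
            adj = {}
            for i, j in edges:
                if sectors[i] == sectors[j]:
                    adj.setdefault(i, []).append(j)
                    adj.setdefault(j, []).append(i)
            N = len(sectors)
            largest = {}                      # sector -> size of its biggest cluster
            seen = set()
            for start in range(N):
                if start in seen:
                    continue
                stack, component = [start], set()
                while stack:                  # depth-first search over same-sector edges
                    v = stack.pop()
                    if v in component:
                        continue
                    component.add(v)
                    stack.extend(adj.get(v, []))
                seen |= component
                s = sectors[start]
                largest[s] = max(largest.get(s, 0), len(component))
            return sum(largest.values()) / N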

    To further investigate the network structure and the clustering degree of J, we perform a clustering analysis based on two different filtering procedures, namely, considering only a subset of (i) couplings Jij and (ii) eigenmodes of J corresponding to different eigenvalues λi. With this aim, we choose thresholds J^th and λ^th and construct the MST using only values Jij ≶ J^th and λi ≶ λ^th, respectively. Figure 14(a) shows that the biggest drop/increase of Qmst occurs only when a small percentage of the strongest couplings is excluded/included in the MST construction. In a similar way, it is also sensitive to discarding the biggest eigenvalues [Fig. 14(b)]. Thus, one might conclude that the distribution of couplings can indeed be viewed as a mixture of a Gaussian bulk and a heavy positive tail which contains all the information about the market clustering structure. From this perspective, the use of sparse regularization methods for coupling inference, for example the ℓ1-regularization discussed in Ref. [44], would be of practical interest.

    Finally, it is also worth noting that the intraday internal structure of couplings is neither stable (quenched) nor completely random (annealed), somehow preserving a clustering structure, with the diameter however being smaller near market crashes (figures are not shown; see, for example, Ref. [43]). These non-trivial features of the coupling matrix will be studied in more detail in future works.

    E. Scaling of inferred parameters

    In order to study extensive properties of the system, we investigate the scaling properties of the parameter distributions with the number of stocks, which are usually characterized by the scaling exponent N^α. We estimate its average value for each distribution moment over the scaling exponents obtained for randomly selected subsets of stocks of different size.
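    One way to realize this procedure, assuming a binarized data matrix and any coupling-inference routine (such as the nMF/TAP sketch above); the subset sizes, the repetition count, and the restriction to the mean |coupling| are illustrative simplifications of the moment-by-moment estimate used in the paper:

        import numpy as np
        from scipy import stats

        def coupling_scaling_exponent(s_bin, infer_J, sizes=(20, 30, 40, 50, 60, 70),
                                      n_rep=10, seed=0):
            """Re-infer couplings on random subsets of stocks of different size and fit
            the log-log slope alpha of the mean |coupling| versus subset size N."""
            rng = np.random.default_rng(seed)
            log_n, log_moment = [], []
            for n in sizes:
                for _ in range(n_rep):
                    idx = rng.choice(s_bin.shape[1], size=n, replace=False)
                    x = s_bin[:, idx]
                    m = x.mean(axis=0)
                    C = np.cov(x, rowvar=False, bias=True)
                    J = infer_J(m, C)
                    off_diag = J[~np.eye(n, dtype=bool)]
                    log_n.append(np.log(n))
                    log_moment.append(np.log(np.abs(off_diag).mean()))
            return stats.linregress(log_n, log_moment).slope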



    FIG. 10. Scaling of the first four moments of the distribution of exact external fields (top row) and couplings (bottom row) with moving window size T for two different historical dates and for randomly shuffled time series for 27 Jan 2010. The higher mean value of the external fields inferred on randomly shuffled time series compensates the positive (ferromagnetic) contribution to the mean return coming from the mean value of the couplings, which is zero for the shuffled time series.

    FIG. 11. Historical dynamics of the first four moments of the distribution of the external fields inferred without (a) and with (b) use of the diagonal-weight trick using the nMF (blue), TAP (green) and exact (red) inference algorithms; off-diagonal couplings (c) inferred using the SM (blue), TAP (green) and exact (red) inference algorithms. Moments of the distributions of the couplings and external fields inferred using the TAP algorithm on randomly shuffled returns are denoted with the gray dashed line. Both mean field approximations incorrectly estimate external fields without use of the diagonal-weight trick and tend to overestimate couplings. The SM approximation allows one to correctly infer the biggest couplings; however, their mean is incorrectly captured for the historical periods with high correlations. During periods of crisis, an increase of the coupling strength is observed.

    As Fig. 15 (top row) shows, the distribution of external fields does not possess any particular scaling law, except for the mean, which has α close to −0.75. The other moments scale similarly to the corresponding moments of the external fields inferred on randomly shuffled time series. The scaling properties of the distribution of couplings depend on the size of the moving window (Fig. 15, bottom row). When T is small, the empirical couplings show scaling features similar to the couplings inferred on randomly shuffled returns. However, the scaling of the mean and standard deviation becomes closer to the properties of the normal distribution as T grows. This dependence might be related to the presence of finite-size effects, when a small number of historical values is not enough to estimate the true correlations in the market. These results are similar to the scaling properties obtained in Ref. [22]; however, their dependence on the SMA window is shown here for the first time.

    There are two extreme ways in which a change in the distribution of couplings can arise as one increases the size of the observed system. One is that the structure between the previous stocks changes entirely upon adding new stocks. Alternatively, the couplings could only change their absolute magnitude, while maintaining their magnitude relative to one another. To better understand where in this spectrum our financial market lies, we performed an analysis similar to Ref. [19]: We chose a random subset of 20 stocks and analyzed the couplings between them for different total numbers of stocks taken into account for the inference (including the original 20).


    FIG. 12. Minimum spanning tree for the covariance matrix (a) and the corresponding exact couplings (b) for 27 Jan 2010, calculated using an SMA window of T = 4000 trading days. A similar industry-related clustering structure is observed in both cases. The considered sectors are Healthcare (red), Consumer Goods (blue), Basic Materials (green), Financial (cyan), Industrial Goods (purple), Services (yellow), Technology (orange), Conglomerate (magenta) and Utilities (dark blue). The graphs are visualized using the NetworkX Python package [45].


    FIG. 13. Quality (industry sector clustering degree) of a minimum spanning tree, Qmst, depending on the moving window size for exact and TAP couplings, and for the covariance matrix (27 Jan 2010). The gray dashed and dotted lines denote the mean and the 99.7% confidence interval, respectively, for the quality of the MST built on randomly shuffled time series. The quality of the MST for couplings increases with deviation of their distribution from the Gaussian, characterized by skewness and kurtosis.

    Figure 16 shows that the biggest/smallest values of Jij remain the same as N grows, and that their scaling becomes closer to the normal distribution for bigger time windows. This behavior also suggests that important features of market connectivity are preserved with the number of stocks.

    F. External and internal influence

    FIG. 14. Cutoff analysis for the coupling matrix (a) and its eigenvalues (b) for two SMA windows, T = 1000 and T = 5000 trading days (27 Jan 2010). Histograms of couplings and of the corresponding eigenvalues (left column), and quality (industry sector clustering degree) of a minimum spanning tree, Qmst, (left and right columns) depending on the positive (black dashed line) and negative (blue solid line) cutoffs, J^th and λ^th, where all values above or below a cutoff, respectively, are discarded for the MST construction. The red curve denotes the Gaussian fit. Discarding the biggest couplings or the biggest eigenvalues significantly affects the quality of the MST, while negative couplings and small eigenvalues do not contribute to the market clustering structure.

    Considering the two terms in the system's Hamiltonian [Eq. (2)], it is also possible to define internal and external influences in the market. For this purpose, the external fields can be interpreted as the influence of external factors, h^ext ≡ h, while the couplings define an internal bias, h^int = 〈sᵀ〉J (in the MF sense) [22]. In this case, the external contribution corresponds to the individual stock biases which come from outside the market, while the internal one is defined solely in terms of internal market interactions. Similarly, one can also define two energy terms as H = E^ext + E^int, where E^{ext,int} = −(h^{ext,int})ᵀ〈s〉. Figure 17 shows that both energies have almost the same order of magnitude over the historical period considered, while near the major crashes E^ext is more than 10 times bigger than E^int. The ratio between the mean biases also possesses interesting historical dynamics.
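    These definitions are straightforward to evaluate once h, J and the window magnetizations 〈s〉 are available (a minimal sketch; the function and variable names are ours):

        import numpy as np

        def external_internal_split(h, J, m):
            """External/internal biases and energies as defined in this subsection:
            h_ext = h, h_int = <s>^T J, E_ext = -h_ext.<s>, E_int = -h_int.<s>."""
            h_ext = np.asarray(h, dtype=float)
            h_int = np.asarray(m, dtype=float) @ J
            return h_ext, h_int, -h_ext @ m, -h_int @ m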



    FIG. 15. Average scaling exponent with the number of stocks, α, for the distributions of exact external fields (top row) and couplings (bottom row), depending on the moving window size, T, for two different historical dates and for randomly shuffled time series for 27 Jan 2010. The scaling exponent of the mean of the external fields is close to −0.75 for big values of T, while the higher moments behave similarly to the external fields inferred on randomly shuffled time series. Empirical couplings scale similarly to the couplings inferred on randomly shuffled time series for small window sizes, while their scaling properties become closer to those of the Gaussian distribution for bigger values of T.


    hext

    and 0.99 for hint

    ) discrepancies between them mightbe used as a leading indicator of financial instabilities.Away from the periods of crisis, their ratio is almost sta-ble, while divergent behavior is observed before the U.S.market crashes (two bottom panels in Fig. 17). Possi-ble explanation of the observed behavior from a financialpoint of view is still an open question.

    IV. CONCLUSION

    We have investigated various aspects of the application of the pairwise interaction model to financial time series. The model, parametrized by external fields and couplings, is used for the approximation of the joint equilibrium distribution of stock returns.



    FIG. 17. Comparison of external and internal biases for the exact Ising model. Top-bottom: ratio of external and internalenergies contributing to the Hamiltonian, absolute value of the ratio of the mean external and internal biases and its sign.Major market crashes are preceded by growth of the external energy and discrepancy between the biases.

    sidered learning algorithms require use of binary vari-ables, the logarithmic returns are binarized using thesign function. Effect of the binarization suggests that thedistribution of binary returns captures the distributionsof raw and standardized returns to a good degree, pre-serving information about market correlation structure,economic cycles and market crashes as well as industry-related market clustering structure.

    The model parameters are inferred using approximateand exact learning algorithms. Mean field approxima-tions nicely recover bulk of the couplings except outliers,which can be correctly inferred using small correlation ex-pansions. External fields are in the almost perfect agree-ment with the exact algorithm if the diagonal-weighttrick is used. The obtained results also suggest that thequality of approximate inference methods drops in theperiods of financial crises due to increase of magnitude ofthe correlations observed. For the historical period con-sidered, the distribution of external fields is close to theGaussian and does not possess any outliers independentlyof moving window size. On the contrary, the distributionof couplings possesses a heavy positive tail, which startsto dominate over the Gaussian bulk for bigger movingwindow sizes. Mean of external fields decreases with thenumber of stocks while their standard deviation remainsalmost constant, corresponding to the standard deviationof the external fields inferred on randomly shuffled timeseries. Scaling properties of the distribution of couplingsdepends on the moving window size, becoming closerto the properties of the Gaussian distribution with its

Despite the possible presence of finite-size effects, an industry-related clustering structure is observed for both exact and approximate couplings. The performed cutoff analysis suggests that the largest positive couplings, as well as the eigenmodes with the largest eigenvalues, contain all the information about this structure. Scaling properties of the couplings between a small random subset of stocks suggest that the underlying network structure is also preserved as the number of stocks grows. Finally, the pairwise interaction model also allows one to define external and internal biases, which correspond to the contributions of the external fields and couplings, respectively. Discrepancies between them might be used as a precursor of imminent financial instabilities and remain to be explained from a financial point of view. These non-trivial market properties, as well as their historical evolution, will be studied in future works, where one natural extension to incorporate dynamics is the kinetic Ising model [46].
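As an illustration of the cutoff analysis mentioned above, a hedged sketch using NetworkX [45] could threshold the coupling matrix and read off the resulting clusters; the cutoff value and function names here are purely illustrative and not those of the paper:

import numpy as np
import networkx as nx

def coupling_clusters(J, tickers, cutoff=0.05):
    """Keep only couplings above a positive cutoff and return connected clusters.

    J       : (N, N) symmetric coupling matrix
    tickers : list of N ticker symbols
    cutoff  : illustrative threshold on the coupling strength
    """
    G = nx.Graph()
    G.add_nodes_from(tickers)
    N = len(tickers)
    for i in range(N):
        for j in range(i + 1, N):
            if J[i, j] > cutoff:           # keep only strong positive couplings
                G.add_edge(tickers[i], tickers[j], weight=J[i, j])
    # Each connected component is a candidate industry-related cluster
    return [sorted(c) for c in nx.connected_components(G)]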

    V. ACKNOWLEDGMENTS

This work is supported by Nordita, VR VCB 621-2012-2983, the U.S. DOE, the Marie Curie Training Network NETADIS (FP7, grant 290038), the Kavli Foundation and the Norwegian Research Council's Centre of Excellence scheme. We are also grateful to the anonymous referees for their valuable suggestions.

[1] N. F. Johnson, P. Jefferies, and P. M. Hui, Financial Market Complexity (Oxford University Press, Oxford, 2003).
[2] D. Sornette, Why Stock Markets Crash: Critical Events in Complex Financial Systems (Princeton University Press, Princeton, NJ, 2009).
[3] R. Albert and A.-L. Barabási, Rev. Mod. Phys. 74, 47 (2002).
[4] K. Kacperski and J. A. Hołyst, Phys. Lett. A 254, 53 (1999).
[5] K. Kiyono, Z. R. Struzik, and Y. Yamamoto, Phys. Rev. Lett. 96, 068701 (2006).
[6] A.-L. Barabási and R. Albert, Science 286, 509 (1999).



[7] S. N. Dorogovtsev and J. F. F. Mendes, Evolution of Networks: From Biological Nets to the Internet and WWW (Physics) (Oxford University Press, New York, NY, 2003).
[8] S. Bornholdt and H. G. Schuster, eds., Handbook of Graphs and Networks: From the Genome to the Internet (John Wiley & Sons, New York, NY, 2003).
[9] P. Gai and S. Kapadia, Philos. Trans. R. Soc. Lond. A 466, 2401 (2010).
[10] M. Newman, SIAM Rev. Soc. Ind. Appl. Math. 45, 167 (2003).
[11] P. Fronczak, A. Fronczak, and J. A. Hołyst, Eur. Phys. J. B 59, 133 (2007).
[12] S. H. Strogatz, Nature 410, 268 (2001).
[13] P. Holme and M. E. J. Newman, Phys. Rev. E 74, 056108 (2006).
[14] K. Klemm, V. M. Eguíluz, R. Toral, and M. San Miguel, Phys. Rev. E 67, 026120 (2003).
[15] T. Bury, Physica A 392, 1375 (2013).
[16] E. T. Jaynes, Phys. Rev. 106, 620 (1957).
[17] E. T. Jaynes, Phys. Rev. 108, 171 (1957).
[18] E. Schneidman, M. J. Berry, R. Segev, and W. Bialek, Nature 440, 1007 (2006).
[19] Y. Roudi, J. Tyrcha, and J. Hertz, Phys. Rev. E 79, 051915 (2009).
[20] Y. LeCun, S. Chopra, R. Hadsell, M. Ranzato, and F. J. Huang, in Predicting Structured Data (The MIT Press, Cambridge, MA, 2007), pp. 191–246.
[21] J. Maskawa, Physica A 311, 563 (2002).
[22] T. Bury, Eur. Phys. J. B 86, 1 (2013).
[23] H.-L. Zeng, R. Lemoy, and M. Alava, J. Stat. Mech. 2014, P07008 (2014).
[24] P. E. Vértes, R. M. Nicol, S. Chapman, N. Watkins, D. A. Robertson, and E. T. Bullmore, Front. Syst. Neurosci. 5, 75 (2011).
[25] C. Hommes, Quant. Financ. 1, 149 (2001).
[26] R. Mantegna, Eur. Phys. J. B 11, 193 (1999).
[27] J.-P. Onnela, A. Chakraborti, K. Kaski, J. Kertész, and A. Kanto, Phys. Scr. 2003, 48 (2003).
[28] G. Bonanno, G. Caldarelli, F. Lillo, S. Micciché, N. Vandewalle, and R. Mantegna, Eur. Phys. J. B 38, 363 (2004).
[29] http://www.finance.yahoo.com/.
[30] C. Lee, J. Lee, and A. Lee, Statistics for Business and Financial Economics, 3rd ed. (Springer, New York, NY, 2013).
[31] R. S. Pindyck and D. L. Rubinfeld, Econometric Models and Economic Forecasts, Vol. 4 (Irwin/McGraw-Hill, Boston, 1998).
[32] A. Utsugi, K. Ino, and M. Oshikawa, Phys. Rev. E 70, 026110 (2004).
[33] B. Podobnik, D. Wang, D. Horvatic, I. Grosse, and H. E. Stanley, EPL (Europhysics Letters) 90, 68001 (2010).
[34] D. J. Fenn, M. A. Porter, S. Williams, M. McDonald, N. F. Johnson, and N. S. Jones, Phys. Rev. E 84, 026109 (2011).
[35] S.-I. Amari, K. Kurata, and H. Nagaoka, IEEE Trans. Neural Netw. 3, 260 (1992).
[36] T. Tanaka, Phys. Rev. E 58, 2302 (1998).
[37] V. Sessak and R. Monasson, J. Phys. A: Math. Theor. 42, 055001 (2009).
[38] E. Aurell and M. Ekeberg, Phys. Rev. Lett. 108, 090201 (2012).
[39] J. Sohl-Dickstein, P. B. Battaglino, and M. R. DeWeese, Phys. Rev. Lett. 107, 220601 (2011).
[40] H. C. Nguyen and J. Berg, Phys. Rev. Lett. 109, 050602 (2012).
[41] L. Laloux, P. Cizeau, J.-P. Bouchaud, and M. Potters, Phys. Rev. Lett. 83, 1467 (1999).
[42] S. S. Borysov and A. V. Balatsky, PLoS ONE 9, e105874 (2014).
[43] R. Coelho, S. Hutzler, P. Repetowicz, and P. Richmond, Physica A 373, 615 (2007).
[44] P. Ravikumar, M. J. Wainwright, and J. D. Lafferty, Ann. Statist. 38, 1287 (2010).
[45] A. A. Hagberg, D. A. Schult, and P. J. Swart, in Proceedings of the 7th Python in Science Conference (SciPy 2008) (Pasadena, CA, USA, 2008), pp. 11–15.
[46] Y. Roudi and J. Hertz, Phys. Rev. Lett. 106, 048702 (2011).



TABLE I. List of the companies whose stock prices are used for the calculations in the paper.

Ticker  Name  Sector
ABT  Abbott Laboratories  Hea
AIG  American International Group, Inc.  Fin
AMGN  Amgen Inc.  Hea
APA  Apache Corp.  Bas
APC  Anadarko Petroleum Corp.  Bas
AAPL  Apple Inc.  Con
AXP  American Express Company  Fin
BA  The Boeing Company  Ind
BAC  Bank of America Corp.  Fin
BAX  Baxter International Inc.  Hea
BMY  Bristol-Myers Squibb Company  Hea
C  Citigroup, Inc.  Fin
CAT  Caterpillar Inc.  Ind
CELG  Celgene Corporation  Hea
CL  Colgate-Palmolive Co.  Con
CMCSA  Comcast Corporation  Ser
COP  ConocoPhillips  Bas
COST  Costco Wholesale Corp.  Ser
CSCO  Cisco Systems, Inc.  Tec
CVS  CVS Caremark Corp.  Ser
CVX  Chevron Corp.  Bas
DD  E. I. du Pont de Nemours and Co.  Bas
DE  Deere & Company  Ind
DELL  Dell Inc.  Tec
DHR  Danaher Corp.  Ind
DIS  The Walt Disney Company  Ser
DOW  The Dow Chemical Company  Bas
EMC  EMC Corporation  Tec
EMR  Emerson Electric Co.  Tec
EOG  EOG Resources, Inc.  Bas
EXC  Exelon Corp.  Uti
F  Ford Motor Co.  Con
GE  General Electric Company  Ind
HAL  Halliburton Company  Bas
HD  The Home Depot, Inc.  Ser
HON  Honeywell International Inc.  Ind
HPQ  Hewlett-Packard Company  Tec
IBM  International Business Machines Corp.  Tec
INTC  Intel Corp.  Tec
JNJ  Johnson & Johnson  Hea
JPM  JPMorgan Chase & Co.  Fin
KO  The Coca-Cola Company  Con
LLY  Eli Lilly and Company  Hea
LOW  Lowe’s Companies Inc.  Ser
MCD  McDonald’s Corp.  Ser
MDT  Medtronic, Inc.  Hea
MMM  3M Company  Cng
MO  Altria Group Inc.  Con
MRK  Merck & Co. Inc.  Hea
MSFT  Microsoft Corp.  Tec
NKE  Nike, Inc.  Con
ORCL  Oracle Corporation  Tec
OXY  Occidental Petroleum Corp.  Bas
PEP  Pepsico, Inc.  Con
PFE  Pfizer Inc.  Hea
PG  The Procter & Gamble Company  Con
PNC  The PNC Financial Services Group  Fin
SLB  Schlumberger Limited  Bas
SO  Southern Company  Uti
T  AT&T, Inc.  Tec
TGT  Target Corp.  Ser
TJX  The TJX Companies, Inc.  Ser
TXN  Texas Instruments Inc.  Tec
UNH  UnitedHealth Group Incorporated  Hea
UNP  Union Pacific Corp.  Ser
USB  U.S. Bancorp  Fin
UTX  United Technologies Corp.  Ind
VZ  Verizon Communications Inc.  Tec
WFC  Wells Fargo & Company  Fin
WMT  Wal-Mart Stores Inc.  Ser
XOM  Exxon Mobil Corp.  Bas

    Industry sectors are defined as Basic Materials (Bas), Conglomerate (Cng), Consumer Goods (Con), Financial (Fin),Healthcare (Hea), Industrial Goods (Ind), Services (Ser), Technology (Tec) and Utilities (Uti).
