Jean-Francois Cardoso, Jacques Delabrouille and Guillaume Patanchon- Independent Component Analysis of the Cosmic Microwave Background

Embed Size (px)

Citation preview

  • 8/3/2019 Jean-Francois Cardoso, Jacques Delabrouille and Guillaume Patanchon- Independent Component Analysis of the Cos

    1/6

    INDEPENDENT COMPONENT ANALYSIS

    OF THE COSMIC MICROWAVE BACKGROUND

    Jean-Francois Cardoso

    CNRS / ENST - TSI, Paris France

    Jacques Delabrouille, Guillaume Patanchon

    PCC - College de France, Paris, France

    ABSTRACT

    This paper presents an application of ICA to astro-nomical imaging. A first section describes the astro-physical context and motivates the use of source sepa-ration ideas. A second section describes our approachto the problem: the use of a noisy Gaussian station-ary model. This technique uses spectral diversity and

    takes explicitly into account contamination by additivenoise. Preliminary and extremely encouraging resultson realistic synthetic signals and on real data will bepresented at the conference.

    1. COSMIC MICROWAVE BACKGROUND

    The cosmic microwave background (CMB), see fig 1 is a

    Figure 1: COBE (1989) measurements of the CMB

    temperature fluctuations over the sky. The amplitudeof the fluctuations is of a few micro Kelvins with respectto an average temperature of about 2.726 K.

    fossil light whose observation and characterization hasbecome an essential tool for cosmology, the study of theformation of the Universe. This section reviews a fewbasic facts about the CMB and explains the relevanceof source separation to CMB studies.

    Big bang The current, widely accepted, model of theformation of the Universe is the big bang, a scenarioin which the Universe starts very hot and dense andthen expands and cools down. In this scenario, the ex-istence of a fossil light has been predicted even beforeit was observed. The story can be summarized as fol-lows. The time elapsed since the big bang is estimatedto be close to 15 billions years. One of the latest mile-stone events in cosmic life is called the atomic recom-bination and happened when the Universe was 300,000years old. Just before atomic recombination, the Uni-verse was extremely homogeneous, its temperature hadcooled down to about 3000K, and it was made mostly ofphotons, electrons and protons (hydrogen nuclei) witha small proportion of other light nuclei. All these par-ticles were strongly interacting and thus were tightlycoupled and in thermal equilibrium. However, underthe action of a continuing expansion, such a peacefulstate of affairs could not go on forever: expansion di-lutes energy with a subsequent decrease of tempera-

    ture. Below a threshold of about 3000K, the thermalagitation is no longer strong enough to prevent atomicrecombination, that is the formation of atoms fromnuclei and electrons. Atoms being electrically neutral,their interaction with light becomes much weaker: allof a sudden (on a cosmic time scale. . . ), photons canfreely travel through a transparent Universe: radiationand matter decouple. The photons are set free and formost of them, they are set free forever.

    Most of the photons that can be observed todayhave been released at the the time of atomic recombi-nation and have not interacted with matter since then.Thus, the cosmic microwave background is a fossil ra-

    diation. It belongs to the microwave domain becausephotons have cooled down as the Universe itself: theiraverage wavelength, which corresponded to 3000K atrecombination time, has been stretched as the Uni-verse itself by a factor of about 1000 into the mil-limetric/centimetric domain. What can be measuredtoday is a radiation with a spectral power density cor-responding (to a very high precision) to the radiationof a black body at temperature T = 2.726K.

    Author manuscript, published in "4th International Symposium on Independent Component Analysis and Blind Signal Separatio(ICA03), Nara : Japo

    http://hal.archives-ouvertes.fr/
  • 8/3/2019 Jean-Francois Cardoso, Jacques Delabrouille and Guillaume Patanchon- Independent Component Analysis of the Cos

    2/6

    Tiny fluctuations The discovery, by Penzias andWilson, of a microwave radiation with a black bodypower spectrum of about 3K, in strong support of thebig bang model, is a turning point of modern cos-mology. The CMB was found to be extremely homo-geneous: the same 3K spectral density was observed

    to come from all directions in the sky. However, toohomogeneous a CMB would be a puzzle in the bigbang scenario because the large structures of the Uni-verse (galaxies and galaxy clusters) are thought to haveformed via a process of gravitational collapse in whichsmall local over-densities tend to grow by attractingmore strongly matter from neighboring regions of lowerdensity. This phenomenon of condensation is in compe-tition with the overall expansion of the Universe whichtends to dilute matter. Hence the presence of smallinhomogeneities in the original spatial distribution ofmatter is necessary to understand the latter formationof large scale structures. Because of the coupling be-

    tween light and matter before atomic recombination,small inhomogeneities in matter density should be re-flected in the CMB as small inhomogeneities in thespatial distribution of its temperature. Hence, observa-tional cosmology has been feverishly working towardsthe detection of the spatial pattern of CMB tempera-ture in the sky, that is, going beyond its first orderfeature of being a constant field at T = 2.726K. It wasnot before the COBE space mission in 1989 that in-strument sensitivity became high enough to detect tinyfluctuations of a few tens of micro-Kelvins, as displayedin figure 1.

    Even though an observational breakthrough, muchleft to be desired from the COBE experiment becauseof its limited spatial resolution. As is apparent from thepicture, the angular size of the details visible by COBEis of the order of ten degrees. This angular separationcorresponds to regions of the Universe which are solarge that they cannot have been causally connectedin the past! This implies that the low resolution spa-tial pattern does not carry much information about thephysical processes which were taking place at recombi-nation time. Thus, in order to extract more informa-tion from the fossil light, it is necessary to obtain muchfiner pictures of CMB anisotropies. Many experiments

    ground-based, space-based or balloon-borne have fol-lowed COBE. The ultimate space mission, expectedto reach unprecedented levels of accuracy, sensitivityand resolution, is the Planck surveyor, whose launchby the European Spatial Agency is planned in 2007.Planck angular resolution is expected to be well underthe degree scale. The gain in resolution with respect toCOBE is illustrated by the simulation of figure 2. Themajor difference the two maps of fig. 1 and fig 2 is that

    Figure 2: Simulation of CMB anisotropies at expectedPlancks resolution and noise (Credit: G.P. Efstathiou)

    the size of the details visible in the COBE map is lim-ited by the resolution of the instruments whereas thetypical size of the fluctuation patterns seen on the sim-ulated Planck map is governed by physics: it roughly

    corresponds to the size of the horizon at recombinationtime, or about 300,000 light years. There is a conspic-uous main peak in the spatial power spectrum of theCMB corresponding to such an angular scale (fig. 3)

    Harmonic spectrum Gaussian stationary time se-ries are characterized by their power spectrum. Forhomogeneous isotropic processes on the sphere, the cor-responding quantity is called the harmonic spectrum.Denote T(, ) the temperature excess in direction(, ). It can be decomposed into spherical harmonicsas

    T(, ) =l=1

    lm=l

    almYlm(, ), (1)

    where the doubly indexed set Ylm(, ) of spherical har-monic functions is to the sphere what the sines andcosines (of the discrete Fourier transform) are to a one-dimensional interval. A spherical harmonic Ylm(, )accounts for spatial patterns with an angular resolu-tion of 2/l. For a given multipole l, there are 2l + 1spherical harmonics whose coefficients alm are uncorre-lated with variance independent ofm when T(, ) isan homogeneous isotropic process on the sphere. Theso-called harmonic spectrum, or C

    lspectrum, then is

    Cl = E{|alm|2} and its empirical estimate is

    Cl = 12l + 1

    lm=l

    |alm|2. (2)

    Let us stress at this point that the CMB sky stands still(on the human time scale): there is only one Universeand one realization of T(, ) available to us.

  • 8/3/2019 Jean-Francois Cardoso, Jacques Delabrouille and Guillaume Patanchon- Independent Component Analysis of the Cos

    3/6

    Spectral models Cosmologists have constructed mod-els of the CMB formation which predict the shape ofthe CMB spatial power spectrum as a function of a fewcosmologic parameters such as the energy density inthe Universe and other quantities of paramount impor-tance to cosmology. An example of the predicted CMB

    harmonic spectrum is given at figure 3 which is an (ap-propriately rescaled) plot of the Cl as a function of l.The plot shows in solid line an harmonic spectrum, as

    Figure 3: Measurements of the harmonic spectrum Cl

    by several independent experiments and the best fit bya low dimensional cosmological model.

    predicted by the best fit to the data of several CMBexperiments. The peaks in the harmonic spectrum aresignatures of acoustic oscillations, so called becausethey result from the competition between inertia andelasticity of the medium. This is the reason why theshapes and locations of the peaks directly carry physi-cal information.

    The superimposed squares in fig. 3 show how thespectrum is constrained by observations from COBE

    and Boomerang, a balloon-borne experiment which flewover the North pole in 1998. Note that the existenceof a main peak is confirmed by the experiments.

    Astronomical components The CMB is not ex-pected to be the only emission visible in the centime-ter range. Contributions of other sources are called

    foregrounds and should include Galactic dust emission,emission from very remote (and hence quasi point-like)

    galaxy clusters, and several others, including instru-mental effects. Figure 4 illustrates the situation.

    Figure 4: Several emissions contributes to the sky map

    in addition to the CMB. Credit: Bouchet & Gispert

    Separating the cosmic background from the fore-grounds is possible ICA style, because the instrumentsperform multi-spectral imaging, meaning that sky mapsare obtained in several frequency bands (or channels).The good news for ICA people is that the map builtfrom the i-th channel is expected to be well modeled(after some heavy pre-processing) as

    Ti(, ) =j

    AijSj(, ) + Vi(, )

    where Sj(, ) is the spatial pattern for the j-th com-ponent and Vi(, ) is an additive detector noise. Inother words, cosmologists expect to observe a noisy in-stantaneous (non convolutive) mixture of components.Further, these components should be statistically inde-pendent due to their physically distinct origins.

    Blind separation methods This paper is concernedwith blind component separation. The motivation for

  • 8/3/2019 Jean-Francois Cardoso, Jacques Delabrouille and Guillaume Patanchon- Independent Component Analysis of the Cos

    4/6

    a blind approach is obvious: even though some coeffi-cients of the mixture may be known in advance withgood accuracy (in particular those related to the CMB),some other components are less well known or pre-dictable. It is thus very tempting to run blind algo-rithms which do not require a priori information about

    the mixture coefficients.Several attempts at blind component separation for

    CMB imaging have already been reported. The firstproposal, due to Baccigalupi et al. use a non Gaussiannoise-free i.i.d. model for the components[3], hence fol-lowing the standard path to source separation. Oneproblem with this approach is that the most impor-tant component, namely the CMB itself, closely fol-lows a Gaussian distribution. It is well known that, ini.i.d. models, it is possible to accommodate at most oneGaussian component. It seems hazardous, however, touse a non Gaussian model when the main componentitself has a Gaussian distribution.

    Another reason why the i.i.d. modeling (which isimplicit in standard ICA) probably is not appropri-ate to our application: most of the components arevery much dominated by the low-frequency part of theirspectra. Thus sample averages taken through the dataset tend not to converge very quickly to their expectedvalues.

    Section 2 describes a generic method for the blindseparation of noisy mixtures of stationary processes,based on a spectral approximation to the likelihoodwhich seems to work very well on CMB data (see sec. 3).

    2. BLIND SEPARATION OF NOISYMIXTURES VIA SPECTRAL MATCHING

    We describe a spectral matching method for the blindseparation of noisy mixtures of stationary sources. Forease of exposition, we explain the method as applied totimes series rather than to images. Extension to imagesis straightforward (see sec. 3).

    The noisy stationary Gaussian model We modelan m 1-dimensional process y(t) = [y1(t); . . . ; ym(t)]as

    y(t) = As(t) + v(t) (3)where A is an m n matrix with linearly independentcolumns. The n-dimensional source process s(t) (thecomponents) and the m-dimensional noise process v(t)are modeled as real valued, mutually independent andstationary with spectra Rs() and Rv() respectively.The observed process then has spectrum

    Ry() = ARs()A + Rv(). (4)

    Matrices Ry(), Rs() and Rv() are sometimes calledspectral covariance matrices. Independence betweencomponents implies that Rs() is a diagonal matrix:

    [Rs()]ij = ijPi() 1 i, j n

    where Pi() is the power spectrum of the ith sourceat frequency and ij is the Kronecker symbol. Forsimplicity, we also assume that the observation noise isuncorrelated across sensors:

    [Rv()]ij = ijVi() 1 i, j m (5)

    Spectral statistics. A key feature of our methodis that it uses low dimensional statistics obtained asaverages over spectral domains in Fourier space. In the1D case, spectral domains simply are frequency bands.Let (12 , 12) = Qq=1Dq be a partition of the frequencyinterval (

    12 ,

    12) into Q domains (bands). We require

    them to be symmetric: f Dq f Dq.For any function g() of frequency, denote gq its

    average over the q-th spectral domain when sampled atmultiples of 1/T:

    gq = 1wq

    p

    TDq

    g pT

    q = 1, . . . , Q (6)

    where wq is the number of points in domain Dq.Denoting Y() the DFT ofT samples:

    Y() =1

    T

    T1

    t=0y(t)exp(

    2it), (7)

    the periodogram is Ry() = Y()Y() and its aver-aged version is

    Ryq = Y()Y()q. (8)Note that Y() = Y() for real data so that Ryqactually is a real valued matrix if Dq is a symmetricdomain.

    This sample spectral covariance matrix will be ourestimate for the corresponding averaged quantity

    Ryq = ARsqA + Rvq (9)where equality (9) results from averaging (4). Thestructure of the model is not affected by spectral aver-aging in the sense that Rs() and Rv() remain diago-nal matrices after averaging:

    Rsq = diag [P1q, . . . , Pnq] (10)Rvq = diag [V1q, . . . , Vmq] (11)

  • 8/3/2019 Jean-Francois Cardoso, Jacques Delabrouille and Guillaume Patanchon- Independent Component Analysis of the Cos

    5/6

    Blind identification via spectral matching Ourproposal for blind identification simply is to match thesample spectral covariance matrices Ryq, which de-pend on the data, to their theoretical values Ryq,which depend on the parameters to be estimated.

    It is up to the user to decide what the unknown

    parameters are. For instance, one may assume that thenoise is temporally white so that Viq = 2i for all q =1, Q. Similarly, if the spectrum of the i-th componentis known in advance, there is no need to include Piqin the adjustable parameter set. Whatever constraintsare imposed on A, on Piq or on Viq, we denote by the set of parameters needed to represent the unknownquantities in (A, Piq, Viq).

    The unknown quantities are estimated by mini-mizing the following mismatch measure

    () =

    Qq=1

    wq DRyq, Ryq

    (12)

    between the sample statistics and their expected values,where D(, ) is the measure of divergence between twom m positive matrices defined as

    D(R1, R2) = trR1R

    12

    log det(R1R12 ) m (13)(which is nothing but the Kullback divergence betweentwo m-dimensional zero-mean Gaussian distributionswith positive covariance matrices R1 and R2.)

    Thus, the data are summarized by the matrix setRy1, . . . ,

    RyQ and the dependence of () on is

    due to determining (by definition) A, Piq, and Viqwhich, in turn, determine Ryq via eqs (9-10-11).The reason for using the mismatch measure (12) is

    its connection to maximum likelihood principle. Mini-mizing (12) is equivalent to maximizing the likelihoodof the observations in a Gaussian stationary model wherethe DFT coefficients are approximated as being nor-mally and independently distributed. This is knownas the Whittle approximation and and has been usedfor noise-free ICA by Pham [5]. In the noise-free case,the objective (12) boils down to a joint diagonalizationcriterion which can be optimized very efficiently. Algo-rithms for the noisy case are briefly discussed next.

    Algorithmics We only hint at the algorithm issueof minimizing the mismatch (12); more details can befound in [4].

    Because of its connection to likelihood, criterion (12)can be optimized using the EM algorithm. In contrastto the case of noisy mixtures of non Gaussian sources,the Gaussian criterion (12) is easily minimized withEM. Two key points contribute to the simplicity of a

    spectral domain EM. First, the conditional expecta-tions which appear in EM are linear functions of thedata in the Gaussian model. Second, this linearity ispreserved through domain averaging, meaning that EMonly needs to operate on the sample covariance matri-ces

    Ryq. This set of matrices form a sufficient statis-

    tic in our model; it is all that is needed to run theEM algorithm. It is also of much lower size than thedata themselves (for reasonable choices of the spectraldomains).

    The EM algorithm provides us with a simple algo-rithm for the minimization of (12) but it may be slowin the finishing phase. This is true in particular in thecase of CMB data for which the components are wellbelow the noise level in high frequency bands, that is,A2ijPj(q) Vi(q) for some channel i and some compo-nent j for the largest qs. In this case, Pj(q) contributesvery little to Ryq, so that its variations have a veryweak impact on the likelihood. Thus, we have found

    necessary in our application, to complete the minimiza-tion of (12) with a quasi-Newton method. The EMis used to provide us with a very good starting pointso that the quasi-Newton descent (we use the BFGSmethod) can be initialized in a favorable situation.

    3. APPLICATION TO CMB DATA

    In the CMB application, one must process mixturesof images. The spectral matching method applies justas well for small images, with the only modificationthat the spectral domains Dq now are areas in thebi-dimensional Fourier plane rather than just spectralbands. The most natural choice for CMB data analy-sis is to use spectral rings because we expect isotropicspectra but one may elect to use other shapes like frac-tions of spectral rings. This could be a sensible choicein the presence of stripes: these are directional arti-facts (due the sky-scanning strategy) and, if they arebelieved to significantly pollute the data set, they canbe excluded from the averaging whenever they are welllocalized in the Fourier plane.

    When processing large areas of the sky, one may nolonger ignore curvature effects: the 2D Fourier trans-form must be replaced by an harmonic transform as

    in (1) and the spectral domains should logically includeall (l, m) pairs for a given l. The only effect, fromthe algorithmic point of view, of using an harmonictransform is that it takes longer to compute than a 2DFourier transform. Once the sample covariance matri-ces Ryq are computed, the method is identical.

    Room is lacking to include figures illustrating theperformance of the method. The interested reader isreferred to [4] for a performance report on realistic

  • 8/3/2019 Jean-Francois Cardoso, Jacques Delabrouille and Guillaume Patanchon- Independent Component Analysis of the Cos

    6/6

    synthetic data sets. Results on real data coming fromthe balloon-borne Archeops mission (2002) will be pre-sented at the conference.

    Multi-detector multi-component spectral esti-

    mation. In ICA, the parameters governing the source

    distributions are usually considered as nuisance pa-rameters as opposed to the parameter of interest whichis the mixing matrix. Indeed, in the noise-free case, anestimate of the mixing matrix is all that is needed toinvert the mixture and recover the sources. The situ-ation is different in the noisy case because the sourcesare best estimated by taking the noise into account.In our experiments, we use Wiener filtering in the fre-quency domain to estimate the sky maps of the sources.Implementing the Wiener filter requires to know thespectra of the sources and of the noise; these quantitiesare precisely those which are estimated (as nuisanceparameters) in the minimization of (12).

    It should be stressed that our spectral matchingmethod is of interest even in the non-blind case, thatis, when the mixing matrix A is known, because theminimization of the spectral mismatch (12) with re-spect to the source spectra Pi(q) (and also, possiblywith respect to the noise V(q)) for a fixed value of Ayields a non parametric estimate of the source spectra.These are maximum likelihood estimates (in the Whit-tle approximation) which take into account the jointinformationprovided by all the channels and these esti-mates are obtained without even separating the sources.This is to be compared with a more naive approach to

    the estimation of the source spectral density in whichone would first try to separate the sources and thenperform regular univariate spectral estimation. In thiscase, it is not clear how to take into account the effectsof the Wiener filtering and how to remove the effectsof the additive noise. In contrast, the spectral match-ing method directly yields ML spectral estimates forall components.

    4. CONCLUSION

    Blind source separation appears as a promising toolfor CMB studies. We have proposed a method which

    exploits spectral diversity and boils down to adjust-ing smoothed versions of the spectral covariance ma-trices (4) to their empirical estimates.

    The matching criterion is equivalent to the likeli-hood in the Whittle approximation, providing us witha principled way of exploiting the spectral structure ofthe process and leading to the selection of ( 12) as anobjective function. This objective function uses a sum-mary of the data in the form of a set of average spectral

    covariance matrices, offering large computational sav-ings, especially when dealing with images.

    Implementing the EM algorithm on spectral do-mains offers a simple method to minimize the objec-tive function but it may be necessary to complementit with a specialized optimization method. Our ap-

    proach is able to estimate all the parameters: mixingmatrix, average source spectra, noise level in each sen-sor. These parameters can then be used to reconstructthe sources (if needed) via Wiener filtering.

    We conclude by noting that, in the presence of nonGaussian data, the Whittle approximation remains avalid tool (the minimization of (12) still yields consis-tent estimates) but it does not capture all the probabil-ity structure, even in large samples. Thus, more workremains to be done, in particular to exploit the highnon Gaussianity of some components (galaxy clustersappears as quasi point-like structures at the currentresolutions).

    Resources For the interested reader, many wonder-ful sites are available on the Internet to get startedwith the big bang, the Planck mission and everythingcosmologic. Among those, I have particularly appreci-ated Ned Wrights cosmology tutorial [2] (with its sec-tion News of the Universe) and the extensive materialmaintained by Wayne Hu on his home page [1] at theUniversity of Chicago.

    5. REFERENCES

    [1] The home page of Wayne Hu at the University ofChicago. http://background.uchicago.edu/whu/.

    [2] Ned wrights cosmology tutorial.http://www.astro.ucla.edu/ wright/cosmolog.htm.

    [3] C. Baccigalupi, L. Bedini, et al. Neural networksand separation of cosmic microwave backgroundand astrophysical signals in sky maps. Monthly No-tices of the Royal Astronomical Society, 318:769780, November 2000.

    [4] J. Delabrouille, J.-F. Cardoso, and G. Patanchon.Multidetector multicomponent spectral matching

    and applications for CMB data analysis. Submittedto Monthly Notices of the Royal Astronomical Soci-ety, 2003. Available as http://arXiv.org/abs/astro-ph/0211504

    [5] Dinh-Tuan Pham. Blind separation of instanta-neous mixture of sources via the gaussian mutualinformation criterion. In Proc. EUSIPCO, pages621624, 2000.

    http://arxiv.org/abs/astro-ph/0211504http://arxiv.org/abs/astro-ph/0211504http://arxiv.org/abs/astro-ph/0211504http://arxiv.org/abs/astro-ph/0211504