The ubiquitous problem of learning system parameters for dissipative two-level quantum systems: Fourier analysis versus Bayesian estimation

Sophie G. Schirmer1 and Frank C. Langbein2

1College of Science (Physics), Swansea University, Singleton Park, Swansea, SA2 8PP, United Kingdom

2College of Physical Sciences & Engineering (Computer Science & Informatics), Cardiff University, 5 The Parade, Cardiff, CF24 3AA, United Kingdom

(Dated: December 16, 2014)

We compare the accuracy, precision and reliability of different methods for estimating key system parameters for two-level systems subject to Hamiltonian evolution and decoherence. It is demonstrated that the use of Bayesian modelling and maximum likelihood estimation is superior to common techniques based on Fourier analysis. Even for simple two-parameter estimation problems, the Bayesian approach yields higher accuracy and precision for the parameter estimates obtained. It requires less data, is more flexible in dealing with different model systems, can deal better with uncertainty in initial conditions and measurements, and enables adaptive refinement of the estimates. The comparison results show that this holds for measurements of large ensembles of spins and atoms limited by Gaussian noise as well as projection-noise-limited data from repeated single-shot measurements of a single quantum device.

PACS numbers: 03.67.Lx, 03.65.Wj

I. INTRODUCTION

Quantum systems play an important role in atomic and molecular physics, chemistry, material science and many important current technologies such as nuclear magnetic resonance imaging [1] and spectroscopy [2], promising nascent quantum technologies such as spintronic devices [3], and potential future technologies such as quantum information processing [4]. Novel applications require increasingly sophisticated control, and accurate and precise models to facilitate controlled manipulation of their dynamics.

Although theoretical device modelling remains important, system identification and data-driven models are becoming increasingly important in many areas of science and technology to accurately describe individual systems [5]. System identification comprises a range of problems including model identification, model discrimination and model verification. Once a model has been selected, the task often reduces to identifying parameters in the model from experimental data. In the quantum domain this is often data from one of the many types of spectroscopy, from magnetic resonance to laser to electron transmission spectroscopy, depending on the physical system. More recently, single-shot measurements of quantum systems have also become important for quantum devices relying on individual quantum states.

Fourier analysis of the spectra is frequently used to identify model parameters such as chemical shifts and relaxation rates by examination of the positions and shape of peaks in a free-induction-decay (FID) spectrum [6]. Fourier analysis of Rabi oscillation spectra has also been used to identify Hamiltonians [7, 8], as well as decoherence and relaxation parameters for two-level systems [9], and concurrence spectroscopy [10] has been applied to determine information about coupling between qubits. For more complex systems, Bayesian techniques and maximum likelihood estimation [11] have proved to be extremely valuable to construct data-driven models to identify Hamiltonian parameters [12] and decoherence parameters for multi-level systems [13]. Bayesian techniques have also been applied for adaptive Hamiltonian learning using sequential Monte-Carlo techniques [14].

In this work we revisit simpler systems: two-level systems subject to decoherence, one of the simplest but arguably most important models in quantum physics. The model is ubiquitous in magnetic resonance imaging, where the magnetization signal from protons (spin-1/2 particles) precessing and dephasing in a magnetic field is the basis for non-invasive, in-vivo imaging. In quantum information it describes qubits as the fundamental building blocks subject to decoherence. Therefore, characterization of two-level systems is extremely important. We compare two frequently used estimation strategies based on Fourier analysis and a Bayesian approach combined with maximum likelihood estimation, for the ubiquitous parameter estimation problem of a two-level system subject to decoherence. We consider accuracy, precision and efficiency for different systems and noise models, including Gaussian noise, typically encountered for large ensemble measurements, and projection noise, typically present in data from repeated single-system measurements.

II. SYSTEM AND EXPERIMENTAL ASSUMPTIONS

In this section we introduce our dynamic model of the physical system and our assumptions about initialisation and measurement of the system. We focus in particular on the different options for the measurements depending on the nature of the physical system and hence the measurements from which we wish to estimate the parameters.

A. Dynamic system model

The state of a quantum system is generally described by a density operator ρ, which, for a system subject to a Markovian environment, evolves according to a Lindblad-type master equation

ρ̇(t) = −i[H0, ρ(t)] + D[V]ρ(t),
D[V]ρ = VρV† − ½(V†Vρ + ρV†V),    (1)

where H0 represents the Hamiltonian and V the dephasing operator. If the dephasing occurs in the same basis as the Hamiltonian evolution then we can choose a basis in which both H0 and V are diagonal. For a two-level system we can thus write H0 = ωσz and V = √γ2 σz, where γ2 ≥ 0, leaving us essentially with two core system parameters to identify, ω and γ2, or equivalently γ = 2γ2.

B. Initialization and Readout

A basic experiment involves initializing the system in some state |ψI⟩ and measuring the decay signal, a so-called free-induction decay experiment. The measured signal depends on the system parameters as well as the initial state and the measurement. Taking the measurement operator to be of the form

M = ( cos θM   sin θM
      sin θM  −cos θM ),    (2)

and taking the initial state to be

|ψI⟩ = cos(θI)|0⟩ + sin(θI)|1⟩,    (3)

the measurement signal is of the form

p(t) = e^{−γt} cos(ωt) sin(θI) sin(θM) + cos(θI) cos(θM).    (4)

Assuming the system is initially in the ground state |0⟩, e.g., corresponding to spins being aligned with an external magnetic field, the initialization procedure corresponds to applying a short pulse to put the system into a superposition of the ground and excited state. Notice that if the system is not well characterized then it is likely to be infeasible to prepare the system in a well-defined superposition state with a known angle θI. Rather, θI becomes an additional parameter to be estimated.
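For concreteness, the measurement model (2)-(4) is easy to evaluate numerically; the short Python sketch below (function name and default angles are ours, not part of the paper) simply implements Eq. (4).

```python
import numpy as np

def model_signal(t, omega, gamma, theta_I=np.pi/2, theta_M=np.pi/2):
    """Ideal FID-type signal p(t) of Eq. (4) for a dephasing two-level system.

    t       : array of measurement times
    omega   : level splitting (frequency) in the chosen units
    gamma   : dephasing rate
    theta_I : initial-state angle, |psi_I> = cos(theta_I)|0> + sin(theta_I)|1>
    theta_M : measurement-axis angle of the operator M in Eq. (2)
    """
    t = np.asarray(t, dtype=float)
    return (np.exp(-gamma * t) * np.cos(omega * t) * np.sin(theta_I) * np.sin(theta_M)
            + np.cos(theta_I) * np.cos(theta_M))

# Example: omega = 1, gamma = 0.1, sampled at 100 points in [0, 30]
times = np.linspace(0, 30, 100)
p = model_signal(times, omega=1.0, gamma=0.1)
```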

The operator M corresponds to measuring the system with regard to an axis tilted by an angle θM from the system axis in the (x, z) plane, which can describe many different experimental situations. In an FID experiment in NMR, for example, an x-magnetization measurement corresponds to setting θM = π/2. In a Rabi spectroscopy experiment of a quantum dot, where the population of the ground and/or excited state is measured, e.g., via a fluorescence measurement, we would typically set θM = 0. In some situations, such as the examples mentioned, the Hamiltonian and measurement bases may be well known. In other situations, however, such as in a double quantum dot system with charge-state read-out via a single-electron transistor, θM may a priori be at most approximately known. In this case θM becomes an additional parameter to be estimated. In this work we employ a formalism that does not require either the initial state or the measurement to be known a priori.

C. Continuous vs discrete-time and adaptive measurements

In an FID experiment we could in principle measure the decay signal continuously. However, modern receivers typically return a digitized signal, i.e., a vector of time samples, usually the signal values integrated over short time intervals Δt. For this type of readout, the number N of time samples and their spacing Δt are usually fixed, or at least selected prior to the start of the experiment. In this set-up there is usually little opportunity for adaptive refinement short of simply repeating the entire experiment with shorter Δt or larger N.

In other situations, such as Rabi spectroscopy [15], each measurement corresponds to a separate experiment. For example, we prepare the system in a certain initial state, let it evolve under some Hamiltonian (with parameters to be estimated) for some time t before performing a measurement to determine the state of the system. In this case we are more flexible and can in principle choose the measurement times adaptively, trying to optimize the times to maximize the amount of information obtained in each measurement.

Here we mainly consider the case of a regularly sampled measurement signal, but we also briefly consider how the estimation can be improved in the latter case by adaptive sampling, with particular focus on the comparison between the different estimation strategies.

D. Ensemble vs single-system measurements

In many settings, from NMR and MRI to electron spin resonance (ESR) to atomic ensembles in atom traps, large ensembles of spins or atoms are studied, resulting in ensemble-average measurements. In this setting the backaction from the measurement is negligible and the system can be measured continuously to obtain a measurement signal s(t). The noise in the signal is well approximated by Gaussian noise, which can be simulated by adding a zero-mean Gaussian noise signal g(t) to the ideal signal p(t), i.e., the measured signal is d(t) = p(t) + g(t). By the Law of Large Numbers and the Law of the Iterated Logarithm [16] this gives a Gaussian distribution for d(t) with mean p(t) and variance σ² ∼ (log log Ne)/(2Ne) for Ne → ∞. This is a good error model for simulating physical systems and estimating the noise in actual measurement data when the ensemble size Ne is large.

More recently, single quantum systems, such as trapped ions [17], trapped atoms [18], single electron spins [19], and charge states in Josephson junctions [20], have become an important topic for research because of their potential relevance to quantum technologies. Given a single copy of a two-level system, measurement of any observable yields only a single bit of information indicating a 0 or 1 result. To determine the expectation value of an observable the experiment has to be repeated many times and the results averaged. Furthermore, due to the backaction of the measurement on the system, we can generally only perform a single projective measurement. To obtain data about the observable at different times the system has to be re-initialized and the experiment repeated for each measurement. In this context the ensemble size Ne is the number of times each experiment on a single copy of the system is repeated. As repetitions are time- and resource-intensive, it is desirable to keep Ne small. However, this means the precision of the expectation values of observables becomes limited by projection noise, following a Poisson distribution. To simulate experiments of this type we compute the probability p1 of measurement outcome 1 for the simulated system, generate Ne random numbers rn between 0 and 1, drawn from a uniform distribution, and take N1/Ne as the simulated data point, where N1 is the number of rn ≤ p1.
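This sampling recipe can be sketched in a few lines of Python; the helper below is ours and is equivalent to drawing N1 from a binomial distribution with parameters Ne and p1.

```python
import numpy as np

def projection_noise_sample(p1, Ne, rng=None):
    """Simulate one data point from Ne repeated single-shot measurements.

    p1 : ideal probability of obtaining outcome 1 at the chosen time
    Ne : number of repetitions of the single-shot experiment
    Returns the relative frequency N1/Ne of outcome 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    r = rng.uniform(0.0, 1.0, size=Ne)   # Ne uniform random numbers
    N1 = np.count_nonzero(r <= p1)       # number of simulated '1' outcomes
    return N1 / Ne

# Example: outcome probability 0.7 estimated from Ne = 100 shots
d_k = projection_noise_sample(0.7, 100)
```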

III. PARAMETER ESTIMATION STRATEGIES

This section introduces the three parameter estimation strategies, based on Fourier and Bayesian analysis, that we wish to compare.

A. Fourier-spectrum based estimation

A common technique to find frequency components in a noisy time-domain signal is spectral analysis. Consider a measurement signal of the form

p(t) = a + b e^{−γt} cos(ω0 t),  t ≥ 0,    (5)

which corresponds directly to measurement (4) if we set a = cos θI cos θM and b = sin θI sin θM. Subtracting the mean of the signal ⟨p(t)⟩ = a and rescaling gives f(t) = (p(t) − a)/b. To account for the fact that f(t) is defined only for t ≥ 0 we multiply f(t) by the Heaviside function

u(t) = { 0  if t < 0,
         1  if t ≥ 0.

The Fourier transform of u(t)f(t) = u(t) e^{−γt} cos(ω0 t) is

F(ω) = (γ + iω) / ((γ + iω)² + ω0²)

and the power spectrum is P(ω) = |F(ω)|². Differentiating with respect to ω and setting the numerator to 0 shows that |F(ω)|² has extrema for ω = 0 and (γ² + ω²)² − ω0²(4γ² + ω0²) = 0. The real roots ω∗ of this equation satisfy

E1(ω0, γ) = ω∗² + γ² − ω0 √(4γ² + ω0²) = 0    (6)

and the corresponding maximum of the power spectrum is

P∗ = P(ω∗) = (ω0² + ω0 √(4γ² + ω0²)) / (8γ²ω0²) = (ω0² + ω∗² + γ²) / (8γ²ω0²).

Defining the error term

E2(ω0, γ) = 8γ²ω0² P∗ − (ω0² + γ² + ω∗²),    (7)

we can estimate the frequency ω0 and dephasing rate γ from the peak height P∗ and position ω∗ via

Strategy 1:   (ω0, γ) = arg min_{ω0′, γ′} |E1(ω0′, γ′)| + |E2(ω0′, γ′)|.    (8)

Determining the maximum P∗ and its location ω∗ from |F(ω)|², we may choose ω0′ = ω∗ and γ′ = √(2ω∗² / (8ω∗²P∗ − 1)) as a starting point for a local minimization routine, provided γ ≪ ω0, as is usually the case.

Instead of estimating the height of the peak, estimates for ω0 and γ can also be obtained using the width of the peak. Let ω1,2 be the (positive) frequencies for which |F(ω)| assumes half its maximum. One way to estimate ω1,2 is to take the minimum and maximum of {ω : |F(ω)| ≥ ½ max |F|}, assuming that sufficient measurements have been made such that F is symmetric and peaked, i.e., it has low skewness and high kurtosis.

The full width at half maximum 2d of |F(ω)| is |ω2 − ω1| and we can derive the following expression:

d = √(ω0² − γ² + 2√3 ω0 γ) − √(ω0² − γ²)
  = √(ω∗² + 2√3 γ √(ω∗² + γ²)) − ω∗.

Hence, given the location ω∗ and half-width d of the peak, solving for γ gives the alternative

Strategy 2:   γ = (1/6) √(6 g(ω∗, d) − 18ω∗²),   ω0 = √(ω∗² + γ²),    (9)

where g(ω∗, d) = √(9ω∗⁴ + 12d²ω∗² + 12d³ω∗ + 3d⁴).

Strategy 2, based on peak positions and linewidths, is probably the most common approach for estimating frequencies and R2-relaxation rates from FID signals in NMR and in many other contexts. The expressions for the power spectrum, the peak heights and the linewidths are more complicated than those for quadrature measurements, as we only have a real cosine signal, but the approach is fundamentally the same.
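For concreteness, the following Python sketch implements simple versions of both Fourier-based strategies: it locates the peak of the discrete power spectrum, estimates the half-width of |F(ω)| at half maximum, and applies Eqs. (6)-(9). The peak-finding and spectrum-normalization details are minimal choices of ours rather than a transcription of the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def fourier_estimates(t, d):
    """Estimate (omega0, gamma) from a regularly sampled signal d via the
    power-spectrum peak (Strategy 1) and its width (Strategy 2)."""
    dt = t[1] - t[0]
    dc = d - d.mean()
    dp = dc / np.max(np.abs(dc))                  # centre and rescale the data
    F = np.fft.rfft(dp) * dt                      # crude one-sided FT estimate
    w = 2 * np.pi * np.fft.rfftfreq(len(dp), dt)  # angular frequency grid
    P = np.abs(F) ** 2

    k = np.argmax(P[1:]) + 1                      # peak location, skipping DC
    w_star, P_star = w[k], P[k]

    # half-width d of the peak of |F(w)| at half maximum
    idx = np.where(np.abs(F) >= 0.5 * np.abs(F[k]))[0]
    half_width = 0.5 * (w[idx.max()] - w[idx.min()])

    # Strategy 2: closed-form estimates, Eq. (9)
    g = np.sqrt(9 * w_star**4 + 12 * half_width**2 * w_star**2
                + 12 * half_width**3 * w_star + 3 * half_width**4)
    gamma_2 = np.sqrt(max(6 * g - 18 * w_star**2, 0.0)) / 6
    omega_2 = np.sqrt(w_star**2 + gamma_2**2)

    # Strategy 1: minimise |E1| + |E2| of Eqs. (6)-(7), cf. Eq. (8)
    def cost(x):
        om, ga = x
        E1 = w_star**2 + ga**2 - om * np.sqrt(4 * ga**2 + om**2)
        E2 = 8 * ga**2 * om**2 * P_star - (om**2 + ga**2 + w_star**2)
        return abs(E1) + abs(E2)

    ga0 = np.sqrt(2 * w_star**2 / max(8 * w_star**2 * P_star - 1, 1e-12))
    omega_1, gamma_1 = minimize(cost, x0=[w_star, ga0], method="Nelder-Mead").x

    return (omega_1, gamma_1), (omega_2, gamma_2)
```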


B. Bayesian and Maximum Likelihood Approach

Given discrete time-sampled data represented by a row vector d of length Nt containing the measurement results obtained at times tn for n = 1, ..., Nt, let p be the vector of the corresponding measurement outcomes predicted by the model; p depends on the model parameters, here ω0 and γ. Assuming Gaussian noise with variance σ² we define the joint likelihood [11]

P(p, d, σ) = (√(2π) σ)^{−Nt} exp[ −||p − d||₂² / (2σ²) ].    (10)

If the noise level σ of the data is not known a priori, we can eliminate this parameter following the standard Bayesian approach by integrating over σ from 0 to ∞, using the Jeffreys prior σ^{−1}. This gives

P(p, d) = [Γ(Nt/2 − 1) / (4π^{Nt/2})] ||p − d||₂^{2−Nt},    (11)

where Γ is the Gamma function. It is usually more convenient and numerically robust to work with the (negative) logarithm of the likelihood function, the so-called log-likelihood. When the noise level σ is known the log-likelihood reduces to

L(p, d, σ) = −log P(p, d, σ) = (1/(2σ²)) ||p − d||₂² + const,    (12)

where the constant is usually omitted; when σ is not known a priori we obtain instead

L(p, d) = −log P(p, d) = ((1 − Nt)/2) log ||p − d||₂² + const.    (13)

The idea of maximum likelihood estimation is to find the model parameters that maximize this (log-)likelihood function. To simplify this task, we follow a similar approach as in previous work [11–13] and express the signals as linear combinations of a small number mb of basis functions gm(t) determined by the functional form of the signals. In our case the measurement signal p(t) can be written as a linear combination of mb = 2 basis functions,

p(t) = α1 g1(t) + α2 g2(t),    (14)

with g1(t) = 1 and g2(t) = e^{−γt} cos(ω0 t). As the basis functions are not orthogonal, we define an orthogonal projection of the data onto the basis functions sampled at times tn as follows. Let G be the matrix whose rows are the basis functions gm(t) evaluated at times tn, Gmn = gm(tn), and let E diag(λm) E† be the eigendecomposition of the positive-definite matrix GG†. Then H = diag(λm^{−1/2}) E† G is a matrix whose rows form an orthonormal set, HH† = I, and we define the orthogonal projection of the data vector onto the basis functions by h = H d†.

Projecting the data onto a linear combination of basis functions introduces mb nuisance parameters αm. Using a standard Bayesian approach we can eliminate them by integration using a uniform prior, and following further simplifications [11], it can be shown that the log-likelihood (11) becomes

L(ω0, γ | d) = ((mb − Nt)/2) log[ 1 − mb⟨h²⟩ / (Nt⟨d²⟩) ],    (15)

where ⟨d²⟩ = (1/Nt) Σ_{n} dn² and ⟨h²⟩ = (1/mb) Σ_{m} hm², and we have dropped the constant offset. This log-likelihood function can be evaluated efficiently, and we can use standard optimization algorithms to find its maximum, motivating

Strategy 3:   (ω0, γ) = arg max_{ω0′, γ′} L(ω0′, γ′ | d).    (16)

Note that in general, finding the global maximum of the log-likelihood function is non-trivial as it is non-convex, tends to become sharply peaked, especially for large data sets, and may have many local extrema, necessitating global search techniques. However, for our two-parameter case, finding the global optimum over reasonable ranges for ω and γ proved straightforward using either standard quasi-Newton or even Nelder-Mead simplex optimization. For more complex functions a density estimator such as particle filters (sequential Monte Carlo methods) or kernel density estimators may be used, which also enable effective determination of the maximum.
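The marginalized log-likelihood (15) is straightforward to code. The sketch below is a minimal Python implementation for the basis g1(t) = 1, g2(t) = e^{−γt} cos(ω0 t); the coarse grid search followed by Nelder-Mead refinement is our own simple choice for locating the maximum, consistent with but not identical to the optimizers mentioned above.

```python
import numpy as np
from scipy.optimize import minimize

def log_likelihood(omega, gamma, t, d):
    """Marginalized log-likelihood L(omega, gamma | d) of Eq. (15)."""
    G = np.vstack([np.ones_like(t),                          # g1(t) = 1
                   np.exp(-gamma * t) * np.cos(omega * t)])  # g2(t)
    lam, E = np.linalg.eigh(G @ G.T)        # eigendecomposition of G G^T
    H = np.diag(lam ** -0.5) @ E.T @ G      # rows form an orthonormal set
    h = H @ d                               # projected data
    Nt, mb = len(d), G.shape[0]
    d2 = np.mean(d ** 2)
    h2 = np.mean(h ** 2)
    return 0.5 * (mb - Nt) * np.log(1.0 - mb * h2 / (Nt * d2))

def strategy3(t, d, omega_range=(0.2, 2.0), gamma_range=(0.05, 0.4)):
    """Strategy 3: maximize Eq. (15) by a coarse grid search followed by
    local Nelder-Mead refinement."""
    om_grid = np.linspace(*omega_range, 60)
    ga_grid = np.linspace(*gamma_range, 30)
    grid = [(log_likelihood(om, ga, t, d), om, ga)
            for om in om_grid for ga in ga_grid]
    _, om0, ga0 = max(grid)
    res = minimize(lambda x: -log_likelihood(x[0], x[1], t, d),
                   x0=[om0, ga0], method="Nelder-Mead")
    return res.x  # (omega_est, gamma_est)
```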

IV. EVALUATION AND COMPARISON OF ESTIMATION STRATEGIES

We now compare the three strategies introduced in the previous section for ensemble and single-shot measurements, discuss the uncertainty in the estimated parameters, and show how Strategy 3 enables the estimation of additional initialisation and measurement parameters. For this we use 10 systems with different values for ω and γ, given in Table I, and collect measurement data from simulations with the relevant noise models. For each system the signal was sampled uniformly at Nt = 100 time points tk ∈ [0, 30]. We assume that we have some order-of-magnitude estimate of the system frequency based on the physical properties of the system, which gives us a range for the values of ω and fixes a natural frequency unit. Without loss of generality we express both ω and γ in these units; accordingly, all times quoted in the following are in units of the inverse reference frequency. In our simulations we choose ω ∈ [0.2, 2] and γ ∈ [0.05, 0.4].

To calculate an average relative error for the parameter estimates, Ns = 1000 runs were performed for each system and noise level and the error computed as

e(ω) = (1/Ns) Σ_{n=1}^{Ns} ω^{−1} |ω_est^{(n)} − ω|,    (17a)
e(γ) = (1/Ns) Σ_{n=1}^{Ns} γ^{−1} |γ_est^{(n)} − γ|,    (17b)

where ω and γ are the actual parameters of the simulated system and ω_est^{(n)} and γ_est^{(n)} are the estimated values for the nth run.

TABLE I: Model parameters for the 10 models compared below (in the dimensionless units defined above).
ω: 1.0000 0.9000 0.5003 0.7304 1.2161 1.6211 0.2218 1.5195 0.7551 0.8029
γ: 0.1000 0.1000 0.1243 0.1875 0.2031 0.0993 0.1234 0.0751 0.0533 0.1921

A. Ensemble measurements with Gaussian noise

To compare the different estimation strategies for discretely sampled signals with Gaussian noise we simulate the measurement result dk at time tk. The expected signal p(tk) was calculated based on the selected model and Gaussian noise of mean 0 and standard deviation σ added to each value. Fig. 1 (left) shows an example of an ideal measurement signal and simulated data with uniform sampling at times tn = nΔt with Δt = 0.3.

Fig. 2 compares the errors according to (17) for the three strategies. Strategy 2, probably the most common technique for estimating the frequency and dephasing parameter using the position and width of the peak in the Fourier spectrum, actually gives the least accurate and least precise estimates: the median error of the estimated values is large, as is the spread of the errors for different systems, as indicated by the large error bars. Strategy 1 produces slightly improved estimates, but parameter estimates based on Strategy 3 are significantly better. The results are similar for ω and γ. Fig. 3 furthermore suggests that Strategies 1 and 2 are not unbiased estimators: the mean of the distribution over the estimation runs does not appear to converge to the true value of the parameter even for the lowest noise level and 1000 runs. Strategy 3, however, appears to be an unbiased Gaussian estimator.

One interesting feature of Strategies 1 and 2 is that the median estimation errors appear to be almost constant over the range of noise levels considered, while for Strategy 3 the error increases with increasing noise level, as one would expect. A probable reason for this is that the uncertainties in the position, and indirectly the width, of the peaks in the Fourier spectrum primarily depend on the length of the signal T. Specifically, for a fixed number of samples, [9] found that the uncertainty in the parameter estimates was mainly proportional to 1/√T. This would explain why the accuracy of the estimates obtained from the Fourier-based strategies appears roughly constant, as the signal length and the number of samples were both fixed in our simulated experiments (T = 30, Nt = 100). So it might be argued that the Fourier-based strategies are less sensitive to noise. However, it is important to notice that even for noise with σ = 0.1, Strategy 3 still outperforms the other strategies in all cases.

Furthermore, accurately and precisely estimating the location and width of a peak in the Fourier spectrum for a relatively short, noisy signal can be challenging, as illustrated by the power spectrum examples in Fig. 4. The blue bars show |F(k)|², where F(k) is the discrete Fourier transform of the measured discrete signal,

F(k) = Σ_{n=1}^{Nt} d′n e^{−2πi(k−1)(n−1)/Nt},   1 ≤ k ≤ Nt,    (18)

computed using the Fast Fourier Transform (FFT) after centering and rescaling, d′ = (d − d̄)/dmax with d̄ = (1/Nt) Σ_{n=1}^{Nt} dn and dmax = max_n |dn − d̄|. The red curve is an approximation to the continuous Fourier transform

F(ω) = ∫_{−∞}^{∞} f(t) e^{−iωt} dt ≈ Σ_{n=1}^{Nt} d′n e^{−iωtn} ½(Δtn + Δtn−1),    (19)

where the integral has been approximated using the trapezoidal rule with Δtn = tn+1 − tn = T/Nt for n = 1, ..., Nt − 1 and Δt0 = ΔtNt = 0. The left figure shows a “good” power spectrum for a low-noise input signal. Even in this case the frequency resolution is limited, but the peak has a more or less Lorentzian shape and the width is well defined. However, for increasing noise the peak can become increasingly distorted (center) and for very noisy signals it may even become split (right), making width estimation difficult; the assumptions about kurtosis and skewness are then no longer valid.
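A direct transcription of the trapezoidal approximation (19) is given below; the function name and the vectorization over ω are ours, and the weights follow the convention Δt0 = ΔtNt = 0 stated above.

```python
import numpy as np

def approx_continuous_ft(t, d, omega):
    """Approximate F(omega) = int f(t) exp(-i*omega*t) dt via Eq. (19),
    using trapezoidal weights 0.5*(dt_n + dt_{n-1}) with dt_0 = dt_Nt = 0."""
    t = np.asarray(t, float)
    dbar = d.mean()
    dp = (d - dbar) / np.max(np.abs(d - dbar))   # centred, rescaled data d'
    dt = np.diff(t)
    w = np.zeros_like(t)
    w[1:-1] = 0.5 * (dt[1:] + dt[:-1])           # interior trapezoidal weights
    w[0], w[-1] = 0.5 * dt[0], 0.5 * dt[-1]      # half-interval end weights
    omega = np.atleast_1d(omega)
    phase = np.exp(-1j * np.outer(omega, t))     # e^{-i omega t_n}
    return phase @ (w * dp)
```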

A further advantage of Strategy 3 is that it also provides a direct estimate of the noise variance [11],

σ² = (Nt⟨d²⟩ − mb⟨h²⟩) / (Nt − mb − 2),    (20)

and Fig. 5 shows that these estimates are very accurate across the board.

B. Single-system measurements

To assess whether there are significant differences in the performance of the different estimation strategies in the presence of projection noise, we repeat the analysis of the previous subsection for the same 10 model systems, sampled over the same time interval [0, 30], but with various levels of projection noise instead of Gaussian noise. Fig. 1 (right) shows an example of an ideal measurement signal and simulated data. Fig. 6 shows the relative errors for the different estimation strategies for the same model systems but subject to (simulated) projection noise. Strategy 3 again performs significantly better than the other strategies. Fig. 7 shows that the likelihood of the estimates increases with increasing number of repetitions Ne, as expected.


FIG. 1: Example of an ideal measurement signal and data from simulated experiments with Gaussian noise (σ = 0.05, left) and projection noise (each data point is the average of Ne = 100 binary-outcome single-shot measurements, right).

FIG. 2: Minimum, maximum and median of the relative error (averaged over 1000 runs for each system and noise level) of the ω (left) and γ estimates (right) as a function of the magnitude of the Gaussian noise for the 10 model systems of Table I.

It also shows again that the maximum likelihood for some model systems is consistently higher than for others, as was observed for Gaussian noise.

Fig. 8 shows that even the estimates for the noise variance σ² obtained automatically with Strategy 3 are very accurate, in that the results closely track the theoretical values σ² = 1/Ne expected for projection noise.

Overall this shows that although the noise strictly follows a Poisson distribution in this case, we still obtain very good estimates of the noise level for typical values of Ne using a Gaussian error model in the derivation of the maximum likelihood estimation strategy. So overall Strategy 3 appears to be consistently better than Strategies 1 and 2, independent of the types of measurements and their associated noise, for the two-level frequency and dephasing estimation problem.

C. Uncertainty in parameter estimates

The error statistics are useful for comparing different strategies in terms of both the accuracy (mean or median of the error) and precision (spread of the errors) of the estimated parameters, and the graphs above show that Strategy 3 outperforms the other strategies on both counts. However, obtaining such statistics requires data from many simulated experiments as well as knowledge of the actual system parameters. In practice, the actual values of the system parameters to be estimated are usually unknown, as otherwise there would be no need to estimate the parameters in the first place, so we cannot use error statistics directly to determine the accuracy and precision of our estimates. However, we can estimate the uncertainty of the parameter estimates, as discussed next.

For the Fourier-based strategies we have already mentioned that the uncertainty in the parameter estimates is mainly determined by the frequency resolution.


FIG. 3: Distribution of the ω and γ estimates for 1000 runs for model 1 with 1% Gaussian noise for Strategies 1, 2 and 3.

FIG. 4: Limits of the Fourier resolution and difficulty in estimating the peak width for short, noisy signals. Panels show normalized power spectra for (a) 1%, (b) 5% and (c) 10% Gaussian noise.

This resolution is limited by the sampling rate (fixed at Nt/T in our case) via the Nyquist-Shannon sampling theorem, and by the length of the sampled input signal, as the Gabor limit implies a trade-off between time and band limits.

For the maximum likelihood estimation we can obtain uncertainty estimates for the parameters from the width of the peak of the likelihood function around the maximum. We use the following simple strategy. Let (ω, γ) be the parameters for which the log-likelihood assumes its (global) maximum Lmax. To estimate the uncertainty in ω we compute the log-likelihood L(ω + δω, γ | d) for values δω where L is significantly larger than 0 (implemented by sampling under the assumption that L is not too far off a peaked distribution). Then we find the range of δω for which the actual likelihood satisfies

exp(L(ω + δω, γ | d)) ≥ ½ exp(Lmax)    (21)

to determine the full width at half maximum (FWHM) δω_FWHM of the likelihood peak in the ω direction. Assuming a roughly Gaussian peak, the uncertainty in ω is then given by

Δω = 2√(2 ln 2) δω_FWHM,    (22)

and similarly for γ. Fig. 9 shows the resulting peaks in the likelihood function for a typical experiment together with the FWHM estimates, showing greater uncertainty in the γ estimates.
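A simple numerical version of this FWHM procedure is sketched below. It scans the log-likelihood on a grid around the maximum instead of the sampling scheme described above, so it is an illustration of Eqs. (21)-(22) under that simplification; the likelihood is passed in as a callable of one parameter (e.g. ω at the fixed γ estimate).

```python
import numpy as np

def fwhm_uncertainty(loglike, x_max, L_max, scale, n=400):
    """FWHM-based uncertainty of Eqs. (21)-(22) for one parameter.

    loglike : function of a single parameter value, returning L(x | d)
    x_max   : location of the likelihood maximum for this parameter
    L_max   : log-likelihood value at the maximum
    scale   : half-width of the scan interval around x_max
    """
    xs = np.linspace(x_max - scale, x_max + scale, n)
    above = np.array([loglike(x) for x in xs]) >= L_max + np.log(0.5)  # Eq. (21)
    idx = np.where(above)[0]
    fwhm = xs[idx.max()] - xs[idx.min()]
    return 2.0 * np.sqrt(2.0 * np.log(2.0)) * fwhm                     # Eq. (22)
```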

Fig. 10 shows the resulting uncertainties for the parameter estimates obtained by Strategy 3 for the ensemble measurements. The uncertainty in the ω and γ estimates increases with the noise level, as one would expect, but for some systems the increase is steeper than for others. In particular, the uncertainties are greater for models 4, 5 and 10, for which γ is large, and lowest for model system 9, which has the lowest γ of the 10 models.


FIG. 5: The estimated noise levels σ of the measurement data for the 10 model systems of type 1 obtained from Strategy 3 closely track the actual noise levels of the simulated data (0.01, 0.02, 0.04, 0.05, 0.06, 0.08, 0.1).

The higher uncertainties coincide with dips in the maximum of the log-likelihood in Fig. 12. Although there is some variation in the value of the maximum log-likelihood between different runs for the same model and error level, the differences between the averages of the maximum log-likelihood over many runs for model systems 1 and 5 are several standard deviations, e.g. max log L ≈ 47.9 ± 3.2 (model 1, σ = 0.1) vs 34.3 ± 3.3 (model 5, σ = 0.1). This is consistent with the peak of the (log-)likelihood being lower and broader for model 5, resulting in higher uncertainty, and narrower and higher for model 1, resulting in less uncertainty. Fig. 11 shows that the uncertainties of the parameter estimates behave in the same way for single-shot measurements as a function of the projection noise level 1/√Ne.

This suggests that, given the same amount of data, the uncertainty of our estimates increases slightly with larger dephasing rates. A probable explanation is that the signal decays faster for higher dephasing and thus the signal-to-noise ratio of the later time samples is reduced. For higher dephasing rates the results could likely be improved by adding more samples at shorter times, or by introducing weights and reducing them for measurements obtained at longer times.

D. Estimating initialisation and measurement parameters

According to (14), Strategy 3 also provides information about the initialization and measurement procedure via estimates for the parameters α1 and α2. For this model we obtain

α1 ± α2 = cos θI cos θM ± sin θI sin θM = cos(θI ∓ θM)

and thus

θI = ½ [arccos(α1 − α2) + arccos(α1 + α2)],    (23a)
θM = ½ [arccos(α1 − α2) − arccos(α1 + α2)].    (23b)

Fig. 13 shows the estimates for the parameters α1 and α2, with error bars indicating the uncertainty, for the ensemble measurements. From the plot it is evident that α1 → 0 and α2 → 1 for σ → 0, which suggests θI = θM = π/2, in agreement with the values of the initialization and measurement angles used in the simulated experiments. Fig. 14 shows that the same is true in the case of projection noise for single-shot measurements: the estimates for the parameters α1 and α2 converge to α1 → 0 and α2 → 1 for Ne → ∞, again suggesting θI = θM = π/2, in agreement with the angles used in the simulated experiments. Similar behaviour is observed for other choices of the initialization and measurement angles.
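Recovering the initialization and measurement angles from the amplitude estimates is a direct application of Eq. (23); the small helper below is ours, and it clips values of α1 ± α2 that fall slightly outside [−1, 1] due to noise before taking the arccosine.

```python
import numpy as np

def angles_from_amplitudes(alpha1, alpha2):
    """Recover (theta_I, theta_M) from the amplitude estimates via Eq. (23)."""
    u = np.clip(alpha1 - alpha2, -1.0, 1.0)   # cos(theta_I + theta_M)
    v = np.clip(alpha1 + alpha2, -1.0, 1.0)   # cos(theta_I - theta_M)
    theta_I = 0.5 * (np.arccos(u) + np.arccos(v))
    theta_M = 0.5 * (np.arccos(u) - np.arccos(v))
    return theta_I, theta_M

# Example: alpha1 -> 0, alpha2 -> 1 gives theta_I = theta_M = pi/2
print(angles_from_amplitudes(0.0, 1.0))
```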

E. Fisher Information and Cramer Rao Bound

The Fisher information matrix I = (Iij) is defined by

Iij = E[ (∂L/∂θi)(∂L/∂θj) ] = ∫ (∂L/∂θi)(∂L/∂θj) f(x|θ) dx = −E[ ∂²L/(∂θi ∂θj) ],    (24)

where L(x, θ) is the log-likelihood of the measurement outcome x given θ and E the expectation with respect to x. If the estimator T for the parameters θ is unbiased, i.e. the mean square error of T is

MSE(T) = Bias(T)² + Var(T) = Var(T),    (25)

where Var(T) is the covariance matrix of the estimator, then the matrix C = Var(T) − I^{−1} must be positive semi-definite and ||C|| gives an estimate of how close we are to the Cramer-Rao limit.

Applied to our case, θ = (ω, γ) and

L(x|θ) = −N log(√(2π) σ) − (1/(2σ²)) Σ_{n=1}^{N} |p(θ, tn) − xn|²

with p(θ, t) = e^{−θ2 t} cos(θ1 t), and we get

∂L/∂θ1 = −(1/σ²) Σ_{n=1}^{N} [p(θ, tn) − xn] ∂p(θ, tn)/∂θ1,    (26a)
∂L/∂θ2 = −(1/σ²) Σ_{n=1}^{N} [p(θ, tn) − xn] ∂p(θ, tn)/∂θ2,    (26b)


FIG. 6: Minimum, maximum and median of the relative error of the ω (left) and γ estimates (right) as a function of the number of single-shot measurement repetitions per data point, Ne, for the 10 model systems of Table I.

FIG. 7: Maximum likelihood for the 10 model systems, averaged over 100 runs each, obtained from Strategy 3 for Ne = 100, 500, 1000, 5000 and 10000.

FIG. 8: Estimated Ne = ⟨σest⟩^{−2} for single-shot measurements for the 10 model systems, averaged over 100 runs each, obtained from Strategy 3. The Ne estimates closely track the actual number of repetitions of the single-shot measurements for the simulated data (100, 500, 1000, 5000, 10000).

FIG. 9: Estimation of the width of the likelihood peak with regard to ω and γ.

and

∂p(θ, tn)/∂θ1 = −tn e^{−θ2 tn} sin(θ1 tn) =: αn,    (27a)
∂p(θ, tn)/∂θ2 = −tn e^{−θ2 tn} cos(θ1 tn) =: βn.    (27b)

Setting pn = p(θ, tn) we have

(∂L/∂θ1)(∂L/∂θ2) = (1/σ⁴) ( Σ_{n=1}^{N} (αn pn − αn xn) ) ( Σ_{n=1}^{N} (βn pn − βn xn) )
                 = σ^{−4} ( AB − Σ_{n=1}^{N} cn xn + Σ_{m,n=1}^{N} αm βn xm xn ),

with A = Σ_n αn pn, B = Σ_n βn pn and cn = αn B + βn A, and similarly for the other partial derivatives. Noting

(1/(√(2π) σ)) ∫_{−∞}^{∞} xn exp[ −|pn − xn|² / (2σ²) ] dxn = pn    (28)


FIG. 10: Uncertainties of the ω (left) and γ estimates (right) for the 10 model systems (Table I) as a function of the Gaussian noise level.

FIG. 11: Uncertainties of the ω (left) and γ estimates (right) for the 10 model systems as a function of the projection noise level.

FIG. 12: Maximum of the log-likelihood (Strategy 3) for the 10 model systems (Table I) for different noise levels (σ = 0.1, 0.05, 0.01), averaged over 100 runs.

and assuming the estimator is unbiased, we finally obtain

FIG. 13: Estimates for the parameters α1 and α2, including uncertainty, as a function of the noise level σ for the 10 model systems.


FIG. 14: Estimates for the parameters α1 and α2 as a function of the number of single-shot repetitions Ne for the 10 model systems (type 1, averaged over 100 runs each).

FIG. 15: Minimum eigenvalue of the covariance matrix of the estimator minus the inverse Fisher information for the various models as a function of Ne.

the entries of the Fisher information matrix,

I11 = σ^{−4} ( A² − 2A Σ_n αn pn + Σ_{m,n} αm αn pm pn ),
I12 = σ^{−4} ( AB − Σ_n cn pn + Σ_{m,n} αm βn pm pn ),
I22 = σ^{−4} ( B² − 2B Σ_n βn pn + Σ_{m,n} βm βn pm pn ).    (29)

While our simulations suggest that the estimators based on Strategies 1 and 2 are not unbiased, Strategy 3 appears to be unbiased. Fig. 15, showing the smallest eigenvalue of the matrix C for our various test systems subject to projection noise, suggests that we indeed approach the Cramer-Rao bound for Ne → ∞ and σ = Ne^{−1/2}.
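For the Gaussian-noise signal model the Fisher information can also be evaluated in the standard compact form Iij = σ^{−2} Σ_n (∂p(θ, tn)/∂θi)(∂p(θ, tn)/∂θj), which follows from (26)-(28) once the expectations over x are taken. The sketch below uses this compact form together with the derivatives (27); it is our own simplification for checking estimates against the Cramer-Rao bound, not a transcription of Eq. (29).

```python
import numpy as np

def fisher_information(omega, gamma, t, sigma):
    """Fisher information for p(theta, t) = exp(-gamma*t)*cos(omega*t) with
    i.i.d. Gaussian noise of std sigma: I_ij = sigma^-2 * sum_n dp_i * dp_j."""
    t = np.asarray(t, float)
    env = np.exp(-gamma * t)
    d_omega = -t * env * np.sin(omega * t)   # alpha_n of Eq. (27a)
    d_gamma = -t * env * np.cos(omega * t)   # beta_n  of Eq. (27b)
    J = np.vstack([d_omega, d_gamma])
    return (J @ J.T) / sigma**2

def crb_gap(estimates, omega, gamma, t, sigma):
    """Smallest eigenvalue of C = Var(T) - I^{-1} (cf. Fig. 15); estimates is
    an (Ns, 2) array of (omega, gamma) estimates from repeated runs."""
    cov = np.cov(np.asarray(estimates).T)
    C = cov - np.linalg.inv(fisher_information(omega, gamma, t, sigma))
    return np.linalg.eigvalsh(C).min()
```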

V. ADAPTIVE ESTIMATION STRATEGIES

We may find that the accuracy or precision of the parameters obtained from an initial data set is not sufficient and we would like to improve it by acquiring additional data. Adaptive refinement strategies depend on the experimental set-up and system, and a detailed analysis of specific strategies is beyond the scope of this paper. However, we shall briefly discuss general approaches for iterative refinement for the Fourier and Bayesian estimation approaches and compare these for a few examples.

In some settings an entire measurement trace is obtained in a single experimental run and we are only able to sample the signal at regular time intervals restricted by the experimental equipment available. In this case the only options available to us are extending the signal length (keeping the sampling density or number of sample points constant) or repeating the experiment. If Fourier-based estimation strategies are used, the only way to really improve the resolution of the Fourier spectrum, and thus the accuracy and precision of our estimates, is by increasing the signal length. However, for a decaying signal the signal-to-noise ratio progressively deteriorates until the signal vanishes, limiting the accuracy and precision that are attainable. This is illustrated in Fig. 16 (left), which shows the (normalized) power spectrum for 1 to 1000 repetitions of the experiment for model system 4, assuming each individual measurement trace is subject to Gaussian noise with σ = 0.1 and the signals are averaged. For a single run of the experiment with this level of noise the peak is distorted, but the power spectrum quickly converges. The corresponding estimates for ω and γ (Fig. 16, center and right) also converge, but not to the true values. For Strategy 2 the ω and γ estimates are inaccurate. The optimization step in Strategy 1 appears to improve the accuracy of the ω estimates, but the γ estimates are still inaccurate. Strategy 3 does not suffer from these limitations and averaging multiple short traces should increase the accuracy of our estimates. Indeed the figure shows that this appears to be the case: both the ω and γ estimates converge to the true values.

This shows that Strategy 3 allows adaptive refinement even if all we are able to do is to repeat the experiment multiple times and average the measurement traces. However, in some situations we have more freedom. For Rabi spectroscopy, for example, each data point, corresponding to a measurement at a particular time tn, may be obtained in a separate experiment, and we may be free to choose the measurement times tn flexibly. In this case, having obtained Nt measurements we can try to choose the next measurement time t_{Nt+1} such that it optimizes the amount of information we gain from the experiment. We could ask, for example, considering all possible outcomes of a measurement at time t and their probability based on our current knowledge, at what time we should measure next to achieve the largest reduction in the uncertainty of our estimates. However, this would require calculating the uncertainty of the parameters (e.g., by estimating the width of the likelihood peaks) for all possible measurement times and outcomes.


FIG. 16: Iterative refinement by averaging of signal traces: power spectra (left), ω estimates (center) and γ estimates (right) for 1 to 1000 averaged runs.

FIG. 17: Prior likelihood after 25 time samples (tn = 1.2n) for model 1 (left) with (ωj, γj) samples (red dots), and the corresponding predicted measurement traces pj(t) = p(t, ωj, γj) and the variance of pj(t) as a function of t (right).

Given the continuum of measurement outcomes and measurement times, this is generally too expensive to calculate.

We therefore consider a simpler heuristic. We generate a number of guesses (ωj, γj), j = 1, ..., J, for the parameters based on the current likelihood distribution for the parameters. We then calculate the measurement signal p(t, ωj, γj) for a set of discrete times and select the next measurement time where the variance of the predicted measurement results is greatest. The idea behind this strategy is that a larger spread in the predicted results indicates greater uncertainty, and a measurement at such a time should result in a greater reduction of the uncertainty. We illustrate this strategy in Fig. 17. The variance of the predicted traces pj(t) = p(t, ωj, γj) exhibits oscillations at about twice the frequency of the signal, being largest around the minima and maxima of the oscillatory signal, but due to the damping of the signal there is an overall envelope and a global maximum around 3 in units of πω^{−1}. To avoid repeated sampling at the same time it is desirable to introduce a degree of randomness, e.g., by selecting the next measurement time based on the maximum of the variance of pj(ts) sampled over a discrete set of times ts, such as a non-uniform low-discrepancy sampling of the time interval [0, T]. Furthermore, in practice it may be rather inefficient to recalculate the variance of the traces after a single measurement. Instead, we acquire an initial set of N0 data points and then select the next N1 measurement times to coincide with peaks in the variance of the traces, where we allow N1 to vary depending on the number of peaks. In Fig. 17, for example, there are eight local peaks and we would choose the next eight measurement times to coincide with these maxima and iterate the process.
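The core of this heuristic, picking candidate times where the spread of the predicted traces is largest, can be sketched as follows. How the guesses (ωj, γj) are drawn from the current likelihood is left to the caller, and the sketch simply takes the candidate times of largest predicted variance rather than detecting local peaks explicitly; both are implementation choices of ours.

```python
import numpy as np

def next_measurement_times(param_guesses, t_candidates, n_next,
                           theta_I=np.pi/2, theta_M=np.pi/2):
    """Pick the n_next candidate times where the variance of the predicted
    traces p(t, omega_j, gamma_j) is largest (variance-of-traces heuristic).

    param_guesses : array of shape (J, 2) with rows (omega_j, gamma_j)
    t_candidates  : 1D array of candidate measurement times, e.g. a
                    low-discrepancy sampling of [0, T]
    """
    t = np.asarray(t_candidates, float)
    traces = np.array([np.exp(-g * t) * np.cos(w * t) * np.sin(theta_I) * np.sin(theta_M)
                       + np.cos(theta_I) * np.cos(theta_M)
                       for w, g in param_guesses])   # predicted signals, Eq. (4)
    spread = traces.var(axis=0)                      # variance across the guesses
    order = np.argsort(spread)[::-1]                 # largest variance first
    return np.sort(t[order[:n_next]])
```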

An even simpler way of iterative refinement is via low-discrepancy (ld) time sampling, a generalization of uniform sampling that lends itself to easy iterative refinement. The basic idea of ld sequences is to ensure that the largest gap between samples is asymptotically optimal, while avoiding the regularity of uniformly spaced samples that leads to aliasing effects (cf. the blue-noise criterion). In this case the initial measurement times are chosen to be the first N0 elements of a low-discrepancy quasi-random sequence such as the Hammersley sequence [22], and in each subsequent iteration the next Ni elements of the sequence are used. The number of initial measurements N0 and the number of subsequent measurements per iteration Ni are completely flexible, the elements of the sequence can be scaled to uniformly cover any desired time interval, and we can perform as many iterations as desired.
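As an illustration, the radical-inverse (van der Corput) construction that underlies one-dimensional Hammersley-type sequences can be generated in a few lines; the base-2 choice, the scaling to [0, T] and the iteration pattern below are our own and merely mimic the procedure described above.

```python
import numpy as np

def van_der_corput(n_points, start=0, base=2):
    """Elements start..start+n_points-1 of the base-b van der Corput sequence,
    a standard 1D low-discrepancy sequence (radical inverse of the index)."""
    seq = []
    for k in range(start, start + n_points):
        x, f = 0.0, 1.0 / base
        while k > 0:
            x += (k % base) * f
            k //= base
            f /= base
        seq.append(x)
    return np.array(seq)

# Initial N0 = 20 measurement times in [0, 30], then Ni = 8 more per iteration
T, N0, Ni = 30.0, 20, 8
times = T * van_der_corput(N0)
for it in range(1, 11):
    times = np.concatenate([times, T * van_der_corput(Ni, start=N0 + (it - 1) * Ni)])
```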


FIG. 18: Selection of measurement times for iterative low-discrepancy sampling. The new measurement times in each iteration are chosen so as to fill in the largest existing gaps.

Fig. 18 shows the measurement times as a function of the iteration, as determined by the Hammersley sequence with N0 = 20 and Ni = 8 for 10 iterations and a total sampling time T = 30, showing that uniform coverage of the sampling interval is maintained. For a fixed number of measurements Nt = 100 we verified that there was no significant difference in the errors and uncertainties of the parameter estimates between low-discrepancy and uniform sampling for the cases considered above. Furthermore, iterative refinement based on ld-sampling performed very well. Fig. 19, for model system 4 with measurements subject to 5% Gaussian noise, shows that simple iterative ld-sampling actually outperforms the adaptive refinement strategy based on the trace variance described above. While this may not be universally the case, and may be due to the variations in the trace variance being relatively small in our example, it shows that simple strategies such as iterative ld-sampling can be highly effective.

VI. GENERALIZATION TO OTHER MODELS

So far we have considered a particular model of a dephasing two-level system with dephasing acting in the Hamiltonian basis. However, if control fields are applied, as in a Rabi oscillation experiment for example, then the effective Hamiltonian and the dephasing basis may not coincide. For example, for two-level atoms in a cavity driven resonantly by a laser, the effective Hamiltonian with regard to a suitable rotating frame is H = Ωσx, where Ω is the Rabi frequency of the driving field. Assuming the driving field does not alter the dephasing processes, so that we still have V = √γ2 σz, the resulting measurement trace is given by [21]

p(t) = e^{−γt} sin θI sin θM + Φx3(t) cos θI cos θM,    (30)

FIG. 19: Median error of the ω and γ parameter estimates for iterative ld-sampling and adaptive sampling based on the trace variance for model system 4, with measurements subject to 5% Gaussian noise and projection noise with σ = Ne^{−1/2}, respectively.

where

Φx3(t) = e^{−(γ/2)t} [ cos(ωt) + (γ/(2ω)) sin(ωt) ],    (31)
ω = √(Ω² − γ²/4).    (32)

If Ω² < γ²/4 then ω is purely imaginary and the sine and cosine terms above turn into their respective hyperbolic sine and cosine equivalents. If Ω² = γ²/4, the expression ω^{−1} sin(ωt) must be analytically continued.

Due to the more complex nature of the signal, the Fourier estimation strategies are not directly applicable. However, we can very easily adapt Strategy 3: all that is required is a change of the basis functions, setting g1(t) = e^{−γt} and g2(t) = Φx3(t).
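In code, adapting the earlier Strategy-3 sketch to the driven model only requires swapping the rows of the basis matrix G. The helper below builds the basis implied by Eqs. (30)-(32), including the hyperbolic regime Ω² < γ²/4 via complex arithmetic; the function name and that implementation detail are ours.

```python
import numpy as np

def driven_basis(Omega, gamma, t):
    """Basis functions g1(t) = exp(-gamma*t), g2(t) = Phi_x3(t) for the
    resonantly driven, dephasing two-level system of Eqs. (30)-(32)."""
    t = np.asarray(t, float)
    # a complex omega automatically covers the hyperbolic case Omega^2 < gamma^2/4
    omega = np.sqrt(complex(Omega**2 - gamma**2 / 4))
    if omega == 0:                         # analytic continuation of sin(wt)/w -> t
        phi = np.exp(-gamma * t / 2) * (1 + gamma * t / 2)
    else:
        phi = np.exp(-gamma * t / 2) * (np.cos(omega * t)
                                        + gamma / (2 * omega) * np.sin(omega * t))
    g1 = np.exp(-gamma * t)
    g2 = np.real(phi)                      # imaginary parts cancel analytically
    return np.vstack([g1, g2])
```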

Fig. 20 shows the log-likelihood function for a very sparsely sampled signal with significant projection noise for a system of type (30), for a simulated experiment performed with θM = π/4 and θI = π/3. The signal is a damped oscillation, though not a simple damped sinusoid. Strategy 3 easily succeeds in identifying the model parameters and the log-likelihood function has a clearly defined peak. In fact, we show the log-likelihood here as the actual likelihood function is so sharply peaked that its internal structure, especially its squeezed nature, is not easy to see.

Finally, Fig. 21 (left) shows the error statistics for the ω and γ estimates obtained using Strategy 3 for 10 models of type (30) with the same values for Ω and γ as in Table I. We compare two experimental conditions: θI = θM = 0 (case 2a), which corresponds to maximum visibility of the oscillations, and θI = π/3, θM = π/4 (case 2b), for which the signal is more complex and the visibility of the oscillations is reduced, as shown in Fig. 20. The estimation errors are very similar to those for models of type 1. For γ they are effectively identical for both experimental conditions; for Ω they are slightly larger in case 2b, as might be expected since the visibility of the oscillations is reduced in this case.


FIG. 20: Ideal signal (blue) and sparsely sampled noisy data (green ∗; Nt = 100, t ∈ [0, 25], Ne = 100 single-shot experiments per data point) as a function of time (arb. units) for a system described by Eq. (30) with ω = 1, γ = 0.1 (left), and the corresponding log-likelihood over the (Ω, γ) plane (right).

FIG. 21: Minimum, maximum and median of the relative errors e(Ω) and e(γ) (averaged over 100 runs for each system and noise level) as a function of the noise level σ for Nt = 100 regularly sampled times in [0, 30] (left), and estimates of the angles θI and θM (right), for 10 model systems of type (30) with model parameters given in Table I under two experimental conditions: θI = θM = 0 (2a, maximum visibility) and θI = π/3, θM = π/4 (2b).


In both cases we also obtain excellent estimates of the noise level σ of the data as well as estimates for the parameters α1 and α2. As before, if the initial state prepared or the precise measurement performed is unknown a priori, as may well be the case for a system that is not yet well characterized, we can use these parameters to derive estimates for θI and θM:

\[
  \theta_I = \tfrac{1}{2}\left[\arccos(\alpha_2 - \alpha_1) + \arccos(\alpha_2 + \alpha_1)\right], \tag{33a}
\]
\[
  \theta_M = \tfrac{1}{2}\left[\arccos(\alpha_2 - \alpha_1) - \arccos(\alpha_2 + \alpha_1)\right]. \tag{33b}
\]
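A direct transcription of Eqs. (33a)–(33b) might look as follows; the clipping of the arccos arguments to [-1, 1] is an added numerical safeguard against estimated amplitudes falling marginally outside the arccos domain and is not part of the equations themselves.

```python
import numpy as np

def angles_from_amplitudes(alpha1, alpha2):
    """Eqs. (33a)-(33b): recover (theta_I, theta_M) from the estimated
    basis-function amplitudes alpha1, alpha2."""
    a = np.arccos(np.clip(alpha2 - alpha1, -1.0, 1.0))   # arccos(alpha2 - alpha1)
    b = np.arccos(np.clip(alpha2 + alpha1, -1.0, 1.0))   # arccos(alpha2 + alpha1)
    return 0.5 * (a + b), 0.5 * (a - b)                   # theta_I, theta_M
```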

Fig. 21 (right) shows the estimates derived for the angles θI and θM for both experimental conditions. The markers indicate the average of the estimates over all runs and all model systems; the error bars indicate the standard deviation of the estimates. The estimates are not as accurate as those for the system parameters, as one would expect since we have marginalized the amplitudes α1 and α2 and thus θI and θM. However, they are still quite close to the actual values (black dash-dot lines), with the exception of the θI estimate for case (2a), which is slightly more biased and less accurate; it should be 0, coinciding with the measurement angle θM.

VII. CONCLUSIONS

We have investigated the ubiquitous problem of identifying crucial parameters from experimental data for two-level systems subject to decoherence. Comparing different strategies based on the analysis of Fourier spectra as well as Bayesian modelling and maximum likelihood estimation, the latter approach was found to be vastly superior to commonly used Fourier-based strategies in terms of the accuracy and precision of the estimates obtained.



Strategies based on simple Fourier analysis are limited by the accuracy with which the positions, heights and widths of the Fourier peaks can be determined. As the spectral resolution is constrained by the signal length and sampling rate, the accuracy of Fourier-based estimation schemes suffers for short, decaying signals or sparse, noisy data. The Bayesian approach is not constrained in this way and yields uncertainties for the system parameters as well as information about the noise in the data.

An additional advantage of the Bayesian estimation is that it does not require a priori knowledge of the initialization or measurement angles θI and θM. Rather, the estimation procedure provides values for the coefficients of the basis functions, which are related to the parameters θI and θM.

The results are widely applicable to many experimental settings, from the analysis of free-induction decay signals for spin systems, e.g., in NMR, MRI and ESR, to Rabi spectroscopy of atomic ensembles, trapped ions, quantum dots and Josephson junction devices.

Acknowledgments

We acknowledge funding from the Ser Cymru National Research Network in Advanced Engineering and Materials. SGS also thanks the Royal Society for funding through a Leverhulme Senior Fellowship grant and the UK Engineering and Physical Sciences Research Council for recent funding. FCL acknowledges funding from the Cardiff University Research Leave Fellowship Scheme.

[1] D. W. McRobbie, E. A. Moore, M. J. Graves, and M. R. Prince, MRI from Picture to Proton (Cambridge University Press, 2007)

[2] J. B. Lambert and E. P. Mazzola, Nuclear Magnetic Resonance Spectroscopy: An Introduction to Principles, Applications and Experimental Methods (Pearson Education, 2004)

[3] T. Shinjo, Nanomagnetism and Spintronics (Elsevier, London, 2014)

[4] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, 2000)

[5] J. Nathan Kutz, Data-Driven Modeling and Scientific Computation (Oxford University Press, 2013)

[6] M. Balci, Basic 1H and 13C NMR Spectroscopy (Elsevier, 2005)

[7] S. G. Schirmer, A. Kolli, and D. K. L. Oi, Phys. Rev. A 69, 050306(R) (2004)

[8] J. H. Cole et al., Phys. Rev. A 71, 062312 (2005)

[9] J. H. Cole et al., Phys. Rev. A 73, 062333 (2006)

[10] S. J. Devitt, J. H. Cole, and L. C. L. Hollenberg, Phys. Rev. A 73, 052317 (2006)

[11] G. L. Bretthorst, Bayesian Spectrum Analysis and Parameter Estimation (Springer, Berlin, 1998)

[12] S. G. Schirmer and D. K. L. Oi, Phys. Rev. A 80, 022333 (2009)

[13] D. K. L. Oi and S. G. Schirmer, Laser Physics 20(5), 1203-1209 (2010)

[14] C. E. Granade et al., New J. Phys. 14, 103013 (2012)

[15] S. Blatt et al., Phys. Rev. A 80, 052703 (2009)

[16] W. Feller, An Introduction to Probability Theory and Its Applications (John Wiley & Sons, 1968)

[17] R. Blatt and C. F. Roos, Nature Physics 8, 277-284 (2012)

[18] A. M. Kaufman et al., Science 345, 306-309 (2014)

[19] V. S. Pribiag et al., Nature Nanotechnology 8, 170-174 (2013)

[20] Y. Nakamura, Yu. A. Pashkin, and J. S. Tsai, Phys. Rev. Lett. 87, 246601 (2001)

[21] E. Gong, W. Zhou, and S. G. Schirmer, Model discrimination for dephasing two-level systems, J. Phys. A, in press

[22] J. M. Hammersley and D. C. Handscomb, Monte Carlo Methods (Springer, 1964)