
A deconvolution-based objective function for wave-equation inversion

Simon Luo and Paul Sava, Center for Wave Phenomena, Colorado School of Mines

SUMMARY

We propose a new objective function for wave-equation inversion that seeks to minimize the norm of the weighted deconvolution between synthetic and observed data. Compared to the more conventional difference-based objective function, which minimizes the norm of the residual between synthetic and observed data, the deconvolution-based objective function is less susceptible to cycle skipping and local minima. Compared to a crosscorrelation-based objective function, the deconvolution-based objective function is less sensitive to a bandlimited or non-impulsive source function. With the crosscorrelation-based objective function, such a source may result in a nonzero gradient even when the constructed velocity model matches the true model.

INTRODUCTION

The construction of a subsurface velocity model is a central problem in exploration seismology. Recently there has been much effort devoted toward wave-equation inversion methods, which can provide high-resolution velocity models.

Wave-equation inversion methods can be divided into two categories: data domain methods (Lailly, 1984; Tarantola, 1984, 1986; Mora, 1987; Pratt et al., 1998; Shin and Ha, 2008), and image domain methods (Shen et al., 2003; Sava and Biondi, 2004). Data domain methods minimize the residual between synthetic and observed seismograms, while image domain methods optimize some measure of image quality (e.g., flatness of common-image gathers). See Symes (2008) for an overview and discussion of these methods.

Data domain methods can be further categorized as either traveltime-based inversion (Luo and Schuster, 1991; Zhang and Wang, 2009), which matches the phase information contained in observed seismograms; or full-waveform inversion (Lailly, 1984; Tarantola, 1984; Mora, 1987), which matches both the amplitude and phase information in observed data. Because of this strict data matching, full-waveform inversion has high resolution, but is susceptible to problems such as cycle skipping and local minima, especially when the data lack low frequencies. In addition, if we synthesize data using the acoustic wave equation, any inconsistencies in the amplitudes between observed and synthetic data due to non-acoustic effects, such as density variations and converted waves, negatively impact the inversion result. In comparison, traveltime-based inversion is less susceptible to cycle skipping, and in general the objective function tends to have fewer local minima compared to that of full-waveform inversion. Also, because it emphasizes the phase information contained in data, traveltime inversion is less affected by the inability to correctly model seismic amplitudes. However, traveltime inversion generally has lower resolution compared to full-waveform inversion.

Nonetheless, traveltime-based inversion may be useful when an initial velocity model is not available, or may be used to build an initial model. One example of a traveltime-based objective function is the norm of the weighted crosscorrelation of synthetic and observed data (van Leeuwen and Mulder, 2010). While this crosscorrelation-based objective function is less susceptible to cycle skipping, it may be sensitive to a bandlimited or non-impulsive source function. For a bandlimited or non-impulsive source, when the constructed velocity model matches the true model, the crosscorrelation is centered at zero time lag, but it is not confined to zero lag. Consequently, this may result in artifacts in the gradient, even with the correct velocity model.

To resolve this issue, we propose a new objective function for wave-equation inversion that minimizes the norm of the weighted deconvolution, rather than the crosscorrelation, of synthetic and observed data. Like the crosscorrelation-based objective function, the deconvolution-based objective function is less susceptible to cycle skipping compared to the conventional difference-based objective function used in full-waveform inversion. In addition, it can avoid potential problems with the crosscorrelation-based objective function when using a non-impulsive source function. The method is fully automatic as it does not require traveltime picking, and it is simple to implement as a variation of an existing full-waveform inversion implementation.

THEORY

To compute the gradient of a functional J(m) = H(u(m)) with respect to the model parameter m, we use the adjoint-state method (Tarantola, 2005; Plessix, 2006). Here, H(u(m)) is the objective function we seek to minimize.

Wave propagation in an arbitrary medium characterized by the slowness s(x) is governed by the Helmholtz equation:

$$\left[-\omega^2 s^2(x) - \nabla^2\right] u(e,x,\omega,m) = f_s(e,x,\omega), \tag{1}$$

where e is the shot number, x is the position in space, ω is the wavefield frequency, f_s(e,x,ω) is the source function, and u(e,x,ω,m) is the wavefield. The dependency of the functional J(m) on the model parameter m is through the state variable u(e,x,ω,m). We write the functional F linking the model parameter space and the state variable space as

$$F(x,\omega,u,m,f_s) = L(x,\omega,m)\,u(e,x,\omega,m) - f_s(e,x,\omega), \tag{2}$$

where

$$L(x,\omega,m) = \left[-\omega^2 s^2(x) - \nabla^2\right] \tag{3}$$

is the Helmholtz operator, and

$$m(x) = s^2(x). \tag{4}$$
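As an illustration of equations 1 through 4, the following minimal sketch discretizes the Helmholtz operator in one dimension with second-order finite differences and solves for the wavefield at a single frequency. The grid, the Dirichlet boundaries, the point source, and the constant damping term `gamma` are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np

def helmholtz_1d(m, omega, dx, gamma=0.0):
    """Dense 1D discretization of L = -omega^2 m - d^2/dx^2 (Dirichlet ends).

    gamma adds a constant imaginary damping term i*gamma. It is an
    illustrative assumption that keeps the matrix well conditioned and
    is not part of equation 3.
    """
    n = m.size
    lap = (np.diag(-2.0 * np.ones(n))
           + np.diag(np.ones(n - 1), 1)
           + np.diag(np.ones(n - 1), -1)) / dx**2
    return -omega**2 * np.diag(m) - lap + 1j * gamma * np.eye(n)

n, dx = 101, 0.02                  # hypothetical grid size and spacing (km)
s = np.full(n, 1.0 / 2.5)          # slowness of a constant 2.5 km/s model
m = s**2                           # model parameter m = s^2 (equation 4)
omega = 2.0 * np.pi * 5.0          # angular frequency for 5 Hz
f = np.zeros(n, dtype=complex)
f[n // 2] = 1.0 / dx               # point source at the grid center

L = helmholtz_1d(m, omega, dx, gamma=20.0)
u = np.linalg.solve(L, f)          # state variable: solve L u = f_s
```

A dense solve is used here only for brevity; a sparse or banded solver would be the natural choice on realistic grids.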


To solve for the state variable u, we impose the condition

$$F(x,\omega,u,m,f_s) = 0, \tag{5}$$

to obtain

$$L(x,\omega,m)\,u(e,x,\omega,m) = f_s(e,x,\omega). \tag{6}$$

Thus, the state variable u represents the wavefield simulated using the forward modeling operator L, with the source function f_s. To obtain the adjoint-state variable a(e,x,ω,m), we solve the adjoint-state equation:

$$\left[\frac{\partial F(x,\omega,u,m,f_s)}{\partial u}\right]^{*} a(e,x,\omega,m) = \frac{\partial H(u)}{\partial u}, \tag{7}$$

where ∗ denotes the adjoint. For a derivation of this equation, see Plessix (2006). From equation 2, we have

$$\frac{\partial F(x,\omega,u,m,f_s)}{\partial u} = L(x,\omega,m), \tag{8}$$

and from the adjoint-state equation,

$$L^{*}(x,\omega,m)\,a(e,x,\omega,m) = \frac{\partial H(u)}{\partial u}. \tag{9}$$

Thus, the adjoint-state variable a represents the wavefield simulated using the adjoint modeling operator L∗, with the adjoint source ∂H/∂u given by the derivative of the objective function H with respect to the state variable u.

The gradient of the functional J with respect to the model parameter m is given by

$$\frac{\partial J(m)}{\partial m} = \sum_{e} \sum_{\omega} \omega^2\, u(e,x,\omega,m)\, a(e,x,\omega,m), \tag{10}$$

which is simply the scaled zero-lag correlation of the wavefield simulated using the forward modeling operator L and source function f_s, with the wavefield simulated using the adjoint modeling operator L∗ and adjoint source function ∂H/∂u.

Because the state variable u is independent of the choice of objective function H, the gradient of the functional J ultimately depends on the adjoint-state variable a, which is computed using the adjoint source.

Difference-based objective function

We first consider the conventional difference-based objective function, which is defined as the squared l2-norm of the difference between synthetic and observed data:

$$H_{DIF}(u_s) = \frac{1}{2} \sum_{e,x,\omega} \left\| K(e,x)\left(u_s(e,x,\omega,m) - u_o(e,x,\omega)\right) \right\|_2^2, \tag{11}$$

where u_o is the observed (i.e., recorded) wavefield, u_s is the synthetic wavefield, and K(e,x) is a masking operator that limits the wavefields to the receiver locations. Applying the mask to a wavefield gives the corresponding data. From the adjoint-state equation, the adjoint source is obtained by taking the partial derivative of the objective function H_DIF with respect to the state variable u_s. For the difference-based objective function,

$$\frac{\partial H_{DIF}(u_s)}{\partial u_s} = K(e,x)K(e,x)\,\Re\left[u_s(e,x,\omega,m) - u_o(e,x,\omega)\right]. \tag{12}$$

The adjoint source for the difference-based objective function is simply the residual between the synthetic and observed data.
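The adjoint-state recipe (forward solve, adjoint solve with the residual as source, zero-lag correlation scaled by ω²) can be sketched for one shot and one frequency as follows. This is a minimal 1D illustration under stated assumptions: the finite-difference operator, the damping `gamma`, the Gaussian anomaly, and the source and receiver positions are all hypothetical. The sketch keeps the complex residual as the adjoint source and takes the real part when forming the gradient, which is one common convention and not necessarily the one used in the paper. A finite-difference check is included to confirm that the adjoint-state gradient is consistent with the objective.

```python
import numpy as np

def helmholtz_1d(m, omega, dx, gamma):
    # L = -omega^2 m - d^2/dx^2 plus an assumed constant damping i*gamma
    n = m.size
    lap = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
           + np.diag(np.ones(n - 1), -1)) / dx**2
    return -omega**2 * np.diag(m) - lap + 1j * gamma * np.eye(n)

def misfit_and_gradient(m, omega, dx, gamma, f, K, u_obs):
    L = helmholtz_1d(m, omega, dx, gamma)
    u = np.linalg.solve(L, f)                  # forward wavefield: L u = f_s
    residual = K * (u - u_obs)                 # masked data residual
    J = 0.5 * np.sum(np.abs(residual) ** 2)    # difference-based objective
    adj_src = K * residual                     # adjoint source K K (u_s - u_o)
    a = np.linalg.solve(L.conj().T, adj_src)   # adjoint wavefield: L* a = dH/du
    grad = omega**2 * np.real(np.conj(a) * u)  # scaled zero-lag correlation
    return J, grad

n, dx, gamma = 50, 0.02, 20.0                  # hypothetical grid and damping
omega = 2.0 * np.pi * 5.0
x = np.arange(n) * dx
v_true = 2.5 + 0.3 * np.exp(-((x - 0.5) / 0.1) ** 2)  # assumed Gaussian anomaly
m_true = 1.0 / v_true**2
m_trial = np.full(n, 1.0 / 2.5**2)
f = np.zeros(n, dtype=complex)
f[2] = 1.0 / dx                                # source near one end
K = np.zeros(n)
K[n - 3] = 1.0                                 # single receiver near the other end
u_obs = np.linalg.solve(helmholtz_1d(m_true, omega, dx, gamma), f)

J, grad = misfit_and_gradient(m_trial, omega, dx, gamma, f, K, u_obs)

# Central finite-difference check of the adjoint-state gradient
h = 1e-4
grad_fd = np.zeros(n)
for i in range(n):
    dm = np.zeros(n)
    dm[i] = h
    Jp, _ = misfit_and_gradient(m_trial + dm, omega, dx, gamma, f, K, u_obs)
    Jm, _ = misfit_and_gradient(m_trial - dm, omega, dx, gamma, f, K, u_obs)
    grad_fd[i] = (Jp - Jm) / (2.0 * h)
rel_err = np.linalg.norm(grad - grad_fd) / np.linalg.norm(grad_fd)
```

The adjoint-state gradient costs two solves regardless of the number of model parameters, whereas the finite-difference check needs two solves per parameter.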

Crosscorrelation-based objective function

Next we consider the crosscorrelation-based objective function proposed by van Leeuwen and Mulder (2010). The objective function is defined as the squared l2-norm of the weighted crosscorrelation of synthetic and observed data:

$$H_{COR}(u_s) = \frac{1}{2} \sum_{e,x,\tau} \left\| K(e,x)\,P(\tau)\,c(e,x,\tau) \right\|_2^2, \tag{13}$$

where c(e,x,τ) is the crosscorrelation:

$$c(e,x,\tau) = \sum_{\omega} u_s(e,x,\omega,m)\,\overline{u_o(e,x,\omega)}\,e^{2i\omega\tau}, \tag{14}$$

the overbar denotes complex conjugation,

and P(τ) is a penalty function. Generally, P(τ) should be chosen to penalize energy at nonzero time lags τ in the crosscorrelation. One simple penalty function, as suggested by van Leeuwen and Mulder (2010), is the time lag within a window:

$$P(\tau) = \begin{cases} \tau & \text{if } |\tau| \le \tau_0, \\ 0 & \text{otherwise.} \end{cases} \tag{15}$$

The parameter τ_0 is the maximum allowable time lag, whose purpose is to prevent energy in the crosscorrelation at unreasonably large time lags from influencing the gradient. Its value should be chosen based on the expected maximum traveltime error.
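The windowed penalty of equation 15 is straightforward to express in code. The sketch below is a minimal illustration; the function name `penalty` and the sample lags are hypothetical.

```python
import numpy as np

def penalty(tau, tau0):
    """P(tau) = tau inside the window |tau| <= tau0, and 0 outside."""
    tau = np.asarray(tau, dtype=float)
    return np.where(np.abs(tau) <= tau0, tau, 0.0)

tau = np.array([-0.4, -0.2, 0.0, 0.2, 0.4])   # hypothetical time lags (s)
P = penalty(tau, tau0=0.3)                     # lags beyond 0.3 s are muted
```

Note that P vanishes at zero lag, so energy exactly at τ = 0 does not contribute to the objective function.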

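The adjoint source is given by the partial derivative of the objective function H_COR with respect to the state variable u_s. For the crosscorrelation-based objective function,

$$\begin{aligned} \frac{\partial H_{COR}(u_s)}{\partial u_s} = {} & K(e,x)K(e,x) \times \\ & \sum_{\tau} P(\tau)P(\tau)\,\Re\left[u_o(e,x,\omega)\,c(e,x,\tau)\,e^{-2i\omega\tau}\right]. \end{aligned} \tag{16, 17}$$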

This adjoint source can be interpreted as the observed data shifted by the traveltime difference between the synthetic and observed data, which is given by the crosscorrelation c(e,x,τ). If the observed and synthetic data match, the energy in the crosscorrelation is maximized at zero time lag, and the energy at zero time lag is subsequently annihilated by the penalty function.
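For a single trace, the crosscorrelation and its adjoint source can be evaluated directly from their definitions. The sketch below assumes the conjugate in the crosscorrelation is placed on the observed data, uses random complex spectra and a hypothetical set of frequencies and lags, and sets the mask K to one. It then compares the adjoint source against a central finite-difference derivative of the objective with respect to the real part of u_s; because the objective is quadratic in u_s, the two should agree to rounding error.

```python
import numpy as np

rng = np.random.default_rng(0)
nw, ntau = 16, 21
omega = 2.0 * np.pi * np.linspace(1.0, 10.0, nw)  # hypothetical frequencies (rad/s)
tau = np.linspace(-0.1, 0.1, ntau)                # time lags, all inside the window
P = tau.copy()                                    # penalty P(tau) = tau for |tau| <= tau0
E = np.exp(2j * np.outer(tau, omega))             # E[j, k] = exp(2 i omega_k tau_j)

def xcorr(us, uo):
    # c(tau) = sum over omega of u_s * conj(u_o) * exp(2 i omega tau)
    return E @ (us * np.conj(uo))

def objective(us, uo):
    # H = 1/2 * sum over tau of |P(tau) c(tau)|^2 for one trace, K = 1
    return 0.5 * np.sum(np.abs(P * xcorr(us, uo)) ** 2)

def adjoint_source(us, uo):
    # Re[ sum over tau of P^2 * u_o * c(tau) * exp(-2 i omega tau) ]
    c = xcorr(us, uo)
    return np.real(uo * (np.conj(E).T @ (P**2 * c)))

us = rng.standard_normal(nw) + 1j * rng.standard_normal(nw)
uo = rng.standard_normal(nw) + 1j * rng.standard_normal(nw)

g_adj = adjoint_source(us, uo)

# Finite-difference derivative with respect to the real part of u_s
h = 1e-4
g_fd = np.array([
    (objective(us + h * np.eye(nw)[k], uo)
     - objective(us - h * np.eye(nw)[k], uo)) / (2.0 * h)
    for k in range(nw)
])
```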

The goal is for the adjoint source, and as a result the gradient, to be zero when the constructed velocity model matches the true model. However, with the penalty function given in equation 15, this is true only for an impulsive, i.e., infinite bandwidth, source function. For a bandlimited or non-impulsive source, the crosscorrelation peak for the correct velocity model is centered at zero time lag, but is not confined to zero lag. Consequently, the adjoint source and the gradient are nonzero even when the model is correct.
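This effect is easy to reproduce numerically. The following sketch crosscorrelates a trace with itself, which mimics a correct velocity model, and evaluates the penalized energy for a spike and for a tapered 5 Hz sine. The Hann taper, the sampling, and the window τ_0 = 0.2 s are assumptions for illustration, and the lag axis uses the ordinary time-domain convention of `np.correlate` rather than the frequency-domain form of equation 14.

```python
import numpy as np

dt = 0.002                                     # sample interval (s), hypothetical
t = np.arange(0.0, 1.0, dt)
wavelet = np.sin(2.0 * np.pi * 5.0 * t) * np.hanning(t.size)  # tapered 5 Hz sine
spike = np.zeros(t.size)
spike[t.size // 2] = 1.0                       # impulsive source

def penalized_xcorr_energy(us, uo, dt, tau0):
    c = np.correlate(us, uo, mode="full")      # crosscorrelation over all lags
    lags = (np.arange(c.size) - (uo.size - 1)) * dt   # zero lag at index len(uo) - 1
    P = np.where(np.abs(lags) <= tau0, lags, 0.0)     # windowed penalty
    return 0.5 * np.sum((P * c) ** 2)

# Identical synthetic and observed traces, as for a correct model
H_spike = penalized_xcorr_energy(spike, spike, dt, tau0=0.2)
H_wavelet = penalized_xcorr_energy(wavelet, wavelet, dt, tau0=0.2)
```

For the spike, all crosscorrelation energy sits at zero lag, where the penalty vanishes. For the tapered sine, the oscillating autocorrelation leaves energy at nonzero lags, so the objective stays positive even though the traces match.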

To resolve this issue, one possibility is to choose a different penalty function that is zero over a window centered about zero time lag. This approach, however, effectively reduces the resolution of the inversion, as any traveltime differences between the observed and synthetic data that fall within the zero-valued window cannot be resolved.

Figure 1: The true velocity model (a) and the source function (b). The trial velocity model is a constant 2.5 km/s.

Deconvolution-based objective function

An alternative approach is to use a deconvolution-based objective function. We define the deconvolution-based objective function as the squared l2-norm of the weighted deconvolution of synthetic and observed data:

$$H_{DEC}(u_s) = \frac{1}{2} \sum_{e,x,\tau} \left\| K(e,x)\,P(\tau)\,d(e,x,\tau) \right\|_2^2, \tag{18}$$

where d(e,x,τ) is the deconvolution:

$$d(e,x,\tau) = \sum_{\omega} \frac{\overline{u_o(e,x,\omega)}\,u_s(e,x,\omega,m)\,e^{2i\omega\tau}}{u_o(e,x,\omega)\,\overline{u_o(e,x,\omega)} + \varepsilon^2}, \tag{19}$$

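where ε is a constant for stabilization and the overbar denotes complex conjugation. The adjoint source is given by the partial derivative of the objective function H_DEC with respect to u_s:

$$\begin{aligned} \frac{\partial H_{DEC}(u_s)}{\partial u_s} = {} & K(e,x)K(e,x) \times \\ & \sum_{\tau} P(\tau)P(\tau)\,\Re\left[\frac{u_o(e,x,\omega)\,d(e,x,\tau)\,e^{-2i\omega\tau}}{u_o(e,x,\omega)\,\overline{u_o(e,x,\omega)} + \varepsilon^2}\right]. \end{aligned} \tag{20, 21}$$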

The interpretation of this adjoint source is similar to that of the crosscorrelation-based objective function, with the important distinction being that the traveltime shift between the synthetic and observed data is now represented by the deconvolution d(e,x,τ) instead of the crosscorrelation c(e,x,τ).
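The stabilized deconvolution and its adjoint source can be sketched for a single trace in the same way as for the crosscorrelation. The example assumes the conjugate is placed on the observed data in the numerator, sets the mask K to one, and uses random complex spectra, a hypothetical set of frequencies and lags, and an arbitrary stabilization constant `eps`. A central finite-difference derivative with respect to the real part of u_s serves as a consistency check.

```python
import numpy as np

rng = np.random.default_rng(1)
nw, ntau = 16, 21
omega = 2.0 * np.pi * np.linspace(1.0, 10.0, nw)  # hypothetical frequencies (rad/s)
tau = np.linspace(-0.1, 0.1, ntau)                # time lags, all inside the window
P = tau.copy()                                    # penalty P(tau) = tau for |tau| <= tau0
E = np.exp(2j * np.outer(tau, omega))             # E[j, k] = exp(2 i omega_k tau_j)
eps = 0.5                                         # stabilization constant (assumed)

def decon(us, uo):
    # d(tau) = sum over omega of conj(u_o) u_s exp(2 i omega tau) / (|u_o|^2 + eps^2)
    return E @ (np.conj(uo) * us / (np.abs(uo) ** 2 + eps**2))

def objective(us, uo):
    # H = 1/2 * sum over tau of |P(tau) d(tau)|^2 for one trace, K = 1
    return 0.5 * np.sum(np.abs(P * decon(us, uo)) ** 2)

def adjoint_source(us, uo):
    # Re[ sum over tau of P^2 u_o d(tau) exp(-2 i omega tau) / (|u_o|^2 + eps^2) ]
    d = decon(us, uo)
    weight = uo / (np.abs(uo) ** 2 + eps**2)
    return np.real(weight * (np.conj(E).T @ (P**2 * d)))

us = rng.standard_normal(nw) + 1j * rng.standard_normal(nw)
uo = rng.standard_normal(nw) + 1j * rng.standard_normal(nw)

g_adj = adjoint_source(us, uo)

# Finite-difference derivative with respect to the real part of u_s
h = 1e-4
g_fd = np.array([
    (objective(us + h * np.eye(nw)[k], uo)
     - objective(us - h * np.eye(nw)[k], uo)) / (2.0 * h)
    for k in range(nw)
])
```

Relative to the crosscorrelation case, the only change is the spectral weight 1 / (|u_o|² + ε²), which is why the method can be implemented as a small variation of an existing inversion code.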

Given synthetic and observed data that differ by only a traveltime shift, in the limit of infinite bandwidth, the deconvolution of these data is a shifted delta function. If the constructed velocity model matches the true model, the energy in the deconvolution is both centered at and confined to zero time lag, and is completely annihilated by the penalty function P(τ) given in equation 15. This is not the case for the crosscorrelation-based objective function. Thus, we expect the deconvolution-based objective function to provide a more reasonable gradient and faster convergence compared to the crosscorrelation-based objective function when using a bandlimited or non-impulsive source function.

Figure 2: Observed (solid line) and synthetic (dotted line) data (a) for the velocity model shown in Figure 1, and the adjoint source for the difference-based (b), crosscorrelation-based (c), and deconvolution-based (d) objective functions.
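To illustrate how deconvolution concentrates energy near zero lag, the sketch below compares the crosscorrelation and the stabilized deconvolution of two identical traces, as would be obtained with a correct model. The two-cycle tapered 5 Hz burst, the zero padding, the value of `eps` (a fraction of the peak spectral amplitude), and the peak normalization are all assumptions chosen for this example. Lags follow the ordinary circular FFT convention rather than the form of equations 14 and 19.

```python
import numpy as np

dt, n = 0.002, 1000                            # 2 s trace, hypothetical sampling
nb = 200                                       # burst length: 0.4 s, two cycles of 5 Hz
tb = np.arange(nb) * dt
burst = np.sin(2.0 * np.pi * 5.0 * tb) * np.hanning(nb)
trace = np.zeros(n)
trace[100:100 + nb] = burst
us = uo = trace                                # identical data: correct model

Us, Uo = np.fft.fft(us), np.fft.fft(uo)
eps = 1e-3 * np.max(np.abs(Uo))                # stabilization constant (assumed)

c = np.fft.ifft(Us * np.conj(Uo)).real                               # crosscorrelation
d = np.fft.ifft(np.conj(Uo) * Us / (np.abs(Uo) ** 2 + eps**2)).real  # deconvolution

lags = np.fft.fftfreq(n) * n * dt              # lag axis in FFT ordering (s)
P = np.where(np.abs(lags) <= 0.2, lags, 0.0)   # windowed penalty, tau0 = 0.2 s

def penalized_energy(x):
    # Normalize by the zero-lag value so both measures are comparable
    return 0.5 * np.sum((P * x / x[0]) ** 2)

H_cor = penalized_energy(c)
H_dec = penalized_energy(d)
```

Because the source is bandlimited, the deconvolution is not an exact delta function; it is only whitened over the band where the spectrum exceeds `eps`. Even so, its penalized energy is smaller than that of the crosscorrelation in this setup.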

EXAMPLES

To compare the difference-based, crosscorrelation-based, and deconvolution-based objective functions, we compare their adjoint sources and gradients for a synthetic example. The true model shown in Figure 1a consists of a background velocity of 2.5 km/s with a Gaussian anomaly in the center, while the trial model consists of only the background velocity of 2.5 km/s. We use the tapered 5 Hz sine wave shown in Figure 1b as our source function to demonstrate the effect of a non-impulsive source. Note that for this source function, the magnitude of the velocity anomaly is large enough to produce cycle skipping for the difference-based objective function.

Figure 2a shows the observed data (solid line) modeled with the true velocity model and the synthetic data (dotted line) modeled with the trial velocity model for a single shot located at distance 3 km and depth 0.06 km, and a single receiver at distance 3 km and depth 1.94 km. Figures 2b, 2c, and 2d show the adjoint sources for the difference-, crosscorrelation-, and deconvolution-based objective functions, respectively. Note the oscillations in the adjoint sources for the difference- and crosscorrelation-based objective functions compared to that of the deconvolution-based objective function, which is more impulsive.

Figure 3: The sensitivity kernels for the difference-based (a), crosscorrelation-based (b), and deconvolution-based (c) objective functions contribute to the gradients of the difference-based (d), crosscorrelation-based (e), and deconvolution-based (f) objective functions.

The sensitivity kernels and gradients for these objective functions are shown in Figure 3. Figures 3a, 3b, and 3c show sensitivity kernels computed for the difference-, crosscorrelation-, and deconvolution-based objective functions, respectively, using their corresponding adjoint sources shown in Figure 2. Figures 3d, 3e, and 3f show the gradients computed for the difference-, crosscorrelation-, and deconvolution-based objective functions, respectively, for 96 shots at depth 0.06 km and full coverage of receivers at depth 1.94 km.

Note that the oscillations in the adjoint sources for the difference- and crosscorrelation-based objective functions shown in Figure 2 are reflected in their sensitivity kernels and gradients. Also, notice that because the velocity anomaly is large enough to produce cycle skipping, the gradient of the difference-based objective function (Figure 3d) does not correctly recover the sign of the anomaly. Comparing the gradients to the correct velocity update (i.e., the difference between the true velocity model and the trial model), we observe that the gradient of the deconvolution-based objective function is closest to the correct update.

CONCLUSION

Crosscorrelation- and deconvolution-based objective functions for wave-equation inversion are less susceptible to the cycle skipping and local minima problems that are inherent to strict data-matching inversions such as full-waveform inversion using the difference-based objective function. Thus, they may be useful for building an initial velocity model, which then may be close enough to the true model for full-waveform inversion to converge to the global minimum of the difference-based objective function. However, the crosscorrelation-based objective function may be sensitive to a bandlimited or non-impulsive source function, because the crosscorrelation of synthetic and observed data produced by a non-impulsive source is not confined to zero lag even when the constructed velocity model matches the true model. In comparison, the deconvolution of these data is more impulsive, and is more confined to zero lag given the correct velocity model. For this reason, wave-equation inversion using the proposed deconvolution-based objective function may provide more reasonable gradient estimates and faster convergence compared to the crosscorrelation-based objective function.

ACKNOWLEDGEMENTS

This work was supported by the sponsors of the Center for Wave Phenomena at the Colorado School of Mines.


REFERENCES

Lailly, P., 1984, The seismic inverse problem as a sequence of before stack migration: Conference on Inverse Scattering, SIAM, 206–220.

Luo, Y., and G. T. Schuster, 1991, Wave-equation traveltime inversion: Geophysics, 56, 645–653.

Mora, P. R., 1987, Nonlinear two-dimensional elastic inversion of multioffset seismic data: Geophysics, 52, 1211–1228.

Plessix, R.-E., 2006, A review of the adjoint-state method for computing the gradient of a functional with geophysical applications: Geophysical Journal International, 167, 495–503.

Pratt, G., C. Shin, and G. Hicks, 1998, Gauss-Newton and full Newton methods in frequency-space seismic waveform inversion: Geophysical Journal International, 133, 341–362.

Sava, P., and B. Biondi, 2004, Wave-equation migration velocity analysis. I: Theory: Geophysical Prospecting, 52, 593–606.

Shen, P., W. Symes, and C. C. Stolk, 2003, Differential semblance velocity analysis by wave-equation migration: 73rd Annual International Meeting, SEG Expanded Abstracts, 2132–2135.

Shin, C., and W. Ha, 2008, A comparison between the behavior of objective functions for waveform inversion in the frequency and Laplace domains: Geophysics, 73, VE119–VE133.

Symes, W. W., 2008, Migration velocity analysis and waveform inversion: Geophysical Prospecting, 56, 765–790.

Tarantola, A., 1984, Inversion of seismic reflection data in the acoustic approximation: Geophysics, 49, 1259–1266.

Tarantola, A., 1986, A strategy for nonlinear elastic inversion of seismic reflection data: Geophysics, 51, 1893–1903.

Tarantola, A., 2005, Inverse Problem Theory: Society for Industrial and Applied Mathematics.

van Leeuwen, T., and W. A. Mulder, 2010, A correlation-based misfit criterion for wave-equation traveltime tomography: Geophysical Journal International, 182, 1383–1394.

Zhang, Y., and D. Wang, 2009, Traveltime information-based wave-equation inversion: Geophysics, 74, WCC27–WCC36.
