

Computers & Fluids 39 (2010) 1095–1111


Large-scale DES computations of the forward speed diffraction and pitch and heave problems for a surface combatant

P.M. Carrica a,*, J. Huang a, R. Noack b, D. Kaushik c, B. Smith c, F. Stern a

a IIHR-Hydroscience and Engineering, C. Maxwell Stanley Hydraulics Laboratory, The University of Iowa, Iowa City, IA 52242-1585, USA
b Applied Research Laboratory, The Pennsylvania State University, State College, PA 16804, USA
c Argonne National Laboratory, 9700 S. Cass Ave, Argonne, IL 60439, USA


Article history:
Received 22 September 2008
Received in revised form 15 December 2009
Accepted 1 February 2010
Available online 11 February 2010

Keywords:
Surface combatant
Free surface flow
DES
Large-scale computations


* Corresponding author. Tel.: +1 319 335 6381; fax: +1 319 335 5238. E-mail address: [email protected] (P.M. Carrica).

Abstract

This paper aims at presenting the most resolved solutions to date for the ship forward speed diffraction and pitch and heave problems, and at discussing the method that enables these computations. Large-scale DES computations (60–115 million grid points, 276–500 processors) of ship hydrodynamics problems are presented for the DTMB model 5512 surface combatant. The forward speed diffraction problem is studied at Fr = 0.28 with waves of amplitude a = 0.006 and wavelength λ = 1.5, with the ship static, allowing the overset assembly to be a pre-processing step. In the pitch and heave problem the ship faces head waves at Fr = 0.41 with waves of amplitude a = 0.006 and wavelength λ = 1.5, with the ship allowed to pitch and heave, thus requiring dynamic overset grid processing. The code CFDShip-Iowa version 4 and the overset assembly code Suggar were modified to carry out these large-scale simulations of free surface ship hydrodynamics. The modifications focused on reducing the memory requirement and optimizing the per-processor and parallel performance at the implementation and algorithmic levels, plus the addition of a lagged mode for the overset domain connectivity computation. The simulation results show very significant improvements in the local flow and free surface results, but only minor changes in forces and moments, when compared with previous URANS computations performed on grids with about three million points.


1. Introduction

The advent of faster parallel computers offers both an opportunity and a challenge for the ship hydrodynamics CFD community. On one hand, larger grids allow the modeling of more complex phenomena and the resolution of smaller scales. On the other hand, the recent trend toward massively parallel computers, as opposed to faster single processors, imposes the significant burden of finding extra concurrency in CFD codes to handle these large cases. To take advantage of these new computers, code architectures need to be rethought or, as in the case of CFDShip-Iowa version 4, significantly revised.

An attractive approach to handle viscous large-scale problems (with tens or hundreds of millions of grid points) relies on Cartesian grid solvers, in which simpler forms of the discretized equations lead to smaller, better conditioned linear systems and require less frequent memory access. However, Cartesian grids cannot handle curved geometries directly: either an immersed-boundary approach is used to impose the ship geometry, or non-Cartesian grids are used for the ship and its appendages and some means of communication with the Cartesian grids is needed.

Though advances are coming fast, high-Reynolds number free surface flows with ship motions pose significant complications for Cartesian solvers like CFDShip-Iowa version 6, and the technology is under development [1].

Structured curvilinear grids have the advantage of conforming to the geometry surface and retaining a reasonably high order of accuracy. The resulting discretized equations are much more complex than in Cartesian grids, presenting significant challenges to efficient parallelization. The need to resolve ship motions, rotating propellers, and moving control surfaces with free surface waves adds to the complexity. It is worth noting that not only structured curvilinear solvers, but solvers of all types will be stressed when these features are included.

Perhaps due to the abovementioned complexities, CFD for ship hydrodynamics has been largely limited to grids smaller than 20 million points, with typical grids of a few million points. Other CFD areas have seen significantly larger computations recently. Compressible flow computations with up to 80 million grid points have been performed by Gomez et al. [2] for the flow around the space shuttle. Lim and Strawn [3] computed the vortex-induced vibrations on a helicopter rotor with 113 million grid points, using a loosely coupled CFD/CSD structured overset approach.


Fig. 1. Earth- and ship-fixed reference systems.


The flow around nuclear fuel pins has been computed using a spectral element method with 140 million points [4]. One of the largest computations yet is a static DNS calculation of combustion with over 2.5 billion points on Cartesian grids using more than 10,000 CPUs [5], which shows the scalability that can be expected from a Cartesian code. All these highly resolved computations report significant improvements in the quality of the simulation outcome with respect to cases employing coarser grids.

The objective of this paper is to present the most resolved solutions to date for the ship forward speed diffraction and pitch and heave problems, and to discuss the method that enables these computations. The forward speed diffraction problem has recently been the subject of attention, and it is an interesting case to evaluate the scalability of the code since it is a transient problem but the grids are static. Deng et al. [6], Cura Hochbaum and Pierzynski [7], Luquet et al. [8] and Carrica et al. [9,10] used different RANS approaches to study the surface combatant model DTMB 5512 at Fr = 0.28 with incoming head waves of amplitude a = 0.006 and wavelength λ = 1.5. This case was selected in the ship hydrodynamics CFD workshop held in Tokyo in 2005 [11] as a benchmark for transient ship hydrodynamics due to the relevance of the geometry and the quality of the available experimental data. All the mentioned runs were limited to a maximum of 3.3 million grid points. Herein a DES approach is used to analyze the problem with 61.3 and 114.1 million grid points, providing a high-resolution discretization near the free surface, to detect unsteady wave breaking and complex turbulence–surface interaction, and under the hull, where complex turbulent structures are captured.

Numerical simulations of seakeeping in head waves based on surface tracking [12] and surface capturing are available in the literature. The latter approach is more popular since it can handle arbitrary free surface topologies. Solutions for containerships [13–15], the Wigley hull [16] and the DTMB 5512 surface combatant model [17] have been presented. Of particular interest are methods based on dynamic overset grids, attractive because they can handle arbitrary motions while maintaining good grid quality near the free surface. Sato et al. [16], Orihara and Miyata [15] and Carrica et al. [17] use overset grids in their computations. In this paper the pitch and heave case at Fr = 0.41, a = 0.006 and λ = 1.5 computed in [17] with three million grid points is calculated using a 70.6 million point grid. Extensive unsteady bow and transom wave breaking is observed in this case with the high-resolution grid, as well as a clearer picture of the vortex generation and transport.

In both test cases the quality of the results for local quantities (velocity profiles, free surface resolution) increases considerably when compared against coarser grids and experimental data, but integrated variables like forces, moments and motions change only slightly. The ability to run high-resolution grids opens the opportunity to resolve smaller scales of the free surface and turbulence when using DES methods, and to approach the asymptotic range in verification studies.

As stated earlier, the high-resolution cases require careful implementation of the solution process on current large-scale parallel architectures. The code needs to perform well in the hostile environment of deep memory hierarchies and large processor counts. In addition, the trend towards less and less memory per computational entity (core or processor) adds the burden of minimizing the memory overhead resulting from parallelism. Efforts to improve CFDShip-Iowa performance in high-performance computing (HPC) environments involved optimization of serial routines, parallel execution optimizations, parallel I/O, and Suggar optimizations. Memory usage was also improved to avoid using shared memory and to allow scalability on distributed memory machines.

2. Mathematical modeling

CFDShip-Iowa [10,17] solves the incompressible Navier–Stokes equations in an inertial coordinate system, either fixed to a ship moving at constant speed or in the earth system shown in Fig. 1. See Xing et al. [18] for a discussion of coordinate systems. Mass and momentum conservation read:

$$\frac{\partial u_j}{\partial x_j} = 0 \qquad (1)$$

$$\frac{\partial u_i}{\partial t} + \frac{\partial (u_i u_j)}{\partial x_j} = -\frac{\partial p}{\partial x_i} + \frac{\partial}{\partial x_j}\left[\frac{1}{Re_{eff}}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right)\right] + s_i \qquad (2)$$

Equations are non-dimensionalized with an arbitrary characteristic velocity $U_0$ and length $L$ (typically the ship length). The dimensionless piezometric pressure is defined as $p = p_{abs}/\rho U_0^2 + z/Fr^2 + 2k/3$, with $p_{abs}$ the absolute pressure, $Fr = U_0/\sqrt{gL}$ the Froude number, and $1/Re_{eff} = 1/Re + \nu_t$ the effective Reynolds number, with $\nu_t$ the turbulent viscosity obtained from a turbulence model.

The free surface is identified as the zero level of the level set function $\phi$, obtained by solving the following transport equation:

$$\frac{\partial\phi}{\partial t} + u_j\frac{\partial\phi}{\partial x_j} = 0 \qquad (3)$$

Eqs. (1) and (2) are solved only in water, imposing in air appropriate equations that satisfy the dynamic free surface boundary conditions; this method is called a single-phase level set approach. The following conditions for velocity and pressure are enforced at the interface:

$$\left.\frac{\partial u_i}{\partial x_j}n_j\right|_{int} = 0 \qquad (4)$$

$$p_{int} = \frac{z_{int}}{Fr^2} \qquad (5)$$

Solving Eq. (4) at all points in air immediately satisfies the zero normal derivative of the velocity required at the interface, and also provides a convenient velocity field to transport the level set function in air. Eq. (5) is imposed geometrically.

Since the transport of the level set function does not maintain the property of being a distance function, periodic reinitializations are performed everywhere except at the interface by solving the following transport equation:

$$n_j\frac{\partial\phi}{\partial x_j} = \mathrm{sign}(\phi_0) \qquad (6)$$

with $\phi_0$ the level set function before reinitialization and $n_j$ the normal to the interface pointing into the fluid being reinitialized.


Turbulence is computed using a blended k–ε/k–ω model [19], where the turbulent kinetic energy $k$ and the specific dissipation rate $\omega$ are obtained from:

$$\frac{\partial k}{\partial t} + \left(u_j - \sigma_k\frac{\partial\nu_t}{\partial x_j}\right)\frac{\partial k}{\partial x_j} - \frac{1}{P_k}\frac{\partial^2 k}{\partial x_j^2} + s_k = 0 \qquad (7)$$

$$\frac{\partial\omega}{\partial t} + \left(u_j - \sigma_\omega\frac{\partial\nu_t}{\partial x_j}\right)\frac{\partial\omega}{\partial x_j} - \frac{1}{P_\omega}\frac{\partial^2\omega}{\partial x_j^2} + s_\omega = 0 \qquad (8)$$

where the turbulent viscosity and the effective Peclet numbers are defined as:

$$\nu_t = k/\omega,\qquad P_k = \frac{1}{1/Re + \sigma_k\nu_t},\qquad P_\omega = \frac{1}{1/Re + \sigma_\omega\nu_t} \qquad (9)$$

with $s_k = -G + \beta^*\omega k$ and $s_\omega = \omega(\beta^*\omega - \gamma G/k) - 2(1-F_1)\sigma_{\omega 2}\,(\partial k/\partial x_j)(\partial\omega/\partial x_j)/\omega$ the sources for $k$ and $\omega$, respectively, and $G = \nu_t s_{ij}\,\partial u_i/\partial x_j$ the kinetic energy production. $F_1$ is a blending function that switches from the k–ω model near the wall to the k–ε model in the free-stream region. The constants of the model are also a function of $F_1$. Zero normal gradient of both $k$ and $\omega$ is imposed at the free surface:

$$\left.\frac{\partial k}{\partial x_j}n_j\right|_{int} = \left.\frac{\partial\omega}{\partial x_j}n_j\right|_{int} = 0 \qquad (10)$$

Eqs. (4) and (10) are solved in air to transport the velocity, $k$ and $\omega$ from the interface into the air. This is called an extension with the normal, and it enforces the zero normal gradient of these quantities.

A detached eddy simulation (DES) model based on the k–ε/k–ω turbulence model is used to model regions of massively separated flow. The dissipative term of the $k$-transport equation is modified as [20]:

$$D^k_{RANS} = \rho\beta^* k\omega = \rho\frac{k^{3/2}}{l_{k\text{–}\omega}} \qquad (11)$$

$$D^k_{DES} = \rho\frac{k^{3/2}}{\tilde{l}} \qquad (12)$$

where the length scale is chosen from the length scale of the k–ω model and the characteristic grid size $\Delta$ as $\tilde{l} = \min(l_{k\text{–}\omega}, C_{DES}\Delta)$, with $l_{k\text{–}\omega} = k^{1/2}/(\beta^*\omega)$ and $C_{DES} = 0.65$. Details of the implementation of DES in CFDShip-Iowa version 4 are presented in [21]. An example of the regions operating under RANS and large eddy simulation (LES) is shown later, in the second application problem of this paper.
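As an illustration of Eqs. (11) and (12), the helper below evaluates the DES length scale. It is a sketch only: the text gives C_DES = 0.65, while the value β* = 0.09 is the standard model constant and is assumed here.

#include <math.h>

/* DES length scale, Eqs. (11)-(12): the RANS scale l_kw = sqrt(k)/(beta* w)
 * is limited by C_DES*Delta, with Delta the local characteristic grid size.
 * Illustrative sketch; BETA_STAR = 0.09 is an assumed (standard) constant. */
static const double BETA_STAR = 0.09;
static const double C_DES     = 0.65;

double des_length_scale(double k, double omega, double delta)
{
    double l_kw  = sqrt(k) / (BETA_STAR * omega);  /* RANS length scale   */
    double l_les = C_DES * delta;                  /* grid-based limiter  */
    return (l_kw < l_les) ? l_kw : l_les;          /* l~ = min(l_kw, ...) */
}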

3. Numerical methods

CFDShip-Iowa uses body-fitted structured multi-block grids with ghost cells and chimera interpolations in a finite difference approach to accommodate complex geometries and dynamic motions. The equations are first transformed from the curvilinear physical domain $(x, y, z, t)$ to the orthogonal computational domain $(\xi, \eta, \zeta, \tau)$. As an example, Eq. (2) transforms to:

$$\frac{\partial u_i}{\partial \tau} + \frac{1}{J}\,b^k_j\left(u_j - \frac{\partial x_j}{\partial \tau}\right)\frac{\partial u_i}{\partial \xi^k} = -\frac{1}{J}\,b^k_i\frac{\partial p}{\partial \xi^k} + \frac{1}{J}\frac{\partial}{\partial \xi^j}\left[\frac{b^k_l}{J\,Re_{eff}}\left(b^j_l\frac{\partial u_i}{\partial \xi^k} + b^i_l\frac{\partial u_j}{\partial \xi^k}\right)\right] + S_i \qquad (13)$$

The resulting convective terms are discretized using either upwind or upwind-biased schemes. Though linear (up to fourth-order) and TVD (second-order) methods are available in CFDShip-Iowa version 4 [22], a hybrid second-/fourth-order biased scheme is used for DES computations. Diffusion terms are discretized using second-order central differences. Time derivatives are evaluated using a second-order backward Euler scheme. Since CFDShip-Iowa is designed for high-Reynolds number flows, the transport and reinitialization equations are weakly elliptic, and pentadiagonal line solvers in an ADI scheme are efficient. This solution method is used for all equations, with the exception of Eq. (1).

Enforcement of the incompressibility condition, Eq. (1), leads to a Poisson equation for the pressure, obtained from either projection or PISO methods. The discretization results in a 19-point stencil. Since the pressure equation is strongly elliptic, ADI methods converge slowly. Instead, preconditioned Krylov methods from PETSc are used [23].
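A minimal sketch of a preconditioned Krylov solve of the pressure system with PETSc is shown below. The solver and preconditioner choices (GMRES, block Jacobi) and the function name are illustrative of the PETSc API, not CFDShip-Iowa's actual configuration; the matrix and right-hand side are assumed to be already assembled.

#include <petscksp.h>

/* Sketch: solve the pressure Poisson system A p = b with a preconditioned
 * Krylov method.  GMRES/block-Jacobi are illustrative choices only. */
PetscErrorCode solve_pressure(Mat A, Vec b, Vec p)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A); CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPGMRES); CHKERRQ(ierr);   /* Krylov method     */
  ierr = KSPGetPC(ksp, &pc); CHKERRQ(ierr);
  ierr = PCSetType(pc, PCBJACOBI); CHKERRQ(ierr);    /* preconditioner    */
  ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);      /* runtime overrides */
  ierr = KSPSolve(ksp, b, p); CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp); CHKERRQ(ierr);
  return 0;
}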

3.1. Motions

The 6DOF equations of motion of a ship are described by translations with respect to the initial static location of the center of gravity in the earth system, and by rotations in terms of the Euler angles, $\eta = (\eta_1; \eta_2) = (x, y, z; \phi, \theta, \psi)$, with the linear and angular velocities defined as $\nu = (\nu_1; \nu_2) = (u, v, w; p, q, r)$ [17]. $u$, $v$ and $w$ are the surge, sway and heave velocities, and $p$, $q$ and $r$ are the roll, pitch and yaw angular velocities in the ship system. The velocities in the ship system relate to the time rate of change of the Euler angles and spatial position through $\nu_1 = J_1^{-1}\dot{\eta}_1$ and $\nu_2 = J_2^{-1}\dot{\eta}_2$, with $J_1$ and $J_2$ transformation matrices depending on the Euler angles [24].
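For reference, the standard form of the linear-velocity transformation (as found in marine vehicle dynamics texts) is reproduced below; [24] may use an equivalent variant, so this is illustrative rather than the paper's exact definition:

$$\dot{\eta}_1 = J_1(\eta_2)\,\nu_1, \qquad J_1 = \begin{pmatrix} \cos\psi\cos\theta & -\sin\psi\cos\phi + \cos\psi\sin\theta\sin\phi & \sin\psi\sin\phi + \cos\psi\sin\theta\cos\phi \\ \sin\psi\cos\theta & \cos\psi\cos\phi + \sin\psi\sin\theta\sin\phi & -\cos\psi\sin\phi + \sin\psi\sin\theta\cos\phi \\ -\sin\theta & \cos\theta\sin\phi & \cos\theta\cos\phi \end{pmatrix}$$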

If the principal axes of inertia are aligned with the ship's axes, then the rigid-body equations of motion are:

$$
\begin{aligned}
m[\dot{u} - vr + wq - x_G(q^2+r^2) + y_G(pq-\dot{r}) + z_G(pr+\dot{q})] &= X\\
m[\dot{v} - wp + ur - y_G(r^2+p^2) + z_G(qr-\dot{p}) + x_G(qp+\dot{r})] &= Y\\
m[\dot{w} - uq + vp - z_G(p^2+q^2) + x_G(rp-\dot{q}) + y_G(rq+\dot{p})] &= Z\\
I_x\dot{p} + (I_z-I_y)qr + m\{y_G(\dot{w}-uq+vp) - z_G(\dot{v}-wp+ur)\} &= K\\
I_y\dot{q} + (I_x-I_z)rp + m\{z_G(\dot{u}-vr+wq) - x_G(\dot{w}-uq+vp)\} &= M\\
I_z\dot{r} + (I_y-I_x)pq + m\{x_G(\dot{v}-wp+ur) - y_G(\dot{u}-vr+wq)\} &= N
\end{aligned} \qquad (14)
$$

where $x_G = x_{rot} - x_{cg}$ is the distance from the center of rotation of the ship to the center of gravity, and the principal moments of inertia are:

$$I_x = I_{x_{cg}} + m(y_G^2 + z_G^2),\qquad I_y = I_{y_{cg}} + m(x_G^2 + z_G^2),\qquad I_z = I_{z_{cg}} + m(x_G^2 + y_G^2) \qquad (15)$$

Forces and moments are computed by integration of the forces on the solid surfaces, plus the forces and moments caused by the propeller models and the gravity force. The fluid and gravity forces and moments are first computed in the earth system, where the flow solution is available:

$$\mathbf{F}_e = \int_{ship}\left[\frac{\nabla\mathbf{u}+\nabla\mathbf{u}^T}{2Re} - \left(p - \frac{z}{Fr^2}\right)\mathbf{I}\right]\cdot d\mathbf{a}_e + m\mathbf{g} \qquad (16)$$

$$\mathbf{L}_e = \int_{ship}\mathbf{r}\times\left\{\left[\frac{\nabla\mathbf{u}+\nabla\mathbf{u}^T}{2Re} - \left(p - \frac{z}{Fr^2}\right)\mathbf{I}\right]\cdot d\mathbf{a}_e\right\} + \mathbf{r}_G\times m\mathbf{g} \qquad (17)$$

and then transformed into the ship system, where the propeller forces and moments are added:

$$\mathbf{F} = J_1^{-1}\mathbf{F}_e + \mathbf{F}_{prop} = (X, Y, Z) \qquad (18)$$

$$\mathbf{L} = J_1^{-1}\mathbf{L}_e + \mathbf{L}_{prop} = (K, M, N) \qquad (19)$$

Eq. (14) is integrated numerically using either an explicit or implicit predictor–corrector scheme.


The predictor scheme is explicit and uses the forces and moments at the end of the current time step to guess the 6DOF solution at the next time step. For any of the degrees of freedom $\varphi$ the predictor step solves:

$$
\begin{aligned}
\dot{\varphi}^n &= \dot{\varphi}^{n-1} + \Delta t\left(c_1\ddot{\varphi}^{n-1} + c_2\ddot{\varphi}^{n-2} + c_3\ddot{\varphi}^{n-3}\right)\\
\varphi^n &= \varphi^{n-1} + \Delta t\left(c_1\dot{\varphi}^{n-1} + c_2\dot{\varphi}^{n-2} + c_3\dot{\varphi}^{n-3}\right)
\end{aligned} \qquad (20)
$$

where the accelerations $\ddot{\varphi} = (\dot{u}, \dot{v}, \dot{w}, \dot{p}, \dot{q}, \dot{r})$ are computed by solving Eq. (14). The corrector step uses forces and moments computed in the current time step:

$$
\begin{aligned}
\dot{\varphi}^n &= \dot{\varphi}^{n-1} + \Delta t\left(c_1\ddot{\varphi}^{n} + c_2\ddot{\varphi}^{n-1} + c_3\ddot{\varphi}^{n-2}\right)\\
\varphi^n &= \varphi^{n-1} + \Delta t\left(c_1\dot{\varphi}^{n} + c_2\dot{\varphi}^{n-1} + c_3\dot{\varphi}^{n-2}\right)
\end{aligned} \qquad (21)
$$

In Eqs. (20) and (21) the constants $c_1$, $c_2$ and $c_3$ define the order of accuracy of the integration. CFDShip-Iowa currently has first-, second- and third-order accurate approximations.
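A minimal sketch of the predictor step, Eq. (20), for one degree of freedom follows. The paper does not list the actual coefficient values, so the third-order Adams–Bashforth set mentioned in the comment is only an assumed example.

/* Predictor step, Eq. (20), for a single degree of freedom.  The history
 * arrays hold values at time levels n-1, n-2, n-3.  The coefficients c[]
 * set the order of accuracy: e.g. (1,0,0) is first order and the
 * Adams-Bashforth set (23/12, -16/12, 5/12) is third order (assumed
 * example; CFDShip-Iowa's actual values are not given in the text). */
void predict_motion(double dt, const double c[3],
                    const double ddot_hist[3], /* accelerations   */
                    const double dot_hist[3],  /* velocities      */
                    double phi_prev,           /* position at n-1 */
                    double *dot_n, double *phi_n)
{
    *dot_n = dot_hist[0]
           + dt * (c[0]*ddot_hist[0] + c[1]*ddot_hist[1] + c[2]*ddot_hist[2]);
    *phi_n = phi_prev
           + dt * (c[0]*dot_hist[0]  + c[1]*dot_hist[1]  + c[2]*dot_hist[2]);
}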

Note that two modes of motion integration are possible. In explicit mode, at the end of a time step the motions for the next time step are predicted using Eq. (20), resulting in only one motion of the grids per time step. In implicit mode, the correction step described by Eq. (21) is added at the end of each non-linear iteration, with the resulting higher demand of grid motion and overset computations.

Fig. 2. Global solution strategy.

3.2. Overset implementation

Overset grids are handled by considering fringe points as boundary conditions within each block, and lagging the values of the variables at those points to the latest available. The overset domain connectivity information (DCI), which is used to connect the solution between overlapping grids, is obtained using Suggar [25]. The original process is described in detail in [17], with the modifications introduced for better performance described in the section on optimization efforts.
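The sketch below illustrates the fringe treatment just described: each fringe point receives an interpolated value from donor cells of an overlapping grid, using weights carried by the DCI. The data layout and names are hypothetical, not Suggar's actual interface.

/* Overset fringe update (sketch): each fringe value is a weighted sum of
 * donor-cell values, with weights from the domain connectivity
 * information.  Hypothetical data layout, not Suggar's actual API. */
typedef struct {
    int     n_donors;     /* number of donors for this fringe point */
    int    *donor_index;  /* flat indices of donor cells            */
    double *weight;       /* interpolation weights (sum to 1)       */
    int     fringe_index; /* flat index of the fringe point         */
} FringeStencil;

void update_fringes(double *field, const double *donor_field,
                    const FringeStencil *f, int n_fringe)
{
    for (int i = 0; i < n_fringe; ++i) {
        double v = 0.0;
        for (int d = 0; d < f[i].n_donors; ++d)
            v += f[i].weight[d] * donor_field[f[i].donor_index[d]];
        field[f[i].fringe_index] = v;  /* lagged to latest available data */
    }
}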

3.3. Waves

Waves are implemented through initial and boundary conditions for linear waves. Regular and irregular, unidirectional and multidirectional waves are available in CFDShip-Iowa, but only regular head waves are used and discussed here. The elevation of the wave at any time t is given by Eq. (22) below.



Fig. 3. Implicit solution procedure for one variable. Grey boxes indicate operations with intensive inter-processor communication.

Table 1. Details of the different test cases.

Case | Kvlcc1-coarse | Kvlcc1-fine | FSD Fr = 0.28 | PH Fr = 0.41 | FSD Fr = 0.41 | FSD Fr = 0.28 fine
Description | Self-propelled tanker ship in zig-zag maneuver [27] (6DOF) | Self-propelled tanker ship in turn maneuver [27] (6DOF) | Forward speed diffraction (FSD) [9] | Pitch and heave problem [17] (2DOF) | FSD problem [9] | Forward speed diffraction [9]
Number of base overset grids | 17 | 18 | 5 | 5 | 5 | 7
Max–min number of points in a grid (millions) | 0.12–0.7 | 0.12–4.1 | 6.2–23.9 | 0.73–30.4 | 4.8–13.7 | 6.2–36.4
Parent moving geometry | Hull | Hull | None | Hull | None | None
Child moving geometry | Rotating propeller, moving rudder | Rotating propeller, moving rudder | None | None | None | None
Number of processors | 49 | 129 | 276 | 388 | 499 | 500
Pressure solves | 1 (projection) | 5 (PISO) | 1 (projection) | 1 (projection) | 1 (projection) | 1 (projection)
Max. points/proc | 120,000 | 120,000 | 237,000 | 188,000 | 120,000 | 241,200
Total points (millions) | 5.6 | 13.1 | 61.3 | 70.6 | 55.1 | 114.1

Fig. 4. Effect of some important performance optimizations. Bars show the time per non-linear iteration (s) for the baseline, after parallel optimizations and after adding subroutine optimizations, for PH Fr = 0.41 (188K points/proc, projection) and Kvlcc1-fine (120K points/proc, PISO), split among momentum, level set, turbulence, pressure and waiting for Suggar.


$$\zeta(x,t) = a\cos(kx - \omega_e t + \phi) \qquad (22)$$

where $\phi$ is the phase, $a$ is the wave amplitude, $\omega_e$ is the encounter frequency and $k$ is the wave number. The encounter frequency in the ship reference system is given by $\omega_e = k + \sqrt{k}/Fr$. The velocities and pressure are then given by:

$$
\begin{aligned}
U(x,z,t) &= U_0 + \frac{a}{Fr}\sqrt{k}\,e^{kz}\cos(kx - \omega_e t + \phi)\\
V(x,z,t) &= V_0 + \frac{a}{Fr}\sqrt{k}\,e^{kz}\cos(kx - \omega_e t + \phi)\\
W(x,z,t) &= W_0 + \frac{a}{Fr}\sqrt{k}\,e^{kz}\sin(kx - \omega_e t + \phi)\\
p(x,z,t) &= \frac{a}{Fr^2}\,e^{kz}\left[\cos(kx - \omega_e t + \phi) - \frac{ak}{2}\,e^{kz}\right]
\end{aligned} \qquad (23)
$$
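A direct transcription of Eqs. (22) and (23) into a helper function is sketched below. Names are illustrative, and the V component is written exactly as printed in Eq. (23).

#include <math.h>

/* Linear incident wave field, Eqs. (22)-(23), in non-dimensional form.
 * a: amplitude, kw: wave number, Fr: Froude number, ph: phase;
 * (U0,V0,W0) is the base flow.  Illustrative transcription only. */
typedef struct { double elev, u, v, w, p; } WaveState;

WaveState linear_wave(double x, double z, double t, double a, double kw,
                      double Fr, double ph, double U0, double V0, double W0)
{
    WaveState s;
    double we  = kw + sqrt(kw) / Fr;           /* encounter frequency */
    double arg = kw * x - we * t + ph;
    double amp = (a / Fr) * sqrt(kw) * exp(kw * z);

    s.elev = a * cos(arg);                                      /* (22) */
    s.u = U0 + amp * cos(arg);                                  /* (23) */
    s.v = V0 + amp * cos(arg);       /* as printed in the text         */
    s.w = W0 + amp * sin(arg);
    s.p = (a / (Fr * Fr)) * exp(kw * z)
        * (cos(arg) - 0.5 * a * kw * exp(kw * z));
    return s;
}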

4. Solution strategy

All equations of motion are solved sequentially, as shown in Fig. 2. During the initialization stage, grids are read and split for parallel decomposition according to user directives. Each subgrid is expanded to add ghost cells and sent to a processor. This domain decomposition information is also sent to Suggar for use in parallel communications. For static cases (no motions) CFDShip-Iowa reads the overset DCI from a file only once at the start of the computation and Suggar is not launched. For cases with motions CFDShip-Iowa obtains the overset DCI at run time, as required during the solution time stepping.




Fig. 5. Weak scaling for several cases of varying difficulty. Time per point per non-linear iteration (s) vs. number of processors for Kvlcc1-coarse, Kvlcc1-fine, FSD Fr = 0.28, FSD Fr = 0.28 fine (500 procs), PH Fr = 0.41 and FSD Fr = 0.41 (499 procs); curves show total/10, turbulence, level set transport, level set reinitialization, momentum and pressure/10.


Non-linear (Picard) iterations are performed within each time step to properly couple turbulence, level set, motions and the non-linear convection terms in the momentum equations. At the beginning of a non-linear iteration CFDShip-Iowa obtains the overset information by receiving the data sent by the Suggar processes that run separately in parallel. Since this is a collective operation, CFDShip-Iowa has to wait until Suggar is ready to send the DCI. In the standard (no-lag) procedure, Suggar starts the computation at the end of the previous inner iteration (or time step), and thus the CFDShip-Iowa wait time is almost the time taken by Suggar to compute the overset DCI. Once this is done and the grids are moved to the current position, turbulence and level set are solved using the latest velocity available. Then the velocity is solved implicitly, the pressure matrix is assembled and the pressure solved. Afterwards the velocity is updated with the new pressure field to obtain a solenoidal velocity field. In a projection scheme the non-linear iteration is then complete, but if a PISO algorithm is used to couple pressure and velocity then some iterations are needed, in a loop marked in grey in Fig. 2. Once the flow field for the non-linear iteration is obtained, forces and moments are computed. If the residuals for the non-linear loop are acceptably small (typically $10^{-3}$) the solution is converged for the time step and the motions for the next time step are predicted using Eq. (20). With the new motions available, Suggar can initiate computation of the DCI for the new grid position. If the time step did not converge, the motions are corrected using Eq. (21) and Suggar is started for the new non-linear iteration n. Notice in Fig. 2 the synchronization points between CFDShip-Iowa and Suggar, shown as dashed lines. The no-lag procedure implies an essentially serial operation of Suggar and CFDShip-Iowa: CFDShip-Iowa needs to wait for Suggar to complete the domain connectivity before it can read the overset information, while Suggar needs the relative positions between grids before it can start the computations. For most coarse and medium grids, say below three million grid points, the cost is acceptable, since Suggar takes about 5–25% of the total computational time, depending on the complexity of the overset assembly.

Fig. 6. Average wall clock time per time step for overset grid assembly.

Fig. 7. Average memory usage for overset grid assembly.


Fig. 8. Grid used for the forward speed diffraction problem (61.3 million grid points). Every fifth point is shown.


For large cases Suggar may take longer than CFDShip-Iowa for a non-linear iteration and slow down the computation significantly. To overcome this problem a lagged mode was implemented, in which Suggar is launched immediately after the transfer of the overset DCI to CFDShip-Iowa is completed (the Read Overset Information box in Fig. 2). As shown in Fig. 2, in lagged mode the transfer of the motions to Suggar and the transfer of the overset DCI to CFDShip-Iowa occur simultaneously. Since the final motions are not yet available, the last available position of the grids is provided to Suggar. As the time step converges, the difference in grid position between non-linear iterations decreases, so that Suggar and CFDShip-Iowa use essentially (exactly, in the limit of perfect convergence) the same grid positions. Therefore, there is no penalty for using the implicit lagged mode if time step convergence is achieved. Tests have shown that for implicit mode and a typical number of non-linear iterations (4–6) the lag has negligible effect on the solutions.

If the time step is small or if a steady-state solution is sought, then an explicit update of the motions can be used. In this mode the time step is computed using the motions provided by the predictor with the forces from the previous time step.

Table 2. Details for test case 1.

Case | FSD, Fr = 0.28, λ/L = 1.5, ak = 0.025 | FSD, Fr = 0.28 fine, λ/L = 1.5, ak = 0.025
Time step | 0.003415 | 0.002277
Encounter frequency | 1.83 | 1.83
Number of time steps per wave period | 160 | 240
Number of base overset grids | 5 | 7 (adds refinement to the sonar dome and propeller plane)
Total points (millions) | 61.3 | 114.1
Number of processors | 276 | 500

The lagged mode can also be used in conjunction with the explicit mode, using motions from previous time steps to predict the grid positions at the current time, using for Suggar a first-order extrapolation of the form:

Fig. 9. Free surface contours at the four quarter periods for case 1.


Fig. 10. Zeroth (top) and first (bottom) harmonics of the free surface elevation for case 1.


$$\varphi^{t+1} = \varphi^t + \dot{\varphi}^t\,\Delta t \qquad (24)$$

where $\dot{\varphi}^t$ is the rate of change of any of the motion degrees of freedom at time $t$, linear or angular. In this case Suggar has time to complete the overset assembly in the span it takes CFDShip-Iowa to perform a complete time step. For the explicit mode the error grows with the local non-linearity of the motion. If the velocity $\dot{\varphi}^t$ does not change between time steps (zero acceleration), then the approximation is exact and there is no penalty. If the acceleration (unknown a priori in an explicit scheme) is non-zero, then a first-order approximation of the exact position is given by:

$$\varphi_e^{t+1} = (\ddot{\varphi}^t\,\Delta t + \dot{\varphi}^t)\,\Delta t + \varphi^t \qquad (25)$$

and an estimate of the relative position error is:

$$\frac{\varphi_e^{t+1}}{\varphi^{t+1}} = 1 + \frac{\ddot{\varphi}^t\,\Delta t^2}{\dot{\varphi}^t\,\Delta t + \varphi^t} \qquad (26)$$

which shows that the position of the grid sent to Suggar is exact if the acceleration is zero, and that the error decreases with the square of the time step for non-zero acceleration. If the expected acceleration is small, an explicit scheme with a small time step will work well. If the acceleration is large, a second-order algorithm could be used to predict the next position, using the acceleration available from previous time steps. An explicit algorithm fails if the acceleration is discontinuous in time, as in slamming problems, since previous values of velocity and acceleration cannot be used to predict the next time step. In these cases, however, implicit algorithms work well.
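The two estimates above are straightforward to evaluate; the sketch below does so for one degree of freedom and can be used to verify numerically that halving the time step roughly quarters the relative position error.

/* First-order explicit position prediction sent to Suggar, Eq. (24),
 * and the associated relative position error, Eq. (26) minus one.
 * Purely a numerical illustration of the formulas in the text. */
double predict_position(double phi, double phidot, double dt)
{
    return phi + phidot * dt;                          /* Eq. (24)     */
}

double relative_position_error(double phi, double phidot,
                               double phiddot, double dt)
{
    return (phiddot * dt * dt) / (phidot * dt + phi);  /* Eq. (26) - 1 */
}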

The implicit solution procedure for each variable (with the exception of the pressure) is shown in Fig. 3. The matrix and solver work in the local processor, treating overset fringe points and ghost cells as boundary conditions. At the end of a linear iteration, information is exchanged between grids and convergence is evaluated. These two communication routines account for most of the inter-processor communication in CFDShip-Iowa, with the exception of the solution of the pressure Poisson equation, which is performed with PETSc handling the communication.

5. Optimization efforts

Efforts to improve the speed of CFDShip-Iowa in high-performance computing (HPC) environments involved optimization of serial routines, parallel optimizations, parallel I/O, and improved Suggar algorithms. Detailed descriptions of the optimization efforts are provided in [26]; only an overview is included here.

Extensive profiling was performed to identify bottlenecks in the in-processor and inter-processor procedures, using the PETSc facility of user-defined events. Several offending routines were detected and re-coded. Parallel optimizations include the elimination of unnecessary MPI barriers, the use of collective MPI instructions whenever possible, the introduction of the already discussed lagged mode, and parallel I/O.
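As an example of the kind of change involved, the sketch below reduces a convergence residual with a single collective call, which replaces a hand-written loop of sends and receives plus a broadcast and needs no explicit barrier. It is illustrative only, not CFDShip-Iowa source.

#include <mpi.h>

/* Global residual via one collective: every rank contributes its local
 * residual and receives the maximum over all ranks. */
double global_residual(double local_res, MPI_Comm comm)
{
    double global_res = 0.0;
    MPI_Allreduce(&local_res, &global_res, 1, MPI_DOUBLE, MPI_MAX, comm);
    return global_res;
}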

Memory usage was also improved to allow scalability on distributed memory machines. A PDE-based reinitialization was added as a scalable alternative to geometric reinitialization, which requires the full geometry to be available on all computing nodes and was thus problematic for large cases.
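A one-dimensional caricature of the PDE-based reinitialization, Eq. (6), is sketched below: marching away from the interface restores a unit slope on each side, while the points adjacent to the interface are left untouched so the interface does not move. The production scheme is three-dimensional and uses ADI line solvers; this sketch only conveys why the approach needs no global geometry information.

/* 1-D sketch of PDE-based reinitialization, Eq. (6): n dphi/dx = sign(phi0).
 * Assumes phi <= 0 for i <= i0 (water) and phi > 0 for i > i0 (air), with
 * the interface between i0 and i0+1; those two points keep their values. */
void reinit_1d(double *phi, int n, int i0, double dx)
{
    for (int i = i0 + 2; i < n; ++i)   /* sign(phi0) = +1 side */
        phi[i] = phi[i - 1] + dx;
    for (int i = i0 - 1; i >= 0; --i)  /* sign(phi0) = -1 side */
        phi[i] = phi[i + 1] - dx;
}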

Details of all cases used to evaluate performance are shown in Table 1, in which problems with static and dynamic overset grids are included, ranging from 5.6 to 114.1 million grid points. Two geometries are used: the surface combatant model DTMB 5512 for the forward speed diffraction and pitch and heave (PH) problems, and the tanker Kvlcc1 for maneuvering simulations.

A measure of the effect of the optimizations described previously is shown in Fig. 4, where all timing measurements are without I/O (but the time to read the overset DCI in the old approach is included). Overset optimizations are shown in dark bars and discussed later. PH Fr = 0.41 uses a projection approach to solve the pressure equation, resulting in only one solution of the pressure equation per non-linear iteration (see also Table 1). The Kvlcc1-fine case [27] uses a PISO approach that solves the pressure equation five times per inner iteration. This is a more complex case and is more expensive to solve, since it has a rotating propeller, a moving rudder and a highly unsteady solution, and it illustrates the price to pay for a more robust algorithm. Fig. 5 shows the average time per grid point to solve one non-linear iteration for the main sections of the code (see Fig. 2), for all cases presented in Table 1. All runs were performed on an SGI Altix 4700 with 500 processors per node available to the users. As shown by the figure, weak scaling is fairly good, even for different problems and ratios of grid points per processor of up to 2:1. The Kvlcc1-fine case is an exception, as previously discussed. The second exception is FSD28 fine, with 500 processors, which could be affected by saturation of the capacity of the 500-processor node of the SGI Altix 4700. An obvious conclusion from Figs. 4 and 5 is that at this point performance is mainly limited by the solution of the pressure equation.

Optimizations of the overset assembly code Suggar [25] also had a significant impact on performance. In [17], Suggar was run as a separate process, with communication and synchronization between codes via UNIX named pipes and connectivity transfer via a file. This approach is inefficient because the Suggar and CFDShip-Iowa processes must execute on the same computational node, which can seriously limit performance due to memory limitations, CPU sharing, and the time for writing and reading the overset DCI via a file. In the current implementation CFDShip-Iowa can execute Suggar as a library on a dedicated MPI rank, with communication between the flow solver ranks and the Suggar rank via MPI.
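A minimal, hypothetical sketch of the rank layout follows: the last rank of MPI_COMM_WORLD is reserved for the grid assembly code while the remaining ranks run the flow solver, each group getting its own communicator. The actual CFDShip-Iowa/Suggar interface is more elaborate.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, color;
    MPI_Comm subcomm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    color = (rank == size - 1) ? 1 : 0;  /* 1 = overset assembly rank */
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &subcomm);

    if (color == 1)
        printf("rank %d: overset assembly (DCI) service\n", rank);
    else
        printf("rank %d: flow solver\n", rank);
    /* DCI and motions would be exchanged between the two groups with
     * point-to-point MPI calls on MPI_COMM_WORLD. */

    MPI_Comm_free(&subcomm);
    MPI_Finalize();
    return 0;
}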


Several options have been added to Suggar to improve the execution speed for moving body problems, described in detail in [26]. The improvements are (1) limitation of the grids to be searched for donors to the ones that are acceptable, manually excluding grids with bad quality donors that would be rejected anyway during the overset assembly process; (2) addition of special management for Cartesian grids, allowing much faster processing and less memory usage; (3) reutilization of donors found in the previous computation to initialize the donor search, which allows for some CPU time reduction at the expense of additional memory usage; (4) pre-assembly of dynamic groups, in which sets of grids that move rigidly together during the motions are grouped in advance; and (5) use of shared-memory parallel threads, in which some parts of the grid assembly process are parallelized by distributing the work within a grid among the threads.

The effectiveness of the described approaches in reducing the wall clock time for the overset grid assembly is highly dependent upon the particular grid system, the specific motion, and the execution environment. The performance is examined in Fig. 6 for the three different grid systems shown in Table 1, with the total wall clock time shown as a bar subdivided according to the time required for the different steps in the grid assembly process. The times presented are an average over a number of time steps. The black portion of the bar shows the time required to perform the hole cutting, dark gray is for the donor search, light gray for overlap minimization, and the white portion of the bar is the time for all other operations performed during the time step. Each bar along the vertical axis represents a different case with a different set of execution options. The lowest bar, labeled "Base", is the baseline set of input options, which includes the use of overlap minimization. The "Base + Reuse" bar adds donor reuse and shows that it is effective in reducing the time for the donor search and the overlap minimization. The "Base + Pre + Reuse" bar adds pre-assembly and shows that it is only marginally effective in reducing the time for the donor search. The relatively small decrease with pre-assembly is attributed in part to the fact that the input file specified explicit donor grids to use or exclude, so that the pre-assembly benefit is partially included in the baseline inputs. The upper three bars enable the use of two, three, and four threads of execution. In general, the threads are effective in reducing the wall clock time, with significant variability depending upon the case. For PH Fr = 0.41, case 2 in the application examples section, execution time is dominated by a single large grid. The hole cut and donor search are parallelized by spreading the work in a single grid across the threads, while the overlap minimization assigns a single thread to each grid.

Fig. 11. Free surface and axial velocity sections at the four quarter periods for case 1.


The time for Suggar to write and CFDShip-Iowa to read the domain connectivity file is now avoided by directly transferring the connectivity to CFDShip-Iowa via MPI calls, resulting in more than 20% savings in wall clock time for a large case such as PH Fr = 0.41.

The memory overheads of the different options are illustrated in Fig. 7, where the horizontal bars are for the same set of cases. The usage is presented for the three different grid systems and only one execution environment. Donor reuse requires an increase in memory usage that is somewhat reduced by the use of pre-assembly.

6. Applications

As a demonstration of the level of solutions that can be expected with high-resolution grids, two cases are run that have been previously studied with coarser grids, one with static and the other with dynamic overset grids. In both cases the geometry is the surface combatant model DTMB 5512, chosen as a benchmark by the ship hydrodynamics community, tested in the towing tanks at DTMB, IIHR (Iowa) and INSEAN (Italy), and selected as a test case for code validation at the ship hydrodynamics CFD workshops in Gothenburg 2000 (steady state) and Tokyo 2005 (steady and unsteady state [11]).

6.1. Case 1: forward speed diffraction at Fr = 0.28, λ/L = 1.5, ak = 0.025

For the forward speed diffraction problem there are no motions involved, and thus a static overset approach is used in which the overset assembly is a pre-processing step. This case was part of the Tokyo 2005 ship hydrodynamics CFD workshop [11], where four respondents presented numerical solutions. High-resolution results are compared against experimental data taken at the IIHR towing tank and reported for forces and moment [28], free surface [29], and nominal wake plane velocities and turbulence [30], and against previous numerical solutions on coarser grids [9,10]. In [9] the forward speed diffraction problem was studied at Fr = 0.28 for λ/L = 1.5 using CFDShip-Iowa version 4 with a 3.3 million point grid. The case was subject to a validation and verification study against experimental data for forces and moment and free surface elevation at two longitudinal sections. A complete description of the conditions for this case is provided in [9,10].

Two simulation grids are used. The first is designed to capture details of the free surface, including refinement blocks at the bow and stern. Since the problem is symmetric with respect to y = 0, only half of the domain is simulated. The total number of grid points is 61.3 million, distributed as shown in Fig. 8. Since this grid is very fine, a DES approach was used to compute the turbulence, along with a fourth-order upwind-biased scheme for the convection terms. Notice that the symmetry boundary condition imposes an unrealistic restriction on lateral fluctuations in the resolved LES regions, mostly the wake of the ship. However, upstream of the stern the symmetry boundary condition will have a small effect, since the lateral fluctuations are small (see [30], showing experimental measurements of v'v' on the nominal wake plane), with the exception of the boundary layer, where RANS is active. Time step, total number of processors and other details are summarized in Table 2. As a reference, the time step used in [9] results in 80 time steps per wave period. The second grid adds two refinement blocks, one at the sonar dome and a second located near the hull to capture vortex detachment and transport and to better predict the flow on the nominal wake plane x = 0.935. Due to cost, this case was computed only for a few wave periods, and all results presented correspond to the 61.3 million point grid except where noted. Note that the computations presented in this paper are performed on grids much finer than the fine grid presented in [9], and the current grids are not systematically refined from those in [9]. A verification study with the fine grids presented in this work would be very expensive and has not been performed. However, the consistency of the results with the coarser grids in [9] and with CFD results from others [6–8], and the improvement compared to experimental data [28–30], give some degree of confidence in the results.

Fig. 12. Some vortical structures identified for case 1.


6.1.1. Free surface

The benefits of a high-resolution computation can be observed in the free surface, shown in Fig. 9 for the four quarter periods, with t/T = 0 defined when the wave crest is at x/L = 0. This figure can be compared against computations with two million grid points and experimental results [10]. Notice the small overturning bow wave at t/T = 0 and t/T = 1/4, which is impossible to capture without a local grid finer than these features. The bow wave causes surface scars clearly visible at t/T = 1/4, created by vorticity generated by the overturning waves and observed for this ship in steady state at Fr = 0.35 [31]. A very steep shoulder wave develops at t/T = 0 and spills over at t/T = 1/4, a phenomenon of very small scale that can be observed visually in the towing tank experiment. The free surface at the transom exhibits a highly turbulent nature, as observed in the experiments, with overturning waves showing at t/T = 0 and 3/4. Though there are improvements in the far-field flow with respect to coarser grid computations, the enhancements are less impressive, since waves far from the ship are of smaller steepness and amplitude. The zeroth and first harmonic amplitudes of the free surface elevation are compared against experimental data in Fig. 10; due to cost, only three wave periods with periodic motions were available and used in the CFD to perform the FFT analysis. The agreement with the experimental data is excellent, much better than the results presented in [9], with a very good match in the amplitudes of both harmonics and only slight evidence of phase error in the far field.

Fig. 13. Zeroth harmonic of U, V, W and ω_x (top to bottom) at the nominal wake plane, case 1.

Fig. 14. First harmonic of U, V and W (top to bottom) at the nominal wake plane for case 1.


Notice that the first harmonic amplitude in the transom region evidences fluctuations in the elevation that have not been averaged out by using only three periods to perform the FFT. The level of the first harmonic amplitude in the region, however, is still in good agreement with the data. An animated view of the free surface evolution can be found in Electronic Annex 1 of the online version of this paper.

6.1.2. Boundary layer

Figs. 11 and 12 are prepared using results from the finest grid of 114.1 million grid points. Fig. 11 shows the boundary layer and the free surface at the four quarter periods, with t/T = 0 defined when the wave crest is located at the nominal wake plane x/L = 0.935. The boundary layer is depicted showing cross sections of axial velocity limited to U = 0.95. See Electronic Annex 2 of the online version of this article for an animated version of Fig. 11. The phenomena describing the generation and transport of the vortices observed in Fig. 11 have been discussed in [9], and the same discussion applies here, but the better resolution unveils additional phenomena.

Fig. 15. Time history of resistance force, heave force and pitching moment coefficients for case 1.

Table 3. Comparison of zeroth and first harmonic amplitudes and first harmonic phase of the resistance and heave forces and pitching moment coefficients for case 1. In parentheses are the differences with the experimental data. Uncertainties are non-dimensionalized with the first harmonic amplitude, or with 180° for the phase.

Quantity | This study | Medium resolution [9] | Experimental data and uncertainty | Deng et al. [6] | Hochbaum and Pierzynski [7] | Luquet et al. [8]
C_T,0 | 0.0046 (−0.45%) | 0.00464 (0.45%) | 0.00462 (1.08%) | 0.00516 | 0.0049 | 0.0048
C_T,1 | 0.00622 (2.3%) | 0.00655 (7.2%) | 0.00608 (3.83%) | 0.00638 | 0.00676 | 0.0071
γ_T,1 | −46.5° (−9.3%) | −20.9° (−23.6%) | −63.3° (7.24%) | −63.0° | −25.9° | −23.0°
C_H,0 | −0.0338 (1.2%) | −0.0338 (1.2%) | −0.0334 (12.3%) | −0.0315 | −0.0292 | −0.0335
C_H,1 | 0.0379 (6%) | 0.0401 (−15%) | 0.03574 (3.18%) | 0.0351 | 0.0389 | 0.0415
γ_H,1 | −127° (−7.2%) | −103° (−20.6%) | −140° (12.5%) | −144° | −105° | −102°
C_M,0 | −5.9 × 10⁻⁴ (−3%) | −5.7 × 10⁻⁴ (−6.3%) | −6.08 × 10⁻⁴ (2.8%) | −6.0 × 10⁻⁴ | −7.2 × 10⁻⁴ | −10.5 × 10⁻⁴
C_M,1 | 0.0117 (−8.3%) | 0.0122 (−13%) | 0.0108 (4.25%) | 0.0118 | 0.0130 | 0.0124
γ_M,1 | −40.8° (−10%) | −22.6° (−20.1%) | −58.8° (4.64%) | −63.0° | −22.2° | −21.0°

Vortices are created at the sonar dome and are transported downstream and up and down by the flow and the waves. The interaction of the waves with the hull causes the boundary layer to bulge as it is transported up and down. At the nominal wake plane, the boundary layer is thickest when a trough of the wave passes through the plane (t/T = 1/2). This level of resolution permits the identification of downward jets generated at the center plane when the downward velocity caused by the waves is near its maximum, evidenced as vertical spikes in the boundary layer (at t/T = 0 the maximum strength occurs around x/L = 0.55). The downward velocity and jets are stronger where the hull has a V-shape (toward the bow) than where the hull is flat-bottomed.

Vortical structures, as represented by isosurfaces of Q = 30 corresponding to the quarter period t/T = 3/4 (with respect to the nominal wake plane), are presented in Fig. 12. This corresponds to a phase t/T = 1/3 measured from the bow. The vortices that separate from the sonar dome interact with each other, showing a breakdown into hairpin vortices at x/L = 0.75. This vortex breakdown occurs as the interior sonar dome vortex coils around the stronger exterior sonar dome vortex.



Fig. 16. Free surface contours at quarter periods (case 2).

Table 4. Details for test case 2.

Case | Pitch and heave, Fr = 0.41, λ/L = 1.5, ak = 0.025
Time step | 0.003422
Encounter frequency | 1.4611
Number of time steps per wave period | 200
Motions scheme | Implicit, lagged
Number of base overset grids | 5
Total points (millions) | 70.6
Number of processors | 392


The occurrence can be seen at all quarter periods in Fig. 11 as a strong distortion of the boundary layer where the interior vortex collapses.

The transom flow is dominated by the vortical structures generated by the sudden expansion of the flow. The vortical structures in Fig. 12 are colored with axial velocity, showing very low velocities near the transom and acceleration downstream. A large roller deforms into hairpin-like vortices, as discussed in [32]. These vortices are transported and break down, generating fairly small structures. The generation of vortices associated with breaking waves [31] is also clear in Fig. 12 for the bow, shoulder and transom waves; see also Fig. 9. Though very small at this Froude number, these vortices can be resolved with the current level of refinement. See Electronic Annex 3 of the online version of this article for an animation of the vortical structures.

It is clear that much more of the turbulence in the flow is now being resolved compared to the medium resolution computations of [9]. This is particularly noticeable in the transom flow and the breaking waves, but also in the stern region where the geometry opens rapidly.

Zeroth and first harmonics of the velocity components at the nominal wake plane (x/L = 0.935) are shown in Figs. 13 and 14, respectively. Fig. 13 also shows the first harmonic of the axial vorticity. The circulation, evidenced by the hook-shaped deformation of U in Fig. 11, is apparently overpredicted, exactly the opposite behavior reported for the second-order RANS computations of [9] on a medium resolution grid. The first harmonic of U (Fig. 13) is slightly overestimated, but in much better agreement with the data than the computations in [9].

The V and W velocities compare much better with previous computations in both the zeroth and first harmonics. The agreement of V with experiments is excellent, though the first harmonic is slightly overestimated. Overestimation of the first harmonic amplitude was observed also in U but not in W. In the case of W, both the zeroth and first harmonic amplitude predictions are excellent.

The axial vorticity (Fig. 13) shows a very good qualitative agreement with the considerably noisy data, exposing a cell with ω_x = −2 and a strong cell with positive axial vorticity that reaches a maximum of about ω_x = 8 for CFD and ω_x = 9 for EFD.

6.1.3. Forces and moment

Resistance force, heave force, and pitching moment coefficients are shown in Fig. 15, compared with CFD results from [9], other respondents to the Tokyo workshop [6–8] and experimental data. A summary of the harmonic analysis is shown in Table 3. No significant improvements over the predictions of [9] are observed for the zeroth harmonics of any of the quantities, which match the data very well in both cases.

The first harmonic amplitudes are significantly improved, though the heave force is still overpredicted by 6% and the pitching moment underpredicted by 8.3%. Phase errors also show a significant improvement with respect to [9], but still lead by about 15°. All but one of the respondents to the Tokyo CFD workshop also reported significant phase leads. It is interesting to note that the small-amplitude, low-frequency fluctuations reported in [9], likely due to wave reflection at the boundaries, are also present in this computation.

6.2. Case 2: pitch and heave at Fr = 0.41, λ/L = 1.5

The case of the surface combatant pitching and heaving in regular head waves is presented in this section. This problem was studied in [17] for two speeds (Fr = 0.28 and 0.41) and two wave amplitudes (ak = 0.025 and 0.075) with a wavelength λ/L = 1.5. Herein efforts are concentrated on the case with Fr = 0.41, ak = 0.025 and λ/L = 1.5, run in [17] with three million grid points, named from here on the medium resolution results, while the present computations are called the high resolution results. Experimental data of motions at several Froude numbers and wave lengths and amplitudes, taken at IIHR, include uncertainty analysis [33].

Since the Froude number is high, extensive wave breaking occurs at the free surface, in addition to radiation and diffraction phenomena; thus the grid is refined to capture wave breaking and the associated vorticity generation and transport. The grid is symmetric about y = 0, and includes refinement blocks at the bow and stern. Details of the grid and conditions are shown in Table 4, with the time step corresponding to half of the fine time step used in [17]. Since the grid is very fine, the same DES approach as in case 1 was used.

As in case 1, no verification study was performed. Three grids were used in [17], where a validation and verification study was performed. Here a much finer grid than the fine grid in [17] is used, with no systematic refinement, so no conclusions on uncertainty can be drawn. However, the results are closer to the experimental data, showing that the use of fine grids, which allows the use of a DES approach, tends to improve the quality of the computations.


6.2.1. Free surface

Contours of free surface elevation are shown in Fig. 16. The phase is taken as zero when the wave peak is at the ship bow. Massive breaking can be seen in the bow and stern waves, and in the shoulder waves at t/T = 1/2. In contrast, no breaking waves were captured with the medium resolution grid used in [17]. The generation of large radiation waves is clear as the free surface pushes water out when the flared bow comes down, creating strong breaking waves at t/T = 0 and 1/4, and pulling water in at t/T = 1/2 and 3/4. A splash-up occurs at t/T = 1/4, evidenced by the presence of two free surface scars. A steep shoulder wave begins to form at t/T = 1/4 and by t/T = 1/2 is spilling over. Very strong breaking occurs in the stern wave, most notably for t/T = 1/2 and 3/4, when the wave crest is in the stern rooster tail. The transom flow looks highly turbulent, with small-scale structures being shed periodically and affecting the free surface. Electronic Annex 4 of the online version of this article shows an animation of the free surface and vortical structures.

Fig. 17. U, V and W (top to bottom) at the nominal wake plane for the four quarter periods, case 2. The plots for t/T = 1/2 and t/T = 3/4 are displaced vertically by Δz = −0.03.

6.2.2. Boundary layer

Fig. 17 shows isocontours of the instantaneous velocity components U, V, and W at the nominal wake plane for the four quarter periods, while the axial vorticity is presented in Fig. 18. Significant vorticity is generated by the vertical motion while the ship pitches and heaves. The level of distortion of the boundary layer (evidenced by the U contours in Fig. 17) is much higher than in the forward speed diffraction problem, due to the effect of the motions, but the boundary layer is thinner on average, since the Froude number is higher than in case 1. The level of axial velocity at the propeller location fluctuates between 0.75 and 1 during a wave period, which will cause significant variations in propeller load.

6.2.3. Turbulent structures

Turbulent structures, depicted as isosurfaces of Q = 30, and the free surface at the four quarter periods are shown in Fig. 19, and in animated form in Electronic Annex 5 of the online version of this article. The isosurfaces are colored with axial velocity. The sonar dome vortices are generated continuously as the ship pitches up and down, but the vertical motion of the bow and the changing angle of incidence of the flow split the vortices in two: one that remains closer to the hull and one that detaches. The vortices that remain closer to the hull eventually reach the stern and merge with the complex transom flow. The detached vortex dissipates rapidly as it passes the stern due to the grid coarsening there.
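
The quantity Q used for these isosurfaces is the second invariant of the velocity gradient tensor, Q = (ΩijΩij − SijSij)/2, positive where rotation dominates strain. The sketch below evaluates it on a rectilinear block, a deliberate simplification since the grids in this work are curvilinear overset; the function name and grid assumptions are illustrative.

import numpy as np

def q_criterion(U, V, W, x, y, z):
    # Q = 0.5*(Omega_ij*Omega_ij - S_ij*S_ij), with S and Omega the
    # symmetric and antisymmetric parts of the velocity gradient.
    # U, V, W are 3D arrays on a rectilinear grid with 1D
    # coordinates x, y, z (an illustrative simplification).
    grads = [np.gradient(c, x, y, z) for c in (U, V, W)]  # grads[i][j] = du_i/dx_j
    Q = np.zeros_like(U)
    for i in range(3):
        for j in range(3):
            S = 0.5 * (grads[i][j] + grads[j][i])  # strain-rate part
            O = 0.5 * (grads[i][j] - grads[j][i])  # rotation-rate part
            Q += 0.5 * (O * O - S * S)
    return Q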

Fig. 18. ωx at the nominal wake plane for the four quarter periods, case 2. The plots for t/T = 1/2 and t/T = 3/4 are displaced vertically by Δz = −0.03.


Fig. 19. Turbulent structures colored by U velocity at the four quarter periods for case 2.


In this case, the breaking bow waves create strong vorticity that is transported downstream. These structures are more complex than those reported by Wilson et al. [31] for bow breaking waves, probably due to a more complex unsteady breaking.

The transom flow appears richer than that of case 1, with abundant structures, and also evidences the presence of hairpin vortices.

The regions where the RANS and LES models are active at t/T = 0 are shown in Fig. 20. Fig. 20 also shows cross sections of axial velocity limited to U = 0.99 contours from x/L = 0 to 0.95, depicting the boundary layer. The blue transparent isosurface shows the region where the characteristic length predicted by RANS equals the DES characteristic length, defining the surface separating the domains operating under the RANS and LES models, with regions inside the volume operating under LES. While the very near wall is modeled by RANS, vast portions of the domain are modeled with LES, including all regions of strong separation and fine grid. Most notable are the wake of the sonar dome, the separation vortices under the hull, the expansion region of the hull downstream of x/L = 0.8, the wake downstream of the transom stern, and the areas near the free surface where the bow wave breaks, as well as the vortices generated by the breaking wave during the previous period, located immediately downstream of the shoulder wave (see also Fig. 19 at t/T = 0). It is noticeable how the LES model is in some regions restricted to the refinement blocks, such as in the transom refinement. Fig. 20 essentially shows that, though the grid is fine by today's standards, most of the domain is still modeled with RANS.
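
The RANS/LES boundary visualized in Fig. 20 is determined by comparing the turbulence length scale predicted by the RANS model with a DES length scale proportional to the local grid spacing. Assuming an SST-based DES with the usual constants (an assumption for illustration; see [19,20] for the actual models), the switch criterion can be sketched as follows.

import numpy as np

def les_active(k, omega, dx, dy, dz, c_des=0.61, beta_star=0.09):
    # LES mode is active where the RANS length scale
    # sqrt(k)/(beta_star*omega) exceeds C_DES*Delta, with Delta the
    # largest local cell dimension. Constants are common SST-DES
    # values, assumed here for illustration.
    l_rans = np.sqrt(k) / (beta_star * omega)
    delta = np.maximum(np.maximum(dx, dy), dz)
    return c_des * delta < l_rans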


Fig. 20. Cross sections colored with axial velocity and limited to U = 0.99, depicting the boundary layer, and limiting isosurface in transparent blue, indicating the domains of operation under the RANS and LES models (case 2).

Fig. 21. Time histories of pitch and heave motions for case 2, including medium and fine resolution and experimental results.


6.2.4. Motions

Fig. 21 shows time histories of pitch and heave motions compared against experimental data and the medium resolution results; see also Table 5.

Table 5
Pitch and heave amplitudes and phases for case 2.

                  This study        Medium resolution [17]   Experimental data
Mean sinkage      0.0043 (−6.5%)    0.0043 (−6.5%)           −0.0046
Heave amplitude   0.00855 (4.3%)    0.00772 (−5.8%)          0.0082
Heave phase       −9.3° (6.2%)      −10.2° (5.7%)            −20.4°
Average trim      0.54° (12.5%)     0.59° (22.9%)            0.48°
Pitch amplitude   1.61° (2.5%)      1.49° (−5.1%)            1.57°
Pitch phase       −146.1° (5.0%)    −143.7° (6.3%)           −155.1°

The pitching angle shows very similar results to those of the previous computations, except for a slightly better phase angle and a larger first harmonic amplitude. The 0th harmonic of the pitching angle, or mean trim, is also closer to the experiments. The first harmonic amplitude is closer to the experiments, but is now slightly overpredicted.

For heave motions the differences with the medium resolution results are also slight. The 0th harmonic amplitude (or mean sinkage) is coincident for the high and medium resolution results, and underpredicts the experimental value. The first harmonic amplitude computed with the fine grid is overpredicted, compared to an underprediction for the medium grid. Both the medium and high resolution results lead the experimental phase by about 10°.

As in the case of the forces for case 1, only slight improvements in the motion results are observed when going from the medium to the high resolution grid; these are not as spectacular as the improvements observed in the free surface and flow structures, considering that the fine grid has over 20 times more grid points than the medium grid.

7. Conclusions

High-resolution DES computations of two unsteady flows around the surface combatant model DTMB 5512 are presented.



The first case is the forward speed diffraction problem, which uses static overset grids, and therefore Suggar is used as a pre-processing step. The second case is pitch and heave in regular head waves, where dynamic overset is used. In both cases dramatic improvements with respect to the medium resolution grids of around three million points were observed for the free surface, vortical structures, and other local flow features. Only slight improvements (less than 10%) can be reported for integral quantities (forces, moments, and motions) on the high resolution grids, as already noticed by these and other authors when comparing grids of different resolutions.

This level of refinement opens the opportunity to resolve large areas of the flow using LES, either with a DES approach or with a more complex hybrid RANS/LES. In addition, the capability to run very large grids raises the prospect of reaching the asymptotic range in verification studies of complex flows, though iterative convergence may limit the benefit; see [18] for a discussion of the subject. Though rich turbulent structures were resolved, analysis of the turbulence and flow physics is left for future work. Future work also includes the addition of surface tension to improve the simulation capability for small-scale curvatures of the free surface.

Other areas of naval architecture can gain greatly from faster, more scalable codes capable of resolving smaller scales. Direct CFD maneuvering, demonstrated for propellers modeled by body forces and by direct discretization [27,34], shows interesting potential, but the computations are very expensive. Forces and moments in maneuvers with large drift or roll appear to be more sensitive to modeling issues, and could benefit from higher resolution computations [35,36]. The computational challenges (in terms of memory, CPU time, and I/O) in these high resolution cases need to be addressed at the algorithmic, architectural, and implementation levels. We expect to further optimize the per-processor performance and scalability of our code CFDShip-Iowa in subsequent versions on contemporary parallel platforms.

Acknowledgements

This work is sponsored by the US Office of Naval Research through research grants N00014-01-1-0073 and N00014-06-1-0474, administered by Dr. Patrick Purtell. Computations were performed at the DoD NAVO and AFRL HPC centers, and at the NASA Advanced Supercomputing center.

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.compfluid.2010.02.002.

References

[1] Yang J, Bhushan S, Suh J, Wang Z, Koo B, Sakamoto N, et al. Large-eddy simulation of ship flows with wall-layer models on Cartesian grids. In: 27th Symposium naval hydrodynamics. Seoul (Korea); 2008.

[2] Gomez R, Greathouse J, Stuart P, Vicker D, Rogers S, Tejnil E. Space shuttle return to flight CFD applications. In: 8th Symposium overset composite grids and solution tech. Houston (TX); 2006.

[3] Lim J, Strawn RC. Prediction of HART II rotor BVI loading and wake system using CFD/CSD loose coupling. In: 45th AIAA aerospace sci meeting and exhibit. Reno (NV); 2007. AIAA paper 2007-1281.

[4] Fischer P. Simulation of coolant flow past 19 wire-wrapped fuel pins. <http://www-unix.mcs.anl.gov/~fischer/pin19>; 2008.

[5] Lignell D, Yoo C, Chen J, Sankaran R, Fahey M. S3D: petascale combustion science, performance, and optimization. Oak Ridge (TN): Oak Ridge National Laboratory; 2007.

[6] Deng GB, Guilmineau E, Queutey P, Visonneau M. Ship flow simulations with the ISIS CFD code. CFDWS05. Tokyo (Japan); 2005.

[7] Cura Hochbaum A, Pierzynski M. Flow simulation for a combatant in head waves. CFDWS05. Tokyo (Japan); 2005.

[8] Luquet R, Jacquin E, Alessandrini B, Guillerm PE. RANSE with free surface computations around fixed and free DTMB 5415 model in still water and waves. CFDWS05. Tokyo (Japan); 2005.

[9] Carrica PM, Wilson R, Stern F. Unsteady RANS simulation of the ship forward speed diffraction problem. Comput Fluids 2006;35:545–70.

[10] Carrica PM, Wilson R, Stern F. An unsteady single-phase level set method for viscous free surface flows. Int J Numer Methods Fluids 2007;53:229–56.

[11] Hino T, editor. CFD workshop Tokyo 2005. National Maritime Research Institute; 2005.

[12] Weymouth G, Wilson RV, Stern F. RANS CFD predictions of pitch and heave ship motions in head seas. J Ship Res 2005;49:80–97.

[13] Klemt M. RANSE simulation of ship seakeeping using overlapping grids. Ship Technol Res 2005;52:65–81.

[14] Cura Hochbaum A, Vogt M. Towards the simulation of seakeeping and manoeuvring based on the computation of free surface viscous flow. In: 24th Symposium naval hydrodynamics. Fukuoka (Japan); 2002.

[15] Orihara H, Miyata H. Evaluation of added resistance in regular incident waves by computational fluid dynamics motion simulation using overlapping grid system. J Mar Sci Technol 2003;8:47–60.

[16] Sato Y, Miyata H, Sato T. CFD simulation of 3-dimensional motion of a ship in waves: application to an advancing ship in regular head waves. J Mar Sci Technol 1999;4:108–16.

[17] Carrica PM, Wilson RV, Noack R, Stern F. Ship motions using single-phase level set with dynamic overset grids. Comput Fluids 2007;36:1415–33.

[18] Xing T, Carrica PM, Stern F. Computational towing tank procedures for single run curves of resistance and propulsion. J Fluids Eng 2008;130(101102):1–14.

[19] Menter FR. Two-equation eddy viscosity turbulence models for engineering applications. AIAA J 1994;32:1598–605.

[20] Travin A, Shur M, Strelets M. Physical and numerical upgrades in the detached eddy simulation of complex turbulent flows. In: Friedrich, Rodi, editors. Advances in LES of complex flows; 2002. p. 239–54.

[21] Xing T, Kandasamy M, Stern F. Unsteady free-surface wave-induced separation: analysis of turbulent structures using detached eddy simulation and single-phase level set. J Turbul 2007;44:1–35.

[22] Ismail F, Carrica PM, Xing T, Stern F. Evaluation of linear and non-linear convection schemes on multidimensional non-orthogonal grids with applications to KVLCC2 tanker. Int J Numer Methods Fluids 2010, in press. doi:10.1002/fld.2174.

[23] Balay S, Buschelman K, Gropp W, Kaushik D, Knepley M, Curfman L, et al. PETSc user manual. ANL-95/11-Revision 2.1.5. Argonne National Laboratory; 2002.

[24] Fossen TI. Guidance and control of ocean vehicles. New York: Wiley; 1994.

[25] Noack R. Suggar: a general capability for moving body overset grid assembly. In: 17th AIAA computational fluid dynamics conference. Toronto, Ontario (Canada); 2005. AIAA paper 2005-5117.

[26] Carrica PM, Huang J, Noack R, Kaushik D, Smith B, Stern F. Large-scale computations of ship motions with dynamic overset curvilinear grids. In: 27th Symposium naval hydrodynamics. Seoul (Korea); 2008.

[27] Carrica PM, Stern F. DES simulations of KVLCC1 in turn and zigzag manoeuvres with moving propeller and rudder. In: SIMMAN 2008. Copenhagen (Denmark); 2008.

[28] Gui L, Longo J, Metcalf B, Shao J, Stern F. Forces, moment and wave pattern for surface combatant in regular head waves, part I: measurement systems and uncertainty analysis. Exp Fluids 2001;31:674–80.

[29] Gui L, Longo J, Metcalf B, Shao J, Stern F. Forces, moment and wave pattern for surface combatant in regular head waves, part II: measurement results and discussion. Exp Fluids 2002;32:27–36.

[30] Longo J, Shao J, Irvine M, Stern F. Phase-averaged PIV for the nominal wake of a surface ship in regular head waves. J Fluids Eng 2007;129:1–17.

[31] Wilson R, Carrica PM, Stern F. Simulation of ship breaking bow waves and induced vortices and scars. Int J Numer Methods Fluids 2007;54:419–51.

[32] Wyatt D, Fu TC, Taylor GL, Xing T, Terrill E. A comparison of full-scale experimental measurements and computational predictions of the transom wave of the R/V Athena I. In: 27th Symposium naval hydrodynamics. Seoul (Korea); 2008.

[33] Irvine M, Longo J, Stern F. Pitch and heave tests and uncertainty assessment for a surface combatant in regular head waves. J Ship Res 2008;52:146–53.

[34] Carrica PM, Ismail F, Hyman M, Bhushan S, Stern F. Turn and zigzag maneuvers of a surface combatant using a URANS approach with dynamic overset grids. In: SIMMAN 2008. Copenhagen (Denmark); 2008.

[35] Sakamoto N, Carrica PM, Stern F. URANS simulations of static and dynamic maneuvering for surface combatant. In: SIMMAN 2008. Copenhagen (Denmark); 2008.

[36] Umeda N, Hashimoto H, Stern F, Nakamura S, Hosseini S, Matsuda A, et al. Comparative study on numerical prediction techniques for parametric roll. In: 27th Symposium naval hydrodynamics. Seoul (Korea); 2008.