8/7/2019 Scientific Computing for Energy
TECHNIQUES OF SCIENTIFIC COMPUTING FOR ENERGY AND THE ENVIRONMENT
TECHNIQUES OF SCIENTIFIC COMPUTING FOR ENERGY
AND THE ENVIRONMENT

FRÉDÉRIC MAGOULÈS AND
RIAD BENELMIR, EDITORS

Nova Science Publishers, Inc., New York
Copyright © 2007 by Nova Science Publishers, Inc.
All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical, photocopying, recording or otherwise, without the written permission of the Publisher.
For permission to use material from this book please contact us:
Telephone 631-231-7269; Fax 631-231-8175
Web Site: http://www.novapublishers.com
NOTICE TO THE READER
The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained in this book. The Publisher shall not be liable for any special, consequential, or exemplary damages resulting, in whole or in part, from the reader's use of, or reliance upon, this material. Any parts of this book based on government reports are so indicated and copyright is claimed for those parts to the extent applicable to compilations of such works.
Independent verification should be sought for any data, advice or recommendations contained in this book. In addition, no responsibility is assumed by the publisher for any injury and/or damage to persons or property arising from any methods, products, instructions, ideas or otherwise contained in this publication.
This publication is designed to provide accurate and authoritative information with regard to the
subject matter covered herein. It is sold with the clear understanding that the Publisher is not
engaged in rendering legal or any other professional services. If legal or any other expert
assistance is required, the services of a competent person should be sought. FROM A
DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE
AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS.
LIBRARY OF CONGRESS CATALOGING-IN-PUBLICATION DATA
Available upon request
ISBN 13: 978-1-60692-823-3
Published by Nova Science Publishers, Inc. New York
CONTENTS

Preface vii
Frédéric Magoulès and Riad Benelmir

Stability Analysis of Abnormal Multiplication of Plankton Considering Flow Velocity 1
T. Yamauchi and M. Kawahara

Achieving High-Performance Computing in Geomechanics by Development of Parallel Finite Element Package 21
F. Okulicka-Duzewska

Large-Scale Data Visualization Using Multi-Language Programming Applied to Environmental Problems 37
F. Magoulès and R. Putanowicz

An Analysis of Flow Around a Propeller Using Fictitious Domain Finite Element Method 69
K. Harada and M. Kawahara

Numerical Simulation of Supersonic Combustion Using Parallel Computing 85
E. von Lavante and M. Kallenberg

Index 99
PREFACE
Frédéric Magoulès and Riad Benelmir
The research and development in Scientific Computing and Computational Science
have considerably increased the power of numerical simulation. Engineers and researchers
are now able to solve large and complex problems which were impossible to solve in the
past. This Special Issue presents some techniques, methods and algorithms for solving
engineering problems arising in energy and environment.
The first article, by T. Yamauchi and M. Kawahara, presents a numerical method for the abnormal multiplication of plankton caused by water pollution. In this study, the basic equation represents the food chain of an ecological model, which consists of phytoplankton, zooplankton, and nutrient. The stability problem, the eigenvalue problem and the parameter identification technique are introduced. Experimental data used in the numerical simulation come from Lake Kasumigaura, located in the Ibaraki Prefecture in Japan. A finite element discretisation is considered for the numerical experiments.
The paper by F. Okulicka-Duzewska analyzes geotechnical problems: embankment raising, dam building and the settlement of the underground. A finite element software is considered for this analysis. Due to the large amount of data used in such models, these environmental problems are difficult to solve with classical finite element methods, and the use of high performance computers is thus mandatory. For this reason, the finite element software is parallelized on a network of PCs, and details of the proposed approach are presented by the author.
The paper written by F. Magoulès and R. Putanowicz describes a technique well suited for the visualization and analysis of large data sets arising in environmental problems. This technique is based on a multi-language programming approach using the Visualization Toolkit (VTK) library and components written in Tcl and C++. VTK is an open-source software system for visualisation, computer graphics and imaging. Though it is possible to write a whole VTK application in a scripting language like Tcl, it is more suitable, for efficiency reasons, to implement some functionality in a compiled language like C or C++. This is especially the case when working with large data sets arising from environmental analysis, as presented here. Pieces of code and detailed examples are provided in order to allow readers to program their own software.
The paper by K. Harada and M. Kawahara describes the analysis of the flow around a propeller. This analysis is related to the minimisation of the turbulence around the propeller, which leads to a loss of energy for the engine. The proposed analysis is based on the fictitious domain method, and a finite element discretisation is performed for the numerical experiments.
Finally, the paper of E. von Lavante and M. Kallenberg discusses the numerical simulation of supersonic combustion using parallel computing. This analysis is related to a better understanding of the transfer of energy involved in such combustion. For this purpose the
unsteady, three-dimensional, supersonic flow in a channel with transverse hydrogen injection is simulated. The time accurate computation was accelerated by an implicit method and implemented on a massively parallel computer. The parallelization is accomplished using domain decomposition on a distributed memory system. The relative efficiency and relative speedup of the parallel algorithm are analyzed for various problem sizes and numbers of processor units.
Naturally, the present issue cannot provide a complete record of the many approaches,
applications, features, and numerical methods related to energy and environment. However,
it does give an indication of the progress that is being made in addressing these issues and
the possibilities that are available for future research in this area.
Frédéric Magoulès
Université Henri Poincaré
Institut Élie Cartan de Nancy, BP 239
54506 Vandoeuvre-lès-Nancy Cedex, France

Riad Benelmir
Université Henri Poincaré
École Sup. Sc. Tech. Ing. de Nancy
2 rue Jean Lamour
54519 Vandoeuvre-lès-Nancy Cedex, France
In: Techniques of Scientific Computing for Energy ...
Editors: F. Magoulès and R. Benelmir, pp. 1-20
ISBN 1-60021-921-7
© 2007 Nova Science Publishers, Inc.
STABILITY ANALYSIS OF ABNORMAL MULTIPLICATION
OF PLANKTON CONSIDERING FLOW VELOCITY

Tomohiro Yamauchi and Mutsuto Kawahara
Department of Civil Engineering, Chuo University,
Kasuga 1-13-27, Bunkyou-ku, Tokyo 112-8551, Japan
Abstract
This paper presents a numerical method for the abnormal multiplication of plankton caused by water pollution. In this study, the basic equation represents the food chain of an ecological model, which consists of phytoplankton, zooplankton, and nutrient. Flow velocity and a time-lag increase are added to the ecological model as a new approach. The velocity of the flow is obtained by using the nonlinear shallow water equation. The stabilized bubble function finite element method is applied to the spatial discretization in the analysis of the nonlinear shallow water flow. In this paper, abnormal multiplication is regarded as an instability problem; therefore, if there is no problem of water quality, the system is stable. The stability of the system is investigated by introducing the eigenvalues of the basic equation, and it can be judged from the eigenvalues based on Lyapunov's stability theory. In this paper, the Arnoldi-QR method is used to obtain the eigenvalues and eigenvectors of the system. Lake Kasumigaura, located in the Ibaraki Prefecture in Japan, is selected, and actual data from 1991 are used in order to estimate the plankton phenomenon at the lake. Modal analysis is employed to make the initial distribution at Lake Kasumigaura. Finally, the change of the distribution of plankton patchiness for various time stages and the equilibrium solution are obtained.
Keywords: plankton, shallow water equation, ecological model, stability, parameter identification
1. Introduction
Environmental problems have recently become serious. Water pollution causes a large amount of nutrient, because industrial waste water streaming into rivers, lakes and seas contains a lot of nutrient, namely nitrogen and phosphorus. Phytoplankton feeds on the nutrient; as a result, a large amount of nutrient causes an abnormal multiplication of the plankton. Fish and shellfish die by suffocation, because the plankton consumes a large quantity of oxygen. Therefore, the abnormal multiplication of the plankton has seriously damaged the fishing industry. In fact, the income of fishing villages dramatically
E-mail address: [email protected]
decreased due to heavy damage of more than a billion yen. This abnormal multiplication of the plankton is called red tide or blue-green algae 1),2),3),4).
Prediction of the abnormal multiplication of plankton using numerical analysis leads to its prevention. In recent studies, abnormal multiplication is regarded as an instability problem 5),6). The purpose of this study is to prevent the abnormal multiplication of plankton by investigating the stability of the system considering the flow velocity. To obtain the initial spatial distribution of plankton at Lake Kasumigaura, modal analysis is applied; this spatial distribution is used as the initial data.
2. Basic Equation
2.1. Nonlinear Shallow Water Equation
The two-dimensional nonlinear shallow water equation is used to calculate the water flow, which is written as follows:
\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} + g\frac{\partial \zeta}{\partial x} - \nu\left[ 2\frac{\partial^2 u}{\partial x^2} + \left(\frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 v}{\partial x \partial y}\right)\right] + f u = 0,   (1)

\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} + g\frac{\partial \zeta}{\partial y} - \nu\left[ \left(\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 u}{\partial x \partial y}\right) + 2\frac{\partial^2 v}{\partial y^2}\right] + f v = 0,   (2)

\frac{\partial \zeta}{\partial t} + u\frac{\partial \zeta}{\partial x} + v\frac{\partial \zeta}{\partial y} + (h + \zeta)\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) = 0,   (3)

where u and v are the velocity components, \zeta is the water elevation, h is the water depth, g is the gravitational acceleration, \nu is the kinematic viscosity and f is the Coriolis parameter.
The boundary conditions can be expressed as

u = \hat{u} \quad \text{on } \Gamma_d,   (4)

v = \hat{v} \quad \text{on } \Gamma_d,   (5)

\zeta = \hat{\zeta} \quad \text{on } \Gamma_d,   (6)

u_n = u\,n_x = \hat{u}_n \quad \text{on } \Gamma_n,   (7)

v_n = v\,n_y = \hat{v}_n \quad \text{on } \Gamma_n.   (8)
The stabilized bubble element is used for the discretization by the finite element method 7). The bubble function is capable of eliminating the barycenter point by using static condensation. The discretized form derived from the bubble function element is equivalent to that obtained from the SUPG 9). Therefore, the stabilized parameter derived from the bubble function element is expressed as follows for the momentum equation of the shallow water flow:

\tau_{eB} = \frac{\langle \phi_e, 1 \rangle_{\Omega_e}^2}{A_e} \left[ \frac{1}{\Delta t}\|\phi_e\|_{\Omega_e}^2 + \frac{1}{2}(\nu + \nu')\|\phi_{e,j}\|_{\Omega_e}^2 + f\|\phi_e\|_{\Omega_e}^2 \right]^{-1},   (9)

and for the continuity equation:
\tau'_{eB} = \frac{\langle \phi_e, 1 \rangle_{\Omega_e}^2}{A_e} \left[ \frac{1}{\Delta t}\|\phi_e\|_{\Omega_e}^2 + \frac{1}{2}(\nu + \nu')\|\phi_{e,j}\|_{\Omega_e}^2 \right]^{-1},   (10)

where \nu' is the stabilized control parameter. From the criteria for the stabilized parameter corresponding to the SUPG, an optimal parameter can be given as follows for the momentum equation of the shallow water flow:

\tau_{eB} = \left( \frac{1}{2}\tau_{es}^{-1} + \frac{1}{\Delta t} \right)^{-1},   (11)

\tau_{es}^{-1} = \left[ \left( \frac{2|U_i|}{h_e} \right)^2 + \left( \frac{4\nu}{h_e^2} \right)^2 \right]^{1/2},   (12)

and for the continuity equation:

\tau'_{eB} = \left( \frac{1}{2}\tau_{es}^{-1} + \frac{1}{\Delta t} \right)^{-1},   (13)

\tau_{es}^{-1} = \frac{2|U_i|}{h_e},   (14)

where

\nu' = \frac{A_e \|\phi_e\|_{\Omega_e}^2}{\langle \phi_e, 1 \rangle_{\Omega_e}^2},   (15)

h_e = \sqrt{2 A_e},   (16)

|U_i| = \sqrt{u^2 + v^2 + g h},   (17)

and \Omega_e is the element domain with \langle u, v \rangle_{\Omega_e} = \int_{\Omega_e} u v \, d\Omega.
2.2. Ecological Model
In this study, a simple mathematical model is employed, which is suggested in 1). There
are many parameters in these equations. The ecological model is shown in Figure 1.
Pt
= D1x 2
Px2
+D1y 2
Py2
uPx
vPy
+ f(P,Z,N), (18)
Z
t= D2x
2Z
x2+D2y
2Z
y2u
Z
xv
Z
y+ g(P,Z,N), (19)
Figure 1. Ecological System
\frac{\partial N}{\partial t} = D_{3x}\frac{\partial^2 N}{\partial x^2} + D_{3y}\frac{\partial^2 N}{\partial y^2} - u\frac{\partial N}{\partial x} - v\frac{\partial N}{\partial y} + h(P,Z,N),   (20)

where P is phytoplankton, Z is zooplankton and N is nutrient, P, Z and N denoting the concentration of each component. In these equations, D_{1x}, D_{1y}, D_{2x}, D_{2y}, D_{3x} and D_{3y} are the non-dimensional diffusion coefficients of P, Z and N, respectively. The terms f(P,Z,N), g(P,Z,N) and h(P,Z,N) are the biological reaction terms, which are expressed as follows:

f(P,Z,N) = \frac{NP}{\kappa + N} - R_m Z\left[1 - \exp\{-\Lambda(P - P^*)\}\right] - \gamma_p P,   (21)

g(P,Z,N) = R_m Z\left[1 - \exp\{-\Lambda(P - P^*)\}\right] - \gamma Z^2\left[1 - \exp\{-\Lambda(P - P^*)\}\right],   (22)

h(P,Z,N) = -\frac{NP}{\kappa + N} + \gamma_p P + \gamma Z^2\left[1 - \exp\{-\Lambda(P - P^*)\}\right],   (23)

where the positive Z^2 term represents the increase due to the time-lag, and the parameters are those of Table 2.
3. Stability Problem
3.1. Lyapunovs Stability Theory
The stability analysis is based on Lyapunov's stability theory. In this theory, equilibrium points and perturbations are considered to investigate the stability of the system: an equilibrium is a steady state, and a perturbation is a microscopic oscillation around it. If the system is completely stable, the oscillation settles down as time goes by; if the system is unstable, the oscillation grows without limit as time passes. In this study, eigenvalues are employed to decide the stability of the system. The judging criteria for the stability are described in Table 1.
8/7/2019 Scientific Computing for Energy
15/112
Stability Analysis of Abnormal Multiplication... 5
Table 1. Judging Criteria

Eigenvalue λ    System
λ < 0           Completely stable
λ = 0           Neutral
λ > 0           Unstable
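The criterion in Table 1 is straightforward to exercise numerically. The sketch below is an illustration only, not taken from the chapter: it builds a small hypothetical linearized system dΦ/dt = FΦ and classifies it by the sign of the largest real part of the eigenvalues of F.

```python
import numpy as np

# Hypothetical linearized system d(Phi)/dt = F Phi (values are made up).
F = np.array([[-1.0, 0.5],
              [0.2, -0.3]])

max_re = max(np.linalg.eigvals(F).real)  # largest real part of the spectrum

# Classification following Table 1.
if max_re < 0:
    verdict = "completely stable"
elif max_re == 0:
    verdict = "neutral"
else:
    verdict = "unstable"
```

For this particular F both eigenvalues have negative real part, so the perturbation decays and the system is judged completely stable.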
3.2. Linearization
In order to obtain the eigenvalues of the system, the basic equations are linearized following Lyapunov's stability theory. First, the equilibrium point is pursued. The following procedure is employed to linearize the equations.

1: Consider the solution around the equilibrium points as

\bar{P} + P', \quad \bar{Z} + Z', \quad \bar{N} + N',

where \bar{P}, \bar{Z} and \bar{N} are the equilibrium values of each component, determined when the change of value in the incremental method is less than 1.0 \times 10^{-5}, and P', Z', N' are the perturbations. Substituting \bar{P} + P', \bar{Z} + Z' and \bar{N} + N' into eqs. (18)-(20):
\frac{\partial(\bar{P} + P')}{\partial t} = D_1\nabla^2(\bar{P} + P') + f(\bar{P} + P', \bar{Z} + Z', \bar{N} + N'),   (24)

\frac{\partial(\bar{Z} + Z')}{\partial t} = D_2\nabla^2(\bar{Z} + Z') + g(\bar{P} + P', \bar{Z} + Z', \bar{N} + N'),   (25)

\frac{\partial(\bar{N} + N')}{\partial t} = D_3\nabla^2(\bar{N} + N') + h(\bar{P} + P', \bar{Z} + Z', \bar{N} + N').   (26)
2: Employing a Taylor expansion and omitting terms of higher than first order, the linearized form of eqs. (24)-(26) is obtained as

\dot{\Phi} = F\Phi,   (27)

where

\Phi = \begin{pmatrix} P' \\ Z' \\ N' \end{pmatrix},
\qquad
F = \begin{pmatrix}
D_1\nabla^2 + \frac{\partial f}{\partial P} & \frac{\partial f}{\partial Z} & \frac{\partial f}{\partial N} \\
\frac{\partial g}{\partial P} & D_2\nabla^2 + \frac{\partial g}{\partial Z} & \frac{\partial g}{\partial N} \\
\frac{\partial h}{\partial P} & \frac{\partial h}{\partial Z} & D_3\nabla^2 + \frac{\partial h}{\partial N}
\end{pmatrix},

in which \frac{\partial f}{\partial P} denotes the derivative with respect to P evaluated at the equilibrium solution.
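In practice the entries of F can be checked by finite differences. The sketch below is illustrative only: the reaction terms are simple placeholders, not the chapter's f, g and h, and the evaluation point is arbitrary rather than a true equilibrium.

```python
import numpy as np

def reactions(s):
    """Placeholder reaction terms; not the chapter's model."""
    P, Z, N = s
    return np.array([N * P - 0.5 * Z * P,       # stand-in for f(P, Z, N)
                     0.5 * Z * P - 0.1 * Z**2,  # stand-in for g(P, Z, N)
                     -N * P + 0.1 * Z**2])      # stand-in for h(P, Z, N)

def reaction_jacobian(s, eps=1e-6):
    """Central finite-difference approximation of the 3x3 reaction Jacobian."""
    J = np.zeros((3, 3))
    for j in range(3):
        d = np.zeros(3)
        d[j] = eps
        J[:, j] = (reactions(s + d) - reactions(s - d)) / (2.0 * eps)
    return J

Fjac = reaction_jacobian(np.array([1.0, 1.0, 0.5]))  # evaluated at a chosen point
```

Adding the diffusion blocks D_i∇² to the diagonal of such a Jacobian reproduces the structure of F above.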
3.3. Discretization by FEM
The following perturbations are substituted:

P' = \tilde{P} e^{\lambda t},   (28)

Z' = \tilde{Z} e^{\lambda t},   (29)

N' = \tilde{N} e^{\lambda t}.   (30)

Using eqs. (28)-(30) and discretizing eq. (27), the following generalized eigenvalue problem is obtained:

\lambda [M]\Phi = [H]\Phi,   (31)

where

M = \begin{pmatrix} \bar{M} & & \\ & \bar{M} & \\ & & \bar{M} \end{pmatrix},
\qquad
H = \begin{pmatrix}
-D_1 S + F_P & F_Z & F_N \\
G_P & -D_2 S + G_Z & G_N \\
H_P & H_Z & -D_3 S + H_N
\end{pmatrix},

\bar{M} = \int_V \Phi_\alpha \Phi_\beta \, dV, \qquad S = \int_V \Phi_{\alpha,i} \Phi_{\beta,i} \, dV.   (32)
4. Eigenvalue Problem
4.1. Arnoldis Method
To obtain the eigenvalues of the system, the Arnoldi method is applied in this research. This method makes it possible to reduce both the memory requirement and the computation time. The algorithm for the standard eigenvalue problem (Cu = \lambda u) is as follows.

1: Start: choose an initial vector v_1 of unit norm, and a number of steps m.
2: Iterate: for j = 1, 2, ..., m do:

\hat{v}_{j+1} = C v_j - \sum_{i=1}^{j} h_{ij} v_i,   (33)

where

h_{ij} = (C v_j, v_i), \quad i = 1, ..., j,   (34)

h_{j+1,j} = \|\hat{v}_{j+1}\|_2,   (35)

v_{j+1} = \hat{v}_{j+1} / h_{j+1,j}.   (36)

This algorithm produces an orthonormal basis V_m = [v_1, v_2, ..., v_m] of the Krylov subspace K_m = span\{v_1, Cv_1, ..., C^{m-1}v_1\}. In this basis the restriction of C to K_m is represented by the upper Hessenberg matrix H_m whose entries are the h_{ij} produced by the algorithm, i.e.,

H_m = \{h_{ij}\}.   (37)

The eigenvalues of C are approximated by those of H_m, where

H_m = V_m^T C V_m.   (38)
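Steps (33)-(36) translate almost line by line into code. The sketch below is a minimal dense-matrix version of the Arnoldi iteration; the test matrix here is small and hypothetical, whereas in the chapter C is a large FEM operator.

```python
import numpy as np

def arnoldi(C, v1, m):
    """Build the orthonormal basis V_m and Hessenberg matrix H_m of eqs. (33)-(36)."""
    n = C.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = C @ V[:, j]
        for i in range(j + 1):
            H[i, j] = w @ V[:, i]            # h_ij = (C v_j, v_i)
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)      # h_{j+1,j}
        if H[j + 1, j] < 1e-12:              # invariant subspace found: stop early
            return V[:, :j + 1], H[:j + 1, :j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V[:, :m], H[:m, :m]

rng = np.random.default_rng(0)
C = np.diag([5.0, 2.0, 1.0, 0.5]) + 0.01 * rng.standard_normal((4, 4))
V, Hm = arnoldi(C, rng.standard_normal(4), 4)
# With m equal to the dimension, the eigenvalues of H_m reproduce those of C.
```

For m much smaller than the dimension, the eigenvalues of H_m approximate the extremal eigenvalues of C, which is what makes the method economical for large systems.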
4.2. Application for Generalized Eigenvalue Problem
If one wishes to find the leading eigenvalue with maximum real part, it is common to use the shift-and-invert strategy. If \sigma_0 is an approximation to an eigenvalue of interest, then the shifted and inverted problem is

(C - \sigma_0 I)^{-1} u = \mu u,   (39)

where \mu = 1/(\lambda - \sigma_0). Thus eigenvalues of C close to \sigma_0 correspond to eigenvalues of eq. (39) with large absolute value, and one expects the Arnoldi method to converge to such eigenvalues. In order to apply the Arnoldi method to eq. (39) for the generalized eigenvalue problem eq. (31), eq. (39) may be written as

(H - \sigma_0 M)^{-1} M u = \mu u,   (40)

and to apply the Arnoldi method, the LU decomposition of H - \sigma_0 M is performed once; then, each time (H - \sigma_0 M)^{-1} M v is needed, we solve (H - \sigma_0 M) w = M v by forward and backward substitution. This is much more economical than forming the matrix of eq. (40) explicitly, since that matrix is usually full and its dimension is much larger than M.
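As an illustration of eq. (40), the sketch below applies the factor-once strategy to a small hypothetical pair (H, M): (H − σ0·M) is LU-factorized a single time, and every iteration performs only the forward/backward solves. A plain power iteration replaces the full Arnoldi loop here for brevity.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(1)
n = 6
M = np.eye(n)                            # mass matrix (identity for simplicity)
H = np.diag(np.linspace(-2.0, 3.0, n))   # stand-in operator with spectrum -2..3
sigma0 = 2.9                             # shift near the eigenvalue of interest

lu, piv = lu_factor(H - sigma0 * M)      # LU decomposition, performed once
u = rng.standard_normal(n)
for _ in range(50):
    w = lu_solve((lu, piv), M @ u)       # solve (H - sigma0 M) w = M u
    u = w / np.linalg.norm(w)

lam = (u @ H @ u) / (u @ M @ u)          # Rayleigh quotient: eigenvalue near sigma0
```

The iteration converges to the eigenvalue closest to the shift (here 3.0), since that eigenvalue has the largest |μ| in eq. (39).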
5. Parameter Identification
5.1. Performance Function
When a parameter in the equation is changed, the stability of the system changes. To obtain the parameter value for which the system is stable, the parameter identification technique is applied. This technique amounts to minimization of the performance function J, defined as half the squared residual between the calculated and objective eigenvalues:

J = \frac{1}{2}\int_V (\lambda(k) - \hat{\lambda})^T (\lambda(k) - \hat{\lambda}) \, dV,   (41)

where \hat{\lambda} is the objective eigenvalue and \lambda(k) is the eigenvalue of the system. In a word, the optimal parameter value is the one that minimizes the performance function J when the parameter identification technique is applied.
5.2. Algorithm
In this research, the conjugate gradient method is employed to minimize the performance function J. The algorithm of the parameter identification technique is as follows:

1. Assume an initial parameter value k^(0) and decide the convergence criterion ε
2. Calculate the state value λ(k)^(0)
3. Calculate the performance function J^(0)
4. Calculate the sensitivity matrix [∂λ(k)/∂k]^(0)
5. Calculate the initial search direction d^(0) = -[∂J/∂k]^(0)
6. Calculate the step size α so as to minimize J(k^(i) + α d^(i))
7. Renew the parameter k^(i+1) = k^(i) + α d^(i)
8. Calculate the state value λ(k)^(i+1)
9. Calculate the performance function J^(i+1)
10. Calculate the sensitivity matrix [∂λ(k)/∂k]^(i+1)
11. Calculate β = ([∂J/∂k]^(i+1))^T [∂J/∂k]^(i+1) / ([∂J/∂k]^(i))^T [∂J/∂k]^(i)
12. Calculate the new search direction d^(i+1) = -[∂J/∂k]^(i+1) + β d^(i)
13. If |J^(i+1) - J^(i)| < ε, then stop
14. Set i = i + 1 and go to 6
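A compact version of steps 1-14 can be written as follows. The performance function and its gradient here are toy quadratic surrogates (the real J of eq. (41) involves an eigenvalue solve), the line search in step 6 is a crude grid scan, and β is the Fletcher-Reeves ratio of step 11.

```python
import numpy as np

target = np.array([1.0, -0.5])           # hypothetical "objective" parameters

def J(k):                                # toy surrogate for eq. (41)
    return 0.5 * np.sum((k - target) ** 2)

def gradJ(k):                            # its gradient [dJ/dk]
    return k - target

k = np.zeros(2)                          # step 1: initial parameter value
g = gradJ(k)
d = -g                                   # step 5: initial search direction
for _ in range(20):
    # step 6: crude grid line search for the step size alpha
    alphas = np.linspace(0.0, 2.0, 41)
    alpha = alphas[np.argmin([J(k + a * d) for a in alphas])]
    k = k + alpha * d                    # step 7: renew the parameter
    g_new = gradJ(k)
    beta = (g_new @ g_new) / (g @ g)     # step 11: Fletcher-Reeves beta
    d = -g_new + beta * d                # step 12: new search direction
    g = g_new
    if J(k) < 1e-12:                     # step 13: convergence check
        break
```

In the chapter the gradient ∂J/∂k is assembled from the sensitivity matrix of Section 5.3 rather than computed analytically as here.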
5.3. Sensitivity Matrix
In order to compute the sensitivity matrix, the left eigenvalue problem has to be used in this study. The right and left eigenvalue problems are

\lambda M \Phi = H \Phi,   (42)

\lambda M^T \Psi = H^T \Psi,   (43)

where M^T (resp. H^T) is the transposed matrix of M (resp. H). The eigenvectors of eqs. (42) and (43) are not the same, but the eigenvalues are the same. In this study, the eigenvalue with maximum real part is investigated, and the real and imaginary parts of the eigenvectors are employed to compute the sensitivity matrix. The real part of the sensitivity is obtained as follows:

\frac{\partial(\mathrm{Re}\,\lambda)}{\partial k} = \frac{AC + BD}{A^2 + B^2},   (44)

where

A = \Psi_{re}^T M \Phi_{re} - \Psi_{im}^T M \Phi_{im},   (45)

B = \Psi_{re}^T M \Phi_{im} + \Psi_{im}^T M \Phi_{re},   (46)

C = \Psi_{re}^T \frac{\partial H}{\partial k} \Phi_{re} - \Psi_{im}^T \frac{\partial H}{\partial k} \Phi_{im},   (47)

D = \Psi_{im}^T \frac{\partial H}{\partial k} \Phi_{re} + \Psi_{re}^T \frac{\partial H}{\partial k} \Phi_{im},   (48)

in which \Phi_{re} and \Phi_{im} (\Psi_{re} and \Psi_{im}) are the real and imaginary parts of the right (left) eigenvectors.

Calculating this matrix, the parameter identification technique can be applied and the optimal parameter value which makes the system stable can be obtained.
6. Initial Distribution of Plankton
6.1. Modal Analysis
To represent the whole distribution of the plankton as the initial distribution, the concept of modal analysis is utilized, as shown in Figure 2. If the eigenvalue of the linear Laplacian in the area V is denoted as \omega^2, the spectrum of the linear Laplacian is determined by the Helmholtz equation:

Figure 2. Modal analysis

\nabla^2 \phi + \omega^2 \phi = 0,   (49)

where

\nabla^2 \equiv \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}.   (50)

In eq. (49), \phi represents the basic mode of phytoplankton, zooplankton and nutrient. The boundary condition is

\frac{\partial \phi}{\partial n} = 0.   (51)
6.2. Eigenvalue Problem by FEM
To obtain the eigenvalues \omega^2 and the eigenvectors \phi, the finite element method is employed. The Galerkin method is used for the spatial discretization of eq. (49):

(S - \omega^2 M)\phi = 0,   (52)

where

S = \int_V \Phi_{\alpha,i} \Phi_{\beta,i} \, dV,   (53)

M = \int_V \Phi_\alpha \Phi_\beta \, dV.   (54)
Eq. (52) is dealt with as a generalized eigenvalue problem. The Householder-QR method is employed to find the eigenvalues \omega^2; however, this method cannot be applied directly to the generalized eigenvalue problem, so the problem is transformed into a standard one. The matrix M is symmetric, therefore it can be factorized by the Cholesky method:

M = L^T L.   (55)

Substituting eq. (55) into eq. (52):

S\phi - \omega^2 L^T L \phi = 0,   (56)

and replacing the eigenvector by

z = L\phi,   (57)

it is obtained that

S L^{-1} z = \omega^2 L^T z,   (58)

\omega^2 z = L^{-T} S L^{-1} z.   (59)

Defining

A = L^{-T} S L^{-1},   (60)

and substituting eq. (60) into eq. (59), the following standard eigenvalue problem is derived:

\omega^2 z = A z.   (61)

To obtain the eigenvalues \omega^2 and the eigenvectors z of eq. (61), the Householder-QR method and the inverse iteration method are employed. Since \phi equals L^{-1} z, the eigenvector \phi is found by backward substitution 8).
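The reduction (55)-(61) can be checked numerically. The sketch below uses small random symmetric positive-definite stand-ins for S and M; note that NumPy's Cholesky routine returns M = L L^T (lower-triangular L), so the roles of L and L^T are swapped relative to eq. (55), with z = L^T φ.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
B = rng.standard_normal((n, n))
S = B @ B.T + n * np.eye(n)        # symmetric "stiffness" stand-in
Cc = rng.standard_normal((n, n))
M = Cc @ Cc.T + n * np.eye(n)      # symmetric positive-definite "mass" stand-in

L = np.linalg.cholesky(M)          # M = L L^T in NumPy's convention
Linv = np.linalg.inv(L)
A = Linv @ S @ Linv.T              # standard-form matrix, cf. eq. (60)
w2, z = np.linalg.eigh(A)          # omega^2 and eigenvectors z
phi = np.linalg.solve(L.T, z)      # back substitution recovers phi from z
```

Because A is symmetric, `eigh` can be used in place of the Householder-QR plus inverse-iteration pair named in the text; the recovered pairs (ω², φ) satisfy S φ = ω² M φ.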
6.3. Superposition of Spectra
6.3.1. Performance Function
The two-dimensional distribution is calculated by the superposition of the eigenmodes and the observation data: the obtained spectra, which represent the state of the spatial density, are superposed. This is called the modal analysis method. The performance function is defined as

J = \frac{1}{2}\int_V (u - \hat{u})^2 \, dV,   (62)

where

u = \sum_{i=1}^{n} c_i u_i.   (63)

The coefficients c_1, ..., c_n are determined so as to minimize J, where

J = \frac{1}{2}\int_V (u - \hat{u})^2 \, dV,   (64)
J = \frac{1}{2}\sum_{j=1}^{m_x} (u_j - \hat{u}_j)^2,   (65)

J = \frac{1}{2}\sum_{j=1}^{m_x} (u_j^2 - 2 u_j \hat{u}_j + \hat{u}_j^2),   (66)

J = \frac{1}{2}\sum_{j=1}^{m_x} \left[ \left(\sum_{i=1}^{n} u_{ij} c_i\right)^2 - 2\left(\sum_{i=1}^{n} u_{ij} c_i\right)\hat{u}_j + \hat{u}_j^2 \right],   (67)

\frac{\partial J}{\partial c_l} = \frac{1}{2}\sum_{j=1}^{m_x} \left[ 2\left(\sum_{i=1}^{n} u_{ij} c_i\right) u_{lj} - 2 u_{lj} \hat{u}_j \right],   (68)

= \sum_{j=1}^{m_x} \left[ \left(\sum_{i=1}^{n} u_{ij} c_i\right) u_{lj} - u_{lj} \hat{u}_j \right],   (69)

= \sum_{j=1}^{m_x} \left[ u_{lj} \left(\sum_{i=1}^{n} u_{ij} c_i - \hat{u}_j\right) \right],   (70)

(l = 1, 2, ..., n).
6.3.2. Minimization Method
The spatial distribution is made by the superposition of the eigenmodes, weighted by the unknown constants c_i. The conjugate gradient method is applied to the above equations to obtain the c_i.
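The chapter minimizes J with the conjugate gradient method; equivalently, setting eq. (70) to zero gives the linear normal equations (Σ_j u_{lj} u_{ij}) c_i = Σ_j u_{lj} û_j, which the sketch below solves directly on mock data. The "eigenmodes" here are hypothetical cosine modes, not the Lake Kasumigaura spectra.

```python
import numpy as np

n_modes, mx = 4, 50
x = np.linspace(0.0, 1.0, mx)                                  # observation points
U = np.array([np.cos(i * np.pi * x) for i in range(n_modes)])  # mock modes u_ij

c_true = np.array([0.3, -1.2, 0.7, 0.1])
u_hat = c_true @ U                           # synthetic "observation" u_hat_j

G = U @ U.T                                  # Gram matrix  sum_j u_lj u_ij
b = U @ u_hat                                # right-hand side sum_j u_lj u_hat_j
c = np.linalg.solve(G, b)                    # coefficients minimizing J
```

With exact synthetic data the solve recovers the generating coefficients; with real observations the result is the least-squares fit that the conjugate gradient iteration would also converge to.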
7. Numerical Example
7.1. Verification of Ecological Model
7.1.1. Case 1
Figure 3 shows the finite element mesh. The total numbers of elements and nodes are 400 and 303, respectively.
Figure 3. Used Mesh
Assuming sufficient nutrient, the change of phytoplankton, zooplankton and nutrient is computed and represented in Figures 4-7. The vertical axis is the normalized fraction of the nutrient N_t; the horizontal axis is the scaled distance of the plankton patchiness.
Figure 4 shows the initial condition. In Figure 5, phytoplankton increases as the nutrient decreases, and all patchiness is moved by the velocity.
In Figure 6, zooplankton increases as phytoplankton decreases, because zooplankton feeds on phytoplankton; the nutrient, in turn, is consumed by the phytoplankton. No patchiness moves outside through the boundary. In Figures 5 and 6, all patchiness is moved by velocity and diffusion. A small amount of the nutrient remains, mainly due to the extinction of zooplankton. In Figure 7, the equilibrium solution is obtained; the equilibrium solution means the balanced value of the ecological model.
Figure 4. t = 0.00 (normalized fraction of N_t vs. scaled distance for phytoplankton, zooplankton and nutrient)
Figure 5. t = 3.00 (normalized fraction of N_t vs. scaled distance for phytoplankton, zooplankton and nutrient)
Figure 6. t = 9.00 (normalized fraction of N_t vs. scaled distance for phytoplankton, zooplankton and nutrient)
Figure 7. t = 50.00 (normalized fraction of N_t vs. scaled distance for phytoplankton, zooplankton and nutrient)
7.1.2. Case 2
Figure 8. Used Mesh
The mesh shown in Figure 8 is used in case 2. The total number of elements is 3600 and that of nodes is 1861.
Figure 9 shows the initial condition of the nutrient; phytoplankton and zooplankton are constant. Figure 10 confirms that the nutrient diffuses and is transported by the flow velocity.
7.2. Analysis in Lake Kasumigaura
In this research, Lake Kasumigaura is chosen as the analysis field. With an area of 220 square kilometers, this lake is the second largest in Japan.
There have been water quality problems in this lake, and the damage has been very serious. One of the famous problems is the outbreak of Microcystis aeruginosa, a kind of phytoplankton which, like red tide, arises through eutrophication. The location of Lake Kasumigaura is shown in Figure 11, and the mesh used is shown in Figure 12. The total number of elements is 1409 and that of nodes is 804.
Figure 9. t = 0.00 (nutrient N contours)
The parameter values of the system are described in Table 2.

Table 2. Biological Parameters

Symbol   Definition                          Value
κ        Michaelis constant                  0.1
R_m      Zooplankton maximum grazing rate    1.2
γ        Zooplankton egestion coefficient    2.31
Λ        Ivlev constant                      1.0
γ_p      Phytoplankton loss coefficient      0.083
P*       Zooplankton grazing threshold       0.15
7.2.1. Initial Distribution
Figures 13 and 14 show the results of the modal analysis. The convergence of the performance function J is shown in Figure 13. Figure 14 gives the control value c_i of each spectrum; from it, it is confirmed which modes have a significant influence on the components in Lake Kasumigaura.
Figure 15 represents the initial distribution of phytoplankton. The plankton patchiness is changed by predation and the flow velocity on Lake Kasumigaura. In this study, the values in Figure 16 are regarded as the equilibrium solution, which means the balanced value of the ecological system.
7.2.2. Stability Analysis
In this study, the outbreak of plankton is related to the stability problem through the eigenvalues: if the maximum eigenvalue is negative, the system is stable; if it is positive, the system is unstable. Figure 17 shows the stability of the system. Changing the parameter α, the system shifts from stable to unstable. The parameter value of α which
Figure 10. t = 10.00 (nutrient N contours)
Figure 11. Location of Lake Kasumigaura
Figure 12. Used mesh
Figure 13. Performance function of phytoplankton (J vs. iteration)
Figure 14. Control value of phytoplankton (control value vs. mode number)
Figure 15. t = 0.00 (phytoplankton P contours)
Figure 16. t = 6.00 (phytoplankton P contours)
Figure 17. Stability of the system (maximum eigenvalue vs. parameter α)
Figure 18. Performance function (J vs. iteration)
Figure 19. Maximum eigenvalue (vs. iteration)
Figure 20. Parameter α (vs. iteration)
changes the stability is estimated to be about 9.45. From this result, when the parameter α is small, abnormal multiplication occurs; a small α means that phytoplankton absorbs a large amount of nutrient. Figure 18 shows the convergence of the performance function, Figure 19 that of the real part of the maximum eigenvalue, and Figure 20 the convergence of α. In this case, the convergent value of the performance function J is 1.0 × 10^-5. From the result, when the maximum eigenvalue is -0.32, the parameter α is 1.1. Therefore, 1.1 is the value of the parameter α which makes the system stable.
8. Conclusion
In this study, the two-dimensional spatial distribution is made by the superposition of the spectra of mode numbers 1 to 30. From Figure 14, mode No. 1 has a significant influence on the spatial distribution. The relation of the food chain is represented and applied to the phenomena in Lake Kasumigaura. The nonlinear shallow water equation is employed to represent the influence of the flow velocity. The computed result is used as the initial distribution for the stability analysis.
The main purpose of this study is to judge the stability of the system from the real part of the maximum eigenvalue. Figure 17 is the result of the forward analysis; Figures 18, 19 and 20 are the results of the backward analysis of the parameter α. When the parameter α is small, phytoplankton absorbs a large amount of nutrient. When the parameter α is 1.1, the system is stable.
References

[1] J.S. Wroblewski and J.J. O'Brien (1976), A Spatial Model of Phytoplankton Patchiness, Marine Biology 35, 161-172.
[2] J.S. Wroblewski (1977), A Model of Phytoplankton Plume Formation During Variable Oregon Upwelling, Journal of Marine Research, 358-394.
[3] N.F. Britton (1999), Reaction-Diffusion Equations and Their Application to Biology, Academic Press, pp. 109-137.
[4] Peter J.S. Franks and Changsheng Chen (1996), Plankton production in tidal fronts: A model of Georges Bank in summer, Journal of Marine Research 54, 631-651.
[5] G. Ono and M. Kawahara (2004), Stability Analysis of Multiplication of Plankton Using Parameter Identification Technique, Int. J. Num. Meth. Fluids 44, Part 1, p. 71.
[6] Y. Ding and M. Kawahara (1998), Bifurcation Analysis of Brown Tide in Tidal Flow Using Finite Element Method, Oceanographic Literature Review, p. 502.
[7] J. Matsumoto, T. Umetsu and M. Kawahara (1998), Shallow Water and Sediment Transport Analysis by Implicit FEM, Journal of Applied Mechanics 3, 263-274.
[8] N. Nayar and J.M. Ortega (1993), Computation of Selected Eigenvalues of Generalized Eigenvalue Problems, Journal of Computational Physics 108, 8-14.
[9] Hughes, T.J.R., Franca, L.P. and Balestra, M. (1986), A New Finite Element Formulation for Computational Fluid Dynamics: V, Comput. Meth. Appl. Mech. Eng. 59, 85-99.
8/7/2019 Scientific Computing for Energy
31/112
In: Techniques of Scientific Computing for Energy ...
Editors: F. Magoulès and R. Benelmir, pp. 21-35
ISBN 1-60021-921-7
© 2007 Nova Science Publishers, Inc.
ACHIEVING HIGH-PERFORMANCE COMPUTING
IN GEOMECHANICS BY DEVELOPMENT
OF PARALLEL FINITE ELEMENT PACKAGE
Felicja Okulicka-Dłużewska
Faculty of Mathematics and Information Science
Warsaw University of Technology, Pl. Politechniki 1,
00-661 Warsaw, POLAND
Abstract
The parallelization of the finite element method (FEM) algorithm is considered. The parallel versions of the FEM package are developed on the basis of the sequential one. The elasto-plastic behavior of geotechnical constructions can be modelled, calculated and analyzed with the package. Cray Fortran compiler directives are applied for the parallelization of the source code for shared memory machines, and the MPI library for the distributed environment. As engineering examples of geotechnical problems, the rising of an embankment, the building of a dam and the settlement of the underground are modelled, calculated and analyzed.
1. Introduction
The finite element method (FEM) is the most general and powerful tool for solving engineering problems. The FEM algorithm, well known and widely used in practice, is very well suited to parallelization. Thanks to the parallelization of the code, high-performance calculations can be done, very large structures can be modelled and a considerable speed-up is reached. For smaller engineering problems it would be desirable to have the result as fast as the mouse pointer moves to a new pixel. Considering finite element modelling, one of the essential problems which we face in software development is the parallelization of existing sequential codes, which very often have been developed over many years. The question arises whether it is worth doing and how great an effort is required. The advantages and difficulties of the parallelization of a finite element method package are presented and discussed in this paper. The FEM package Hydro-Geo, oriented at hydro- and geotechnical problems, is presented in Section 2. The program was developed at Warsaw University of Technology and later extended to allow parallel calculations. The sequential version of the program is the starting point for developing the parallel versions step by step. The package is composed of three main programs: the preprocessor for mesh generation and data preparation, the processor for the main mechanical calculation, and the graphical postprocessor. In this paper two parallel versions of the processor are compared: the first working on shared memory machines
E-mail address: [email protected]
and the second in a distributed environment. In Section 3 the numerical procedure implemented in the package is recalled after [6, 7]. Section 4 contains the algorithm of the processor for shared memory machines. In Section 5 the message passing method implemented in the distributed Hydro-Geo is described. The computational results are included in Section 6. The version for shared memory machines was implemented and tested thanks to access to the supercomputer Sun 10000E owned by COI PW (Computing Centre of Warsaw University of Technology). The distributed version of the package was implemented and tested thanks to the support of the European Community - Access to Research Infrastructure action of the Improving Human Potential programme (contract No. HPRI-1999-CT00026).
2. Finite Element Method Package Hydro-Geo
The finite element package HYDRO-GEO [8] is oriented at hydro- and geotechnical problems. The structure of the package is shown in Figure 1.
[Figure 1 diagram: a management shell coordinates the preprocessor (mesh generation, AutoCAD interface), the processor (coupled soil-water data structures) and the postprocessor (graphical presentation, selection of results).]

Figure 1. Relation between the various programs in the finite element package HYDRO-GEO.
The finite element method package algorithm can be divided into three separate parts:
1. the preprocessor, for mesh generation, mesh optimization and data input;
2. the processor, for stiffness matrix calculation, solving the global set of equations and the analysis of strains and stresses;
3. the postprocessor, for the graphical and numerical presentation of the computed results.
The above tasks are independent and they are realized by separate modules of the package. In Hydro-Geo these programs can run under the management shell or can be executed on different machines as well, in which case the data are sent between the programs over the network. The data transfer between modules is done through text files. In the package, the format of the data transfer files between modules is fixed, which makes it possible to exchange parts of the package. This is very useful when we take into consideration the data preparation and the representation of the results. In fact, a few preprocessors exist in the package; they are written in Fortran and in C. The main part, the processor, was developed over the years by a group of people. It is written in Fortran.
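The decoupling of modules through a fixed-format text file can be sketched as follows. The layout used here (a node count followed by "id x y" lines) is invented for illustration only; it is not the actual Hydro-Geo exchange format:

```python
# Sketch of module decoupling through a fixed-format text exchange file.
# The layout (node count, then "id x y" lines) is invented for
# illustration; it is NOT the actual Hydro-Geo file format.
import os
import tempfile

def write_mesh(path, nodes):
    """Preprocessor side: write the node count, then one 'id x y' line
    per node, in a fixed column layout."""
    with open(path, "w") as f:
        f.write(f"{len(nodes)}\n")
        for nid in sorted(nodes):
            x, y = nodes[nid]
            f.write(f"{nid:8d} {x:14.6e} {y:14.6e}\n")

def read_mesh(path):
    """Processor side: any module that honours the same fixed format can
    replace either the writer or the reader."""
    with open(path) as f:
        n = int(f.readline())
        return {int(nid): (float(x), float(y))
                for nid, x, y in (f.readline().split() for _ in range(n))}

# Round trip: the "preprocessor" writes, the "processor" reads.
mesh = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (0.5, 1.0)}
path = os.path.join(tempfile.gettempdir(), "demo_mesh.txt")
write_mesh(path, mesh)
assert read_mesh(path) == mesh
```

Because both sides only depend on the agreed file layout, either module can be rewritten in another language (as the Fortran and C preprocessors are) without touching the other.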
The most time-consuming part of the modelling is the processor, in which the numerical finite element algorithms are implemented, and this is what is really worth doing in parallel. In the processor, the coefficient matrix of the set of equations is calculated for each stage of the construction building and each time increment. Parallel calculation speeds up the process, and putting the data into distributed memories increases the number of elements that can be processed. The set of equations is solved several times, so a parallel solver can also speed up the calculation process radically. The structure of the package gives the opportunity to execute the calculation on a parallel machine without changing the format of the data input and output. The parallelization of the processor presented in this paper is done in such a way that the procedures responsible for modelling are not changed, and the program can be developed further by others without problems.
3. Numerical Procedure
The description of different mechanical phenomena, such as flow, mechanical behavior and thermal effects, leads to coupled systems of differential equations. To solve such initial boundary value problems, finite element methods can be used. In situations where several phenomena are taken into account, the final form of the global equation set takes a block form. In general, the times when the important phenomena were considered separately belong to the past; now we want to model very complicated and complex effects. For example, if we consider a car engine, we have to solve the mechanical and thermal differential equations as a coupled system. Coupled systems appear very often in modern mechanics: ground water flow and mechanical behavior (deformations and stresses), transport of pollutants, thermal flow, etc. Coupled problems are much more complicated than each effect considered separately, but solving them gives a very realistic behavior of complex problems [12]. To solve the systems of linear equations obtained during the modelling of coupled systems, standard solvers, which are available from the Internet, can be used. Here we consider direct solvers; the ScaLAPACK [24] library for distributed blocked matrix equations is very appropriate for the problem. Iterative solvers need preconditioning, because even for relatively small problems the convergence of the algorithm is difficult to reach.
In Hydro-Geo the virtual work principle and the continuity equation with boundary conditions are the starting points for the numerical formulation. The finite element method is applied to solve initial boundary value problems. Several procedures stemming from elasto-plastic modelling can be coupled with the time stepping algorithm during the consolidation process. The elasto-plastic soil behavior is modelled by means of the visco-plastic theory (Perzyna, 1966). The finite element formulation for elasto-plastic consolidation combines overlapping numerical processes. The elasto pseudo-viscoplastic algorithm for the numerical modelling of elasto-plastic behavior is used after Zienkiewicz and Cormeau (1974). The stability of the time marching scheme was proved by Cormeau (1975). The pseudo-viscous algorithm developed in the finite element computer code Hydro-Geo has been successfully applied to solve a number of boundary value problems, Dluzewski (1993). The visco-plastic procedure was extended to cover geometrically non-linear problems by Kanchi et al. (1978) and also developed for large strains in consolidation, Dluzewski (1997) [6]. The pseudo-viscous procedure is adopted herein for modelling elasto-plastic behavior in consolidation. In the procedure two times appear: the first is the real time of consolidation, and the second time is only a parameter of the pseudo-relaxation process. The global set of equations for the consolidation process is derived as follows
$$\begin{bmatrix} K_T & L \\ L^T & -(S + \Delta t\, H_i) \end{bmatrix} \begin{Bmatrix} \Delta u_i \\ \Delta p_i \end{Bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & \Delta t\, H_i \end{bmatrix} \begin{Bmatrix} u_i \\ p_i \end{Bmatrix} + \begin{Bmatrix} \Delta F_i \\ \Delta q \end{Bmatrix} \qquad (1)$$
where $K_T$ is the tangent stiffness array, considering large strain effects, $L$ is the coupling array, $S$ is the array responsible for the compressibility of the fluid, $H$ is the flow array, $u$ are the nodal displacements, $p$ are the nodal excesses of the pore pressure, and $\Delta F_i$ is the nodal load vector defined below

$$\Delta F_i = \Delta F_L + R_I^i + R_{II}^i \qquad (2)$$
$\Delta F_i$ is the load increment, $R_I^i$ is the vector of nodal forces due to the pseudo-visco iteration, and $R_{II}^i$ is the unbalanced nodal vector due to geometrical nonlinearity. $R_I^i$ takes the following form

$$R_I^i = \int_{\,{}^{t+\Delta t}V^{(i-1)}} B^{T(i-1)}\, D\!\left({}^{t+\Delta t}_{t+\Delta t}\Delta\varepsilon^{vp}_{i}\right)\, {}^{t+\Delta t}_{\;i-1}dv \qquad (3)$$

and is defined in the current configuration of the body. The left subscripts indicate the configuration of the body, and the left superscripts indicate the time at which the value is defined (notation after Bathe (1982)). $R_I^i$ stands for the nodal vector which results from the relaxation
of the stresses. For each time step the iterative procedure is engaged to solve the materially non-linear problem. The index $i$ indicates the steps of the iterations. Both local and global criteria for terminating the iterative process are used. The iterations are continued until the calculated stresses are acceptably close to the yield surface, $F \leq$ Tolerance at all checked points, where $F$ is the value of the yield function. At the same time, the global criterion for this procedure is defined in the final configuration of the body. The global criterion takes its roots from the conjugated variables in the virtual work principle, where the Cauchy stress tensor is coupled with the linear part of the Almansi strain tensor. For the two-phase medium, the unbalanced nodal vector $R_{II}^i$ is calculated at every iterative pseudo-time step.
$$R^{k-1} = \int_{{}^{t+\Delta t}V} N^T\, {}^{t+\Delta t}f\, dV + \int_{{}^{t+\Delta t}S} N^T\, {}^{t+\Delta t}t\, dS - \int_{{}^{t+\Delta t}V^{(k-1)}} B^{T(k-1)} \left({}^{t+\Delta t}_{t+\Delta t}\sigma^{(k-1)} + m\, {}^{t+\Delta t}_{t+\Delta t}p^{(k-1)}\right) {}^{t+\Delta t}dV \qquad (4)$$
The square norm of the unbalanced nodal forces is used as the global criterion of equilibrium. The iterative process is continued until both criteria are fulfilled.
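The interplay of the local and global criteria can be illustrated with a deliberately reduced one-point relaxation sketch. The scalar yield function, the relaxation factor and the trial stress below are assumptions for illustration only, not the tensorial procedure used in Hydro-Geo:

```python
# Toy sketch of the pseudo-viscoplastic iteration at a single checked
# point.  The yield function F = sigma - sigma_y and all numbers are
# illustrative, not taken from the Hydro-Geo implementation.

def relax_to_yield(sigma_trial, sigma_yield, gamma=0.5, tol=1e-8, max_iter=500):
    """Iterate in pseudo-time until BOTH criteria hold:
       local  - stress acceptably close to the yield surface (F <= tol),
       global - squared norm of the unbalanced 'force' below tol**2."""
    sigma = sigma_trial
    for i in range(max_iter):
        F = sigma - sigma_yield        # yield function (F <= 0: elastic)
        overstress = max(F, 0.0)       # Perzyna-type overstress
        local_ok = overstress <= tol
        global_ok = overstress ** 2 <= tol ** 2   # toy global criterion
        if local_ok and global_ok:
            return sigma, i
        sigma -= gamma * overstress    # pseudo-time relaxation step
    return sigma, max_iter

# A trial stress 50 units above yield is relaxed back onto the surface.
sigma, iters = relax_to_yield(150.0, 100.0)
```

The pseudo-time steps play the role of the relaxation parameter mentioned above: they have no physical meaning, and the iteration stops only when both termination tests pass.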
4. Parallel Hydro-Geo For Shared Memory Supercomputers
In programs for shared memory machines all variables are visible to all threads created during the program execution. The most popular and simple way of parallelization is the division of loops into threads which run on separate processors. We can make the compiler do it automatically by adding the directive autoparallel during the compilation process; no changes need to be made in the source code. Explicit parallelization is reached by putting directives into the source code before the parts that can be executed concurrently. The number of processors should be known during compilation because it determines the number of threads. The parallel versions of the FEM package are built on the basis of the sequential one. The structure of the package is not changed. In the first step, all auxiliary files for keeping data during the calculation process are cancelled: all data are put into memory, and the number of read/write operations to and from the disk is reduced. The order in which single elements are calculated becomes unimportant. That allows us to parallelize the main loops which calculate the local values and the local matrices for single elements. The Hydro-Geo processor algorithm for shared memory machines with compiler parallelizing directives can be written as follows [14]:
Start
  Data reading, initial computations
  For each stage of the construction and each increment of load do
    Read the data specific for the stage
    Parallelize the following loop
    For each element do
      Calculate the local stiffness matrix, coupling matrix, flow matrix
      Calculate the initial stresses
    End do
    Calculate the global set of the fully coupled system
    First part of the solver (forward substitution) - parallel calculation
    For each Gauss point do
      Second part of the solver (backward substitution) - parallel calculation
      Parallelize the following loop
      For each element do
        Calculation of strains and stresses
      End do
      Print the result in the disk file
    End do
  End do
Stop
The loops calculating the values for each element (the local stiffness matrix, coupling matrix, flow matrix, stresses, strains) are divided into threads that are executed concurrently. It is a kind of domain decomposition, done by splitting the set of elements into subsets. The calculated variables are visible to the rest of the program commands and procedures. Such an approach needs a large amount of memory for big problems. The speed-up is reached not only due to the parallelism but due to the reduction of disk operations as well.
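The element-loop splitting can be mimicked in a short sketch, with a thread pool standing in for the compiler-generated threads. The toy 1D bar "element routine" below is an assumption for illustration, not the package's real element procedures:

```python
# Sketch of splitting the element loop between threads.  Toy 1D bar
# elements: element e connects nodes (e, e+1) and contributes
# k*[[1,-1],[-1,1]] to the global stiffness matrix.
from concurrent.futures import ThreadPoolExecutor

N_ELEMENTS = 8
N_NODES = N_ELEMENTS + 1

def local_stiffness(e):
    """Placeholder element routine; the stiffness k varies per element."""
    k = 1.0 + 0.1 * e
    return e, [[k, -k], [-k, k]]

def assemble(local_results):
    """Assembly stays sequential, as in the package: only the element
    loop is split between threads."""
    K = [[0.0] * N_NODES for _ in range(N_NODES)]
    for e, ke in local_results:
        for a in range(2):
            for b in range(2):
                K[e + a][e + b] += ke[a][b]
    return K

# Split the element set between threads; the result does not depend on
# the order in which single elements are calculated.
with ThreadPoolExecutor(max_workers=4) as pool:
    K_par = assemble(list(pool.map(local_stiffness, range(N_ELEMENTS))))

K_seq = assemble([local_stiffness(e) for e in range(N_ELEMENTS)])
assert K_par == K_seq   # order independence of the element loop
```

The final assertion is the property that makes the loop safe to parallelize: each element writes only its own local matrix, so the subsets can be processed in any order.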
5. Distributed Calculation
5.1. Distributed FEM Package
The Message Passing Interface (MPI) standard is used as the tool for parallelization [13, 25]. For distributed memory machines, a number of processes are created during the execution of the parallel program. The processes are distinguished by their own unique names called ranks, which are integer numbers, and they have their own private variables. Each process has to have copies of all the variables needed for its calculations. All results calculated by one process and needed by another have to be sent or broadcast. In our approach only one process, called the master, reads the data. The others obtain the data by broadcast from the master, make calculations using the received data and send the results needed for solving the global set of equations back to the master. The distributed version is based on the program working in memory: no auxiliary files for keeping data during the calculation process are created.
The parallelization is similar to the one described in the previous section [14, 17]. The calculation is done concurrently in such a way that the loops which calculate the local values for each element are divided between the processes. Each process calculates the local values connected with single elements (the local stiffness matrix, coupling matrix, flow matrix, initial stresses) for its own private subset of elements. The subsets are determined at the beginning and remain fixed during the whole calculation. All processes know which subset of elements belongs to each one. Each process keeps the data connected with the elements of its private subset only. When the local matrices are calculated, they are sent to the master, which calculates the global matrix of the fully coupled system and solves the set of linear equations. The result is broadcast to all processes to allow them to continue the calculation. For the master process, which is the process with rank 0, the algorithm can be written as follows:
Start
  Data reading, initial computations
  Sharing the computation - determine the subsets of elements calculated by separate processes
  Broadcast the read data to the other processes
  For each stage of the construction and each increment of load do
    For my subset of the set of elements do
      Calculate the local stiffness matrix, coupling matrix, flow matrix
      Calculate the initial stresses
    End do
    Synchronization point 1
    Gather the local stiffness matrices from all processes
    Calculate the global set of the fully coupled system
    For each Gauss point do
      Solve the set of equations
      Synchronization point 2
      Broadcast the solution to all processes
      For my subset of the set of elements do
        Calculation of strains and stresses
      End do
    End do
    Synchronization point 3
    Receiving the results from all processes
    Printing of the calculated values
  End do
Stop
Processes with ranks greater than 0 receive the read data, initialize the data connected with their subsets of elements, calculate the elements of the local matrices connected with their private subsets of elements and send them to the master. The algorithm for the slave processes is as follows:
Start
  Receiving of data from process 0
  For each stage of the construction and each increment of load do
    For my subset of the set of elements do
      Calculate the local stiffness matrix, coupling matrix, flow matrix
      Calculate the initial stresses
    End do
    Synchronization point 1
    Send the local matrices to process number 0
    For each Gauss point do
      Synchronization point 2
      Receive the solution from process number 0
      For my subset of the set of elements do
        Calculation of strains and stresses
      End do
    End do
    Synchronization point 3
    Sending the results to process 0
  End do
Stop
The synchronization points are added to secure proper communication and exchange of data.
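The master/worker exchange above can be imitated with threads, with queues standing in for the MPI send, gather and broadcast calls. The "local matrices" are reduced to partial sums here, so this is only a sketch of the communication pattern, not of the mechanics:

```python
# Sketch of the master/worker exchange.  Queues stand in for MPI
# point-to-point and broadcast operations; the "local matrices" are
# just partial sums for illustration.
import threading
import queue

N_WORKERS = 3
ELEMENTS = list(range(12))                      # global element set
to_master = queue.Queue()                       # workers -> master
to_workers = [queue.Queue() for _ in range(N_WORKERS)]  # master -> workers
results = [None] * N_WORKERS

def worker(rank):
    # Fixed private subset of elements, known to every process.
    subset = ELEMENTS[rank::N_WORKERS]
    local = sum(e * e for e in subset)          # stand-in local matrix
    to_master.put((rank, local))                # synchronization point 1
    broadcast = to_workers[rank].get()          # synchronization point 2
    results[rank] = broadcast                   # continue with the solution

threads = [threading.Thread(target=worker, args=(r,)) for r in range(N_WORKERS)]
for t in threads:
    t.start()

# Master: gather the local contributions, "solve", broadcast the result.
global_value = sum(to_master.get()[1] for _ in range(N_WORKERS))
solution = global_value / len(ELEMENTS)         # stand-in for the solve
for q in to_workers:
    q.put(solution)
for t in threads:
    t.join()
```

The two blocking points (the gather before the solve, the broadcast after it) correspond to synchronization points 1 and 2 of the algorithms above; without them a worker could run ahead with stale data.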
Table 1. Timetable of the embankment rising

  Stage   Description            Time increment [days]   Total time [days]
  0       Initial stresses       0                       0
  I       Rising of stage I      12                      12
          Consolidation          189                     201
  II      Rising of stage II     12                      213
          Consolidation          239                     452
  III     Rising of stage III    12                      464
          Consolidation          590                     1054
5.2. Parallel Numerical Algorithm for Solving the Linear Equations for the Consolidation Problem
The block formulation of the coupled problems makes natural the application of block methods for solving the sets of linear equations. The large matrices can be split into blocks and put into the separate memories of a network of computers. Parallel calculations are achieved through matrix operations on the separate blocks. The standard numerical algorithms should be rebuilt into block versions, or standard libraries can be used [3, 4, 5]. For big problems the matrix of the system of linear equations is put in distributed memories and iterative methods are used to obtain the solution of the set of equations. For the consolidation problem the coefficient matrix is ill-conditioned [22]; preconditioners are needed to improve the convergence even for problems that are not big [1, 2, 9, 10, 11, 12, 19, 21, 22, 23].
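The role of a diagonal (Jacobi) preconditioner of the kind advocated in [22] can be shown with a textbook preconditioned conjugate gradient method in pure Python. The small badly scaled test system below is invented for illustration and is far smaller than any consolidation matrix:

```python
# Jacobi (diagonal) preconditioned conjugate gradients for an SPD
# system A x = b, written with plain lists to stay self-contained.

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(A, b, tol=1e-10, max_iter=1000):
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]   # Jacobi preconditioner
    x = [0.0] * n
    r = b[:]                                       # residual of x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]     # preconditioned residual
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol:
            break
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

# Badly scaled SPD test matrix: the Jacobi scaling evens out a diagonal
# that spans five orders of magnitude, as pore pressure and displacement
# blocks typically do.
A = [[1000.0, 1.0,  0.0],
     [1.0,    0.01, 0.001],
     [0.0,    0.001, 1.0]]
b = [1.0, 2.0, 3.0]
x = pcg(A, b)
```

The same skeleton works with any symmetric positive definite preconditioner; only the computation of `z` changes.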
6. Engineering Problems
6.1. Embankment Rising
To study the influence of the large deformation description, the rising of an embankment on peat is modelled [18]. The Coulomb-Mohr yield criterion and the non-associated flow rule are used, with a dilatancy angle equal to zero. A permeable boundary below the peat layer is assumed. The layer is 10 m thick. The embankment slope is 1:2. The embankment is built in four stages. At the beginning, the initial stresses are introduced into the subsoil. The first stage of the embankment is raised up to the height of 2.0 m, the second up to 4.0 m, the third up to 6.0 m and the fourth up to 7.0 m. The timetable of the embankment rising is given in Table 1. The mesh contains 1879 nodes. An example of the calculated results - the excess pore pressures - is presented in Figure 4.
The mesh is shown in Figure 2. Six-noded isoparametric elements are used. The non-consistent formulation of the consolidation is applied (herein, pore pressures are calculated at all nodes).

Figure 2. Embankment rising - finite element mesh.

Three different times are compared: the real wall time; the processor calculation time, i.e. the user time from the point of view of the system; and the system time, i.e. the time for synchronization and the management of disk and memory access. The calculations were made for the elastic model on a Sun 6500. The speed-up reached is presented in Figure 3. The speed-up depends on the size of the problem; on shared memory machines we did not calculate big models, which is why the results are not spectacular.
6.2. Besko Dam
The Besko dam was raised on the Carpathian flysch [15, 16]. The height of the dam is about 40 m. Inclined schist layers are located in the subsoil. The parallel schist layers with various material parameters create the specific foundation typical of Polish dams in the south. The dam is built from concrete. A clay-concrete screen of 0.8 m thickness is constructed; the height of the screen is about 25 m.

Figure 3. Speed-up of user time reached on the Sun 6500.

The numerical modelling is done in three stages. In the first stage the initial stresses are introduced into the subsoil. In the second one, the heavy concrete dam is built. Special teeth are made between the subsoil and the dam body for better interaction between the dam and the subsoil. The rising of the dam is done by adding elements. In the third stage the loading caused by filling the reservoir is applied.

Figure 4. Embankment rising - excess of the pore pressure.

Figure 5. Besko dam - calculated stresses.
The performance of the different parallel versions of the package is compared on user time. The real time strongly depends on the number of users concurrently working on the supercomputer. The system time is connected mostly with the synchronization of threads and the number of input/output commands, which is the same in all parallel versions. The average speed-up of the sequential in-memory version, compared with the sequential version of the HYDRO-GEO processor working with auxiliary files, is about 10. In the case of automatic parallelization, the speed-up in user time compared with the sequential version working in memory is about 2, and it does not change much when we change the number of processors.
6.3. The Settlements of Warsaw Underground
The settlements of the Warsaw underground structures of the metro station are analyzed. The calculations are performed in six stages:
1 - introduction of the initial stresses in the subsoil, introduction of the diaphragm wall, constructing of the ground ceiling;
2 - soil excavation;
3 - foundation plate constructing;
4 - adding columns;
5 - loading from the station trains and traffic;
6 - extra loading from a 10-floor building.

Figure 6. Besko dam - isolines of displacements.

In each stage a new arrangement of the global equilibrium system is done. Some additional boundary conditions (to support internal walls) change the global numbering of the equations. The problem is nonlinear due to the elasto-plastic soil models based on the Coulomb-Mohr yield criterion and the non-associated flow rule [2].
The chosen results are shown in Figs. 7, 8 and 9.
Three different times are compared: the real wall time; the processor calculation time, i.e. the user time from the point of view of the system; and the system time, i.e. the time for synchronization and the management of disk and memory access. To compare the speed-up, the calculations are done for the elastic model. The maximum speed-up in user time is bounded by Amdahl's law, which says that the maximum speed-up is limited by the sequential fraction of the code and cannot be exceeded no matter how many processors are used. The parallel versions are compared with the sequential version working in memory, because they are based on it.
The average speed-up is the following:
1. for the version of the processor obtained by compilation with the option autoparallel, the speed-up is about 2;
2. for the version of the processor obtained by compilation with the option parallel (i.e. explicit and auto), the speed-up is about 3.5;
3. for the version of the processor obtained by compilation with the option parallel (i.e. explicit and auto) and with the band matrix parallel solver, the speed-up is about 5.
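Amdahl's bound can be evaluated directly. The sketch below only restates the textbook formula, taking a sequential fraction of one half as an example:

```python
# Amdahl's law: with a sequential fraction s of the work, the speed-up
# on N processors is S(N) = 1 / (s + (1 - s) / N), which can never
# exceed the limit 1 / s no matter how large N grows.

def amdahl_speedup(s, n):
    return 1.0 / (s + (1.0 - s) / n)

# With half of the work sequential (s = 0.5) the speed-up stays below 2
# for any number of processors.
for n in (2, 4, 16, 1024):
    assert amdahl_speedup(0.5, n) < 2.0
```

For example, `amdahl_speedup(0.5, 4)` gives 1.6, close to the user-time speed-ups observed for the compiled parallel versions above.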
Figure 7. Warsaw underground - the finite element mesh; 8-noded isoparametric elements are used.

Figure 8. Warsaw underground - the displacements in the form of contour lines.

In our case about half of the calculation of the processor has to be done sequentially. The speed-up reached by the parallel package, compared with the sequential version, is about 2 for
different problems. A comparison of the calculation times of both versions is possible only for small problems. Bigger problems cannot be calculated sequentially in memory only - auxiliary files would have to be used to keep the data between the called procedures, because the memory capacity of a single machine is too small. For big problems, calculation without using the disk for writing/reading intermediary results is possible only when the data is divided between the memories of different machines. Big problems can be calculated quickly - keeping all the data in memory - only in parallel.
Conclusions
The shared memory version in practice does not work for problems with a huge number of elements; such an approach leads to difficulties with shared memory access via the bus. It can be used for programs with a large number of iterations over Gauss points. This version is easier to implement because there is no exchange of data between processes; the compiler ensures the synchronization. Considering the parallelization of finite element source code, the first steps are obvious. First, the element procedures (calculating the stiffness matrix and next calculating the strains and stresses) are parallelized. In the second step, the frontal procedure for solving the set of linear equations is replaced by a band matrix solver. The distributed version is not limited in the number of elements of the calculated problems; it can be used for really huge models. For small models the speed-up is small, or the parallel version can work longer than the sequential one, because of the communication and synchronization procedures.

Figure 9. Warsaw underground - the displacement of the station with the subsoil.
References

[1] O. Axelsson, Iterative Solution Methods, Cambridge University Press, 1994.
[2] O. Axelsson, V.A. Barker, Finite Element Solution of Boundary Value Problems, Academic Press, Inc., 1984.
[3] R. Barrett, M. Berry, T. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine, H. van der Vorst, Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, SIAM, 1994.
[4] J.J. Dongarra, Performance of Various Computers Using Standard Linear Equations Software, 1999.
[5] J. Demmel, Applied Numerical Linear Algebra, SIAM, 1997.
[6] J.M. Dłużewski, Non-linear consolidation in finite element modelling, Proceedings of the Ninth International Conference on Computer Methods and Advances in Geomechanics, Wuhan, China, November 1997.
[7] J.M. Dłużewski, Nonlinear problems during consolidation process, in: Advanced Numerical Applications and Plasticity in Geomechanics, eds. D.V. Griffiths and G. Gioda, Springer Verlag, Lecture Notes in Computer Science, 2001.
[8] J.M. Dłużewski, HYDRO-GEO - finite element package for geotechnics, hydrotechnics and environmental engineering, Warsaw, 1997 (in Polish).
[9] V. Eijkhout, T. Chan, ParPre: a parallel preconditioners package, reference manual for version 2.0.21, revision 1.
[10] M.J. Grote, T. Huckle, Parallel preconditioning with sparse approximate inverses, SIAM Journal on Scientific Computing, 18 (1997), pp. 838-853.
[11] P. Krzyżanowski, On block preconditioners for non-symmetric saddle point problems, SIAM Journal on Scientific Computing, Vol. 23, No. 1, 2001, pp. 157-169.
[12] R.W. Lewis, B.A. Schrefler, The Finite Element Method in the Static and Dynamic Deformation and Consolidation of Porous Media, John Wiley & Sons, 1998.
[13] MPI: A Message Passing Interface Standard, June 1995.
[14] F. Okulicka, High-performance computing in geomechanics by a parallel finite element approach, Applied Parallel Computing, 5th International Workshop, PARA 2000, Bergen, Norway, June 2000, Lecture Notes in Computer Science 1947, pp. 391-398.
[15] F. Okulicka, Block parallel solvers for coupled geotechnical problems, 10th International Conference on Computer Methods and Advances in Geomechanics, January 7-12, 2001, Tucson, Arizona, USA, Vol. 1, A.A. Balkema, Rotterdam, Brookfield, 2001, pp. 861-866.
[16] F. Okulicka, Parallel calculations of geotechnical problems by means of the parallel finite element code Hydro-Geo, Proceedings of the IASTED International Symposia APPLIED INFORMATICS, Innsbruck, Austria, February 19-22, 2001, pp. 440-443.
[17] F. Okulicka, Achieving high performance calculation by the parallelization of the code, The Eighth International Conference on Advanced Computer Systems ACS 2001, October 17-19, 2001, Mielno, Poland, pp. 259-268.
[18] F. Okulicka, Parallelization of a finite element package by the MPI library, International Conference of MPI/PVM Users, Santorini, 2001, Lecture Notes in Computer Science 2131, pp. 425-436.
[19] P.S. Pacheco, A User's Guide to MPI, 1998.
[20] S. Parter, Preconditioning Legendre spectral collocation methods for elliptic problems I: finite difference operators, SIAM Journal on Numerical Analysis, Vol. 39, No. 1, 2001, pp. 320-347.
[21] S. Parter, Preconditioning Legendre spectral collocation methods for elliptic problems II: finite element operators, SIAM Journal on Numerical Analysis, Vol. 39, No. 1, 2001, pp. 348-362.
[22] K.K. Phoon, K.C. Toh, S.H. Chan, F.H. Lee, An efficient diagonal preconditioner for finite element solution of Biot's consolidation equations, International Journal for Numerical Methods in Engineering (to appear).
[23] Y. Saad, Iterative Methods for Sparse Linear Systems, SIAM, 2003.
[24] http://www.netlib.org/scalapack/
[25] http://www-unix.mcs.anl.gov/mpi
LARGE-SCALE DATA VISUALIZATION USING
MULTI-LANGUAGE PROGRAMMING APPLIED
TO ENVIRONMENTAL PROBLEMS
Frédéric Magoulès and Roman Putanowicz
Institut Élie Cartan de Nancy, Université Henri
Poincaré, BP 239, 54506 Vandœuvre-lès-Nancy Cedex, France
Institute of Computer Methods in Civil Engineering
(L5), Cracow University of Technology, Cracow, Poland
Abstract
Environmental problems lead to large and complex data sets, which appear to be difficult to analyze. Scientific visualization, which transforms raw data into images, has been recognized as an effective way to understand such data. Currently, most existing scientific software packages have their own data formats, and special visualization interfaces or independent software are used to display these data. In this paper a technique for the visualization of large-scale data using multi-language programming is investigated. Mixing Tcl, C++ and Fortran components with the VTK library makes it possible to build efficient and robust applications. This article presents in detail how to build such applications and provides an elegant solution to the problem of accessing VTK objects from different languages and of mixing Tcl, C++ and Fortran components in one single application.
Keywords: image processing, visualization, graphics, scripting language, compiled language, multi-language programming
1. Introduction
The amount of data collected and stored electronically is doubling every three years. Even if the development of standard data interface protocols solves the data access problem, the analysis of this information is becoming a serious challenge. Visualization technology provides effective data presentation. Unfortunately, we are now reaching the limits of interactive visualization of large-scale data sets, since the amount of data to be analyzed is overwhelming. Despite the large number of scientific visualization packages and the multiple options available in them, researchers have particular requirements, and the development of home-made visualization software is very common.
E-mail address: [email protected] (address all correspondence to this author).
E-mail address: [email protected]
In: Techniques of Scientific Computing for Energy and the Environment ISBN 1-60021-921-7
Editors: F. Magoulès and R. Benelmir, pp. 37-68 © 2007 Nova Science Publishers, Inc.
The Visualization ToolKit (VTK) is a software system for computer graphics, visualization and image processing. The VTK library is written in C++; however, it provides interfaces to the scripting languages Tcl, Python and Java. Though it is possible to write a whole VTK application in a scripting language like Tcl, it is more suitable, for efficiency reasons, to implement some functionality in a compiled language like C/C++. This is especially the case when working with large data sets arising from environmental analysis, such as noise reduction. For example, when the noise level distribution generated by cars or airplanes over a city is analyzed, large data sets are considered, since huge amounts of data are needed to model the whole city. An example of such a model is illustrated in Figures 1, 2 and 3. These figures have been obtained with the technique presented in this paper.
Figure 1. Example of a city.
Figure 2. Example of a city (bis).
This article presents in thorough detail how to access VTK objects from different languages and how to mix Tcl and C++ components in one application. Several source code
Figure 3. Example of a city (ter).
examples are shown in order to help the reader write a complete application on his own. The paper is organized as follows. In Section 2, the concept of programmable filters in VTK is recalled. Then, in Section 3, the way to access VTK object data is detailed. Section 4 presents programmable filters written in C++, and Section 5 presents programmable filters written in Tcl. In Section 6, dynamically linked functions used as filter methods in Tcl are investigated. Finally, Section 7 contains the conclusions of this paper.
2. Programmable Filters
In programs using the VTK library, the visualization process can be described in terms of data flow through a so-called visualization network or visualization pipeline [14, 12].
During visualization, data, represented by data objects, are passed between process objects connected into the visualization pipeline. The process objects operate on input data to generate output data.
The process objects can be divided into the following categories: source objects, filter objects and mapper objects. Filter objects require one or more input data objects and generate one or more output data objects. VTK provides several filter objects which perform various visualization operations (e.g. extracting geometry, extracting and modifying data attributes, etc.).
When new processing capabilities are required, and when they cannot be obtained by a combination of existing filters, new classes of filters can be added. This requires the introduction of a new class and modification of the source code. However, it is possible to create new types of filter objects without creating new classes, and even to create new kinds of filters at run time from scripting languages like Tcl or Python. To make this possible, VTK provides a family of programmable filter classes which have all the common properties of ordinary filter classes, except that their processing routine can be set to a user-specified function. This way the user only has to write the processing function, create a new instance of a programmable filter and use it to build the visualization pipeline. Each time the filter is requested to execute, it will call the user-specified function. The next section shows how to write functions for programmable filters (vtkProgrammableFilter in particular) in Tcl, C++ and Fortran.
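The three categories above can be sketched, independently of VTK, as plain functions passing data along a chain (all names in this sketch are invented for illustration; real VTK process objects are classes connected by input/output methods):

```cpp
#include <cassert>
#include <vector>

using Data = std::vector<double>;

// Source object: produces data, takes no input.
Data Source() { return {1.0, 2.0, 3.0}; }

// Filter object: takes input data and generates output data.
Data DoubleFilter(const Data &in) {
    Data out;
    for (double v : in) out.push_back(2.0 * v);
    return out;
}

// Mapper object: terminal stage that turns data into a final result
// (in real VTK, graphics primitives; here, a simple sum).
double SumMapper(const Data &in) {
    double s = 0.0;
    for (double v : in) s += v;
    return s;
}

// The visualization network: data flows source -> filter -> mapper.
double RunPipeline() { return SumMapper(DoubleFilter(Source())); }
```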
3. Accessing VTK Objects Data
It might happen that we want to extend visualization programs with functions written in Fortran, or that we have a large Fortran legacy code we would like to interface with a visualization program written using the VTK library. One problem that immediately appears is that in Fortran (including Fortran 90) we do not have direct access to C++ objects. In theory it is possible to pass a C++ object pointer to a Fortran function and then, knowing the memory layout of the object, manipulate it directly from Fortran, but this is restricted to the simplest cases and is highly non-portable. What should be done instead is to extract all necessary information from the C++ object, pack it into ordinary variables and arrays, and pass that data to a Fortran function. When the Fortran function returns the modified arguments, they are used to alter the C++ objects or to create new ones.
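This extract–process–write-back pattern can be sketched in plain C++, with a hypothetical stand-in object in place of a real VTK class (AttributeObject, scale_routine and apply_external_routine are invented names for illustration):

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a C++ data object (e.g. a VTK point-data
// attribute); this is not a VTK class.
struct AttributeObject {
    std::vector<double> values;
    double get(int i) const { return values[i]; }
    void   set(int i, double v) { values[i] = v; }
    int    size() const { return (int)values.size(); }
};

// A routine with a "Fortran-like" interface: it sees only a plain
// array and its length, never the C++ object itself.
void scale_routine(int *n, double *a) {
    for (int i = 0; i < *n; ++i) a[i] *= 10.0;
}

// The pattern: extract into a flat array, call the routine,
// then write the modified values back into the object.
void apply_external_routine(AttributeObject &obj) {
    int n = obj.size();
    std::vector<double> buffer(n);
    for (int i = 0; i < n; ++i) buffer[i] = obj.get(i);   // extract
    scale_routine(&n, buffer.data());                     // process
    for (int i = 0; i < n; ++i) obj.set(i, buffer[i]);    // write back
}
```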
We assume that the reader is already familiar with the basic VTK components, and in particular with the VTK data model. If not, we suggest reading chapters 4 and 5 of [14] or chapter 11 of [12]. Nevertheless, we will start our discussion with a very simple example which introduces one of the VTK array classes, vtkDoubleArray.
3.1. Creating and Manipulating VTK Arrays
The example below shows how to create an array of double values with 10 rows and 3 columns. Such an array could be used, for instance, to hold the point coordinates of a three-dimensional mesh in finite element methods.
 1 #include <iostream>
 2 #include "vtkDoubleArray.h"
 3 using namespace std;
 4 int main(void) {
 5   vtkDoubleArray *array;
 6   int m = 10;
 7   int n = 3;
 8   double buff[3];
 9   /* Creating VTK array object */
10   array = vtkDoubleArray::New();
11   array->SetNumberOfComponents(n);
12   array->Allocate(m);
13   array->SetNumberOfTuples(m);
14   for (int i=0; i<m; i++)
15   {
16     buff[0] = buff[1] = buff[2] = (double)i;
17     array->SetTuple(i, buff);
18   }
19   /* copy values from array object to the carray array */
20   int nrows = array->GetNumberOfTuples();
21   int ncols = array->GetNumberOfComponents();
22   double *carray = new double [nrows * ncols];
23   for (int i=0; i<nrows; i++)
24     for (int j=0; j<ncols; j++)
25       carray[i*ncols+j] = array->GetComponent(i, j);
26   delete [] carray;
27   array->Delete();
28   return 0;
29 }
3.2. Copying the Values of a VTK Array
 1 double *dplGetDoubleArrayCArray(vtkDoubleArray *array, int *length)
 2 {
 3   assert(array != NULL);
 4   int maxId = array->GetMaxId();
 5   // in C/C++ indexing starts from 0
 6   double *carray = new double[maxId+1];
 7   if (carray != NULL)
 8   {
 9     for (int i=0; i<=maxId; i++)
10     {
11       carray[i] = array->GetValue(i);
12     }
13     if (length != NULL)
14     {
15       *length = maxId+1;
16     }
17   }
18   return carray;
19 }
In Appendix A we present more functions which can help to transfer data from and to VTK objects. Those functions are organized into a small library called dpl (data passing library). That library covers only a few cases of accessing VTK objects, in particular the following classes: vtkUnstructuredGrid, vtkPoints, vtkPointData. If necessary, more robust and universal functions can easily be written based on the dpl examples.
3.3. Passing Data from VTK Object to Fortran Routine
To finish this section, we present another example of accessing a VTK object. This time we use dpl functions to extract the values of the point scalar attribute called temperature, pack them into an array, and send that array together with its length to a Fortran function which replaces each element with its sine.
 1 #include "dpl.h"
 2 #include "vtkProgrammableAttributeDataFilter.h"
 3 #include "vtkUnstructuredGrid.h"
 4 #include "vtkDataSet.h"
 5 void SinusTemperature(void *arg) {
 6   vtkProgrammableAttributeDataFilter *myFilter;
 7   vtkDataSet *input;
 8   vtkPointData *pd;
 9   double *indata;
10   int length;
11   myFilter = (vtkProgrammableAttributeDataFilter *)arg;
12   input = myFilter->GetInput();
13   pd = input->GetPointData();
14   /* get the point data array */
15 indata = dplGetScalarsCArray(pd, "temperature", &length);
16 if (indata != NULL)
17 {
18 /* pass the input data to the Fortran routine */
19 sinusfortran (&length, indata);
20 }
21 /* set back the transformed data */
22 dplSetScalarsFromCArray (pd, indata, length, "temperature");
23 }
The code for the Fortran function is as follows:
      SUBROUTINE SINUSFORTRAN(N, A)
      DOUBLE PRECISION A(*)
      DO 10 I=1,N
        A(I) = SIN(A(I))
10    CONTINUE
      END
The call to the Fortran function in line 19 depends on the combination of C/C++ and Fortran compilers, so the appropriate documentation should be consulted for each particular compiler.
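As a rough illustration of one common convention (many Unix Fortran compilers lower-case the routine name, append a trailing underscore, and pass every argument by reference), the following self-contained C++ sketch mimics the call; sinusfortran_ here is a C++ stand-in with the same calling convention, not the actual compiled Fortran routine:

```cpp
#include <cassert>
#include <cmath>

// Under the assumed convention, SINUSFORTRAN becomes the C symbol
// sinusfortran_. In real code this would be an extern "C" declaration
// resolved at link time against the Fortran object file; here the body
// is supplied in C++ so the sketch is self-contained.
extern "C" void sinusfortran_(int *n, double *a) {
    for (int i = 0; i < *n; ++i) a[i] = std::sin(a[i]);
}

// How the C++ side invokes it: address-of on every argument,
// since Fortran expects all arguments by reference.
void call_fortran_style(double *data, int length) {
    sinusfortran_(&length, data);
}
```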
The example above uses the following helper functions:
double *dplGetScalarsCArray(vtkPointData *, const char *, int *);
int dplSetScalarsFromCArray(vtkPointData *, double *,
int, const char *);
The first of them takes a point data object and the name of a scalar field; it allocates an array and then copies the values of the scalar field into it, returning the length of the allocated array through the last argument.
The second function does the reverse: it takes the array and sets the named point attribute to the array values.
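The behaviour of this pair of helpers can be sketched with a hypothetical stand-in for the point data object (a plain map of named fields; none of the following is VTK or dpl API, the names merely mirror the helpers above):

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for vtkPointData: scalar fields addressed by name.
using PointData = std::map<std::string, std::vector<double>>;

// Mirrors dplGetScalarsCArray: allocate a C array, copy the named field
// into it, and return its length through the last argument
// (nullptr on a missing field).
double *GetScalarsCArray(PointData *pd, const char *name, int *length) {
    auto it = pd->find(name);
    if (it == pd->end()) { if (length) *length = 0; return nullptr; }
    int n = (int)it->second.size();
    double *carray = new double[n];
    std::memcpy(carray, it->second.data(), n * sizeof(double));
    if (length) *length = n;
    return carray;
}

// Mirrors dplSetScalarsFromCArray: the reverse operation, setting the
// named field from the array values.
int SetScalarsFromCArray(PointData *pd, double *carray, int n, const char *name) {
    (*pd)[name] = std::vector<double>(carray, carray + n);
    return 0;
}
```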
4. Programmable Filters in C++
We will start the discussion of the vtkProgrammableFilter class with an example of using a programmable filter in C++ code. Though the use of C++ complicates the example slightly, it allows us to show some of the internal workings of the class, an understanding of which is necessary when trying to use vtkProgrammableFilter from Tcl.
4.1. Simple Pipeline Example

To simplify our discussion and the example, we present a program in which the visualization pipeline has been reduced to a minimum and consists of only the two objects shown in Figure 4. The program looks as follows:
Figure 4. Simple pipeline: vtkUnstructuredGridReader → vtkUnstructuredGridWriter.
 1 #include "vtkUnstructuredGridReader.h"
 2 #include "vtkUnstructuredGridWriter.h"
 3 int main() {
 4   vtkUnstructuredGridReader *reader;
 5   vtkUnstructuredGridWriter *writer;
 6   reader = vtkUnstructuredGridReader::New();
 7   reader->SetFileName("2Dmesh.vtk");
 8   writer = vtkUnstructuredGridWriter::New();
 9   writer->SetFileName("newMesh.vtk");
10   // connect objects to form the pipe
11   writer->SetInput(reader->GetOutput());
12   // initialize pipe processing
13   writer->Update();
14   writer->Delete();
15   reader->Delete();
16   return 0;
17 }
As can be seen, this program does nothing other than copy the file 2Dmesh.vtk into newMesh.vtk.
4.2. Pipeline with Filter
Now we introduce a programmable filter in order to transform an unstructured grid by a user-specified function. The layout of the program is shown in Figure 5. We assume that the data file specifies a scalar attribute for each point. The user function copies the grid topology and geometry and sets new point attributes, which are the old values multiplied by 10. With
Figure 5. Pipeline with filter: vtkUnstructuredGridReader → vtkProgrammableFilter (with user function) → vtkUnstructuredGridWriter.
a user function as above, it would be better to use vtkProgrammableAttributeDataFilter, but we will use vtkProgrammableFilter to show how to create a new output object and how to copy the grid topology and geometry.
The vtkProgrammableFilter class provides, among others, the following method:

vtkProgrammableFilter::SetExecuteMethod(void (*f)(void *), void *arg)

This method takes two arguments, the first being a pointer to the user function. The user function must take one argument of void pointer type and return void. The second argument is a pointer to client data, which will be passed to the user function upon its execution. The client data is used to pass to the user function all the information it needs to perform its job. It will usually contain a pointer to the filter itself, which allows the function to retrieve the filter's input and output objects. In the simplest case the client data will be a pointer to the filter alone.
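The mechanism can be sketched in plain C++ without VTK; FakeFilter and ScaleExecute below are invented names illustrating the callback-with-client-data idiom, not VTK classes:

```cpp
#include <cassert>
#include <vector>

// Minimal stand-in for a filter that stores a user callback plus an
// opaque client-data pointer, in the style of SetExecuteMethod.
struct FakeFilter {
    std::vector<double> data;            // the "input" the callback works on
    void (*execute)(void *) = nullptr;   // user function
    void *clientData = nullptr;          // handed back to the user function

    void SetExecuteMethod(void (*f)(void *), void *arg) {
        execute = f;
        clientData = arg;
    }
    // Each time the filter runs, it calls the user-specified function.
    void Update() { if (execute) execute(clientData); }
};

// A user function: it receives the client data as void*, casts it back
// to the filter, and can then reach the filter's data.
void ScaleExecute(void *arg) {
    FakeFilter *self = static_cast<FakeFilter *>(arg);
    for (double &v : self->data) v *= 10.0;
}
```

Calling filter.SetExecuteMethod(ScaleExecute, &filter) and then filter.Update() scales the stored data, mirroring the way vtkProgrammableFilter invokes its user function with the filter itself as client data.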
Let us assume that the function
void ScaleBy10 (void *arg);
is going to be used by the filter. Here is the new program:
 1 #include "vtkUnstructuredGridReader.h"
 2 #include "vtkUnstructuredGridWriter.h"
 3 #include "vtkProgrammableFilter.h"
 4 void ScaleBy10 (void *arg);
 5 int main() {
 6   vtkUnstructuredGridReader *reader;
 7   vtkUnstructuredGridWriter *writer;
 8   vtkProgrammableFilter *filter;
 9   reader = vtkUnstructuredGridReader::New();
10   reader->SetFileName("2Dmesh.vtk");
11   writer = vtkUnstructuredGridWriter::New();
12   writer->SetFileName("newMesh.vtk");
13   filter = vtkProgrammableFilter::New();
14   filter->SetExecuteMethod (ScaleBy10, (void*)filter);
15   // connect objects to form the pipe
16   filter->SetInput(reader->GetOutput());
17   writer->SetInput(filter->GetUnstructuredGridOutput());
18   // initialize pipe processing
19   writer->Update();
20   writer->Delete();
21   filter->Delete();
22   reader->Delete();
23   return 0;
24 }
Note line 14, where, as mentioned above, we pass a pointer to the filter object as the client data.
4.3. User Function
The task of the user function ScaleBy10 is to create a new grid with the same geometry and topology as the input grid, but with the point attributes scaled by 10. First, the topology and geometry are copied from the input object. Then the point data array is copied and modified. Finally, the modified data array is inserted into the output mesh as the new point data. The code for the user function is given below.
1 #include "vtkDataSet.h"
2 #include "vtkDoubleArray.h"
3 #include "vtkDataArray.h"
4 #include "vtkProgrammableFilter.h"
5 #include "vtkUnstructuredGrid.h"
6 void ScaleBy10(void *arg) {
7 vtkIdType numPts;
8 vtkDa