Predicting rail defect frequency: An integrated approach using fatigue

modeling and data analytics

Article in Computer-Aided Civil and Infrastructure Engineering · May 2019

DOI: 10.1111/mice.12453

ORIGINAL ARTICLE

Predicting rail defect frequency: An integrated approach using fatigue modeling and data analytics

Faeze Ghofrani¹, Abhishek Pathak¹, Reza Mohammadi², Amjad Aref¹, Qing He¹,²,³

¹ Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, New York

² Department of Industrial and Systems Engineering, University at Buffalo, The State University of New York, New York

³ Key Laboratory of High-Speed Railway Engineering in Ministry of Education, School of Civil Engineering, Southwest Jiaotong University, Chengdu, China

Correspondence: Qing He, Department of Industrial and Systems Engineering and Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, 313 Bell Hall, Buffalo, NY 14260. Email: [email protected]

Funding information: FRA, Contract Number: DTFR5317C00003

Abstract

In maintenance planning of rail track, it is imperative to assess the potential and frequency of rail defects. Although this problem has mainly been studied in the literature with either data-driven or mechanics-based models, the present study proposes a new method that combines the strengths of both approaches in a single model. The envisaged model incorporates a fatigue crack growth model, obtained through Finite Element Modeling (FEM), into an Approximate Bayesian Computation (ABC) framework. The method is applied to the prediction of rail defect frequency for transverse defects obtained from a US Class I railroad. The results show that introducing the mechanics of rail defects into a data-driven model outperforms traditional purely data-driven models by over 20%. The outcome of this study, along with the future developments needed to broaden the method's scope of applicability, will benefit existing railroad practice in capital and maintenance planning.

1 INTRODUCTION

Railroad track is a very complex system that involves many interactions and thus has many potential areas where a defect can occur, which may occasionally lead to severe train accidents.

According to the existing literature, defects appearing on the rail track can be of either a geometric or a structural type (Sadeghi & Askarinejad, 2011). Track geometry defects arise from the geometry conditions of the track, including profile, alignment, gage, etc. Some studies in the literature have focused mainly on track geometry (Benedetto, Ciampoli, Brancadoro, Alani, & Tosti, 2018; Mohammadi, He, Ghofrani, Pathak, & Aref, 2019) to provide maintenance planning accordingly (Xie, Lei, & Ouyang, 2018; Xu, Sun, Liu, Souleyrette, & Wang, 2015). On the other hand, track structural defects (known as rail defects), which are the main focus of the current study, indicate ill-conditioned structural parameters such as rail, sleeper, tie, subgrade, etc. Since rail defects are difficult to observe using visual inspection techniques, ultrasonic inspections are carried out on a regular basis to assess the health of the railway tracks. An overview of the defects commonly found in modern rails by ultrasonic inspection can be found in Cannon, Edel, Grassie, & Sawley (2003). One of the most important types of rail defect, which is also the most common cause of rail accidents, is the transverse defect (Cannon et al., 2003; Lanza di Scalea et al., 2005). These defects occur in the head of the rail and in the transverse direction (i.e., perpendicular to the movement of the train). The current study is limited to this type of rail defect, as explained later in greater detail.

© 2019 Computer-Aided Civil and Infrastructure Engineering

Due to repeated loading, rails, which are made primarily of steel, are prone to fatigue failure. Fatigue failure of metallic structures, such as those made of steel, goes through three phases: crack initiation, crack growth, and finally rapid crack growth culminating in total fracture. Crack initiation and crack growth are the most important phases for an engineer seeking to confine defects within reasonable limits of rail imperfection (Cannon et al., 2003). This has resulted in the improvement of current rail inspection technologies to find cracks at earlier stages. While eddy current testing can be used for sensitive inspection of surface and near-surface defects in the rail head, a recent development in rail defect identification, especially for volumetric and crack-like defects in the rail head, web, and foot, is the use of ultrasonic guided waves and noncontact probing techniques. Ultrasonic guided waves offer the potential of inspecting a long length of an effective waveguide, such as a continuously welded rail (Loveday, Ramatlo, & Burger, 2016). However, the advantages of guided waves come with the difficulty of managing their complicated propagation behavior. The main issues are the multimode character of guided waves (many modes propagate simultaneously) and their dispersive character (the propagation velocity depends on the frequency). Consequently, only when guided modes are properly managed can guided waves become an effective tool for rail defect detection (Coccia et al., 2008). Also, ultrasonic inspection is incapable of detecting very small cracks before they turn into a defect, which makes the crack propagation behavior very difficult to understand. Although periodic inspections can help find defects at earlier stages and reduce their effect, a very potent way of improving the safety level of the track is to estimate the frequency of defects on the different segments of the network before they happen. This estimate can, in turn, determine the amount of work required in design and later in the inspection and maintenance program. It not only represents the mechanical condition of the track but also indicates the frequency at which inspection and testing should be done to keep tracks safe for operations and to avoid major downtime.

Comput Aided Civ Inf. 2019;1–15. wileyonlinelibrary.com/journal/mice

Assessing the potential and frequency of rail defects has mainly been studied in the literature with either data-driven or mechanics-based models (as explained in more detail in Section 2). In the present study, a new method is proposed to combine the strengths of both in a single model. An Approximate Bayesian Computation (ABC) framework, combined with inputs from a fatigue crack growth model based on Finite Element Modeling (FEM), is used to develop a hybrid physics-informed statistical model for calculating the frequency of defects in rails. A high-level schematic diagram of the computational framework is presented in Figure 1. According to this figure, we first use the mechanics modeling to determine the accumulated traffic load, known as “Million Gross Tonnage (MGT),” at which a crack would propagate to a rail defect. In the next step, assumptions are made for the prior distributions of the size and frequency of the cracks. These are the input to the simulation run by the ABC framework, which outputs the posterior distributions of the input parameters. Given the posterior distributions of the model parameters and the MGT threshold, the frequency of defects can be estimated. More details of the proposed framework are given later in the article.

The organization of the article is as follows. In Section 2, we provide a brief literature review. Section 3 explains the components of the proposed methodology. In Section 4, we implement the proposed methodology on a real-world case study and explain the steps in more detail. Results and findings of the study are provided in Section 5, and the conclusion and discussion are given in Section 6.

2 LITERATURE REVIEW

2.1 Statistical models

This class of model applies statistical analysis to observed data for defects and other failures in rails. Data are usually fitted to a statistical model to predict the failure occurrence in rails. A comprehensive review of the statistical and data-driven models for rail defects is presented in Ghofrani, He, Goverde, and Liu (2018).

FIGURE 1 Schematic of the computational framework. Steps shown: determining the MGT thresholds required for a crack of a specific size to propagate to a defect size; assuming prior distributions for size and frequency of cracks; simulation based on the ABC framework; calculating the posterior distribution for size and frequency of cracks; predicting frequency of defects. Side labels: Mechanics Modeling; Using Observed Data.

In particular, the Weibull distribution has been reported to be very successful in modeling fatigue failures (Cannon et al., 2003). It has been central to several efforts that seek to relate features of track geometry or rail material to failure probability. In the mentioned study, the authors used a logistic procedure with terms relating to track strength conditions, loading conditions, and geometric conditions to determine the probability of service failures due to broken rail.

Schafer and Barkan (2008) developed a statistical model using the same technique of logistic regression to predict the occurrence of broken rails. They also studied the economic impact of broken rail incidents and identified the most influential factors, achieving 64.7% predictive accuracy. A multiple adaptive regression model was used by Zarembski, Einbinder, and Attoh-Okine (2016) to address the impact of track geometry on the development of rail defects. It was found that rail defect life is reduced by approximately 30% when track geometry defects are present in the track.

A very popular category of data-driven models in recent years is Bayesian analysis. Bayesian analysis uses evidence regarding one random variable to update the probabilities of other variables. When the mechanism is unknown or even unknowable, examining data through the Bayesian lens has the potential to reveal the “shadow” of the data, which can then be described and made useful to various degrees. Some recent studies on the application of Bayesian analysis can be found in Castillo, Grande, Mora, Xu, and Lo (2017), Castillo, Grande, Mora, Lo, and Xu (2017), Huang and Beck (2018), Kosgodagan-Dalla Torre et al. (2017), Wang, Liu, and Ni (2018), and Yuen and Huang (2018).

With the emergence of advanced big data analytics, applications of deep neural network models have surged in recent years. To name a few, Cha and Choi (2017) proposed deep-learning-based crack damage detection using convolutional neural networks. The test results showed consistent performance even though the test images were taken under various conditions. Moreover, the performance of the proposed method was not susceptible to image quality or camera specification. Other applications of deep neural networks to rail failures can be found in Jamshidi et al. (2017) and Chen and Joffe (2017).

2.2 Fracture mechanics models

There is another class of models that is based on fracture mechanics. These models use the theory of fracture mechanics to model crack growth in rails from measured or estimated data about loading and material properties.

To model crack growth in rails, two sets of information are required, one pertaining to the geometry and loading and the other pertaining to the material properties. Geometry and loading scenarios give information about the stresses generated at a defect location. Fracture mechanics models, in turn, use material properties and stresses to calculate crack initiation and growth life in the structure (Ai, Zhang, & Wang, 2018; Zhu & Jia, 2017).

The loading complexity of rails, along with geometric complications, can be handled very well by fracture mechanics models, especially when combined with a numerical analysis technique such as FEM.

However, these models are not generally designed to take into account the probabilistic nature of fatigue failure prediction. To overcome this, one class of models combines mechanical models of fracture mechanics with a statistical modeling framework to arrive at predictive failure models. One approach takes deterministic models of fracture mechanics, either purely analytical or supplemented with numerical techniques, and reformulates the model into a probabilistic one by treating its parameters as random variables. Several works reported in the literature follow this approach. Josefson and Ringsberg (2009) presented a framework for uncertainty quantification in fatigue life prediction in welded rails. They used nonlinear Finite Element (FE) analysis and also included the effects of residual stresses in addition to service loads and material parameters. Zhu et al. developed a general probabilistic methodology for modeling damage accumulation in railway axles (Zhu, Huang, Li, Liu, & Yang, 2015). They used probabilistic S–N curves as the backbone of their development and experimentally verified the performance of their methodology on railway axles.

The other approach is to reformulate deterministic crack growth laws into stochastic differential equations using suitable random process definitions for the parameters involved. This class of models provides a much better capability to model uncertainty in random parameters and its variability in time, and focuses on capturing the physics of crack propagation in random process settings (Sobczyk, 1986; Tanaka & Tsurui, 1987).

The two classes of models described above both suffer from some limitations. Prediction of future performance based entirely on historical data is highly dependent upon the quality of the data available. Fracture mechanics models, on the other hand, must deal with complex interactions in rails that are very difficult to model exactly, and they are computationally inefficient. This has been the main motivation in the current study for incorporating the strengths of both models, as explained in the following sections.

3 METHODOLOGY

3.1 Mechanics model: FEM and crack growth

3.1.1 Railway track as assembly

A typical railway track configuration, adapted in a simulation model, is shown in Figure 2. The components of this configuration are classified in two groups: superstructure and substructure. The superstructure consists of the railway track, ties, the fastening system, and ballast. The substructure consists of the geotechnical system, including subgrade and sub-ballast. Rails are longitudinal steel members placed on equally spaced concrete or wood ties.

FIGURE 2 Railway track configuration. Labeled components: gauge side, field side, gauge, rail, tie plates, tie, ballast, sub-ballast, sub-grade, and inward cant on rail.

The vertical and horizontal loads imparted by the passing train traffic are transferred from the wheels of the rail vehicles to the ties via the rails and fastening system. The fastening system restricts unwanted movement of the rails by restraining each to the embedded ties. Ties are transverse beams made of either wood or concrete and play a significant role not only in transferring the load coming from the rails to the ballast and subgrade but also in providing a relative geometric constraint to the two rails.

3.1.2 Loading and boundary conditions

As described above, the track structure configuration is a very complex system of interactions between components. The primary source of load on the system is the wheel running on the rail. Remennikov and Kaewunruen (2008) presented a review of loading conditions in the overall track system, confined to those arising predominantly from the interaction of wheel and rail. They categorize loads into three categories: (a) static load, or the load that assumes ideal unchanging (i.e., nonmoving) conditions between wheel and rail; (b) quasi-static load, or the load that is used to capture moving conditions between wheel and rail, vehicle speed, and track geometry; and (c) impact load, or the load that quantifies the irregularities in the rail–wheel surfaces or other geometrical factors.

Among the three kinds of loads described above, variation in wheel–rail contact stress, which originates in surface irregularities, has been of central interest for many researchers. One of the primary reasons for this interest is that a very small contact patch between wheel and rail is responsible for providing the forces for steering, acceleration, and deceleration of the massive vehicle. The correct stress profile and contact force depend on many factors arising from material properties and boundary conditions, which makes this a very hard problem to solve.

A review of the various contact mechanics theories applied to the rail–wheel interaction problem can be found in the work of Meymand, Keylin, and Ahmadian (2016). They discussed various normal and tangential contact mechanics theories, along with their underlying assumptions and simplifications and the restrictions these impose on their applications. They note that experimentally verified theories are still elusive for the tangential contact problem and that, for normal contact, there is a need to develop fast and accurate techniques for experimental validation.

Resolving these details of the contact stress profile and contact force is important for the rolling contact fatigue problem. Recently, Panunzio, Puel, Cottereau, Simon, and Quost (2018) performed a sensitivity analysis of the effect of wheel–rail contact interactions on the resulting fatigue in the railway cross-section. They reported that changes in the stress profile and contact forces affect only a small region near the contact surface and the corners of the rail head, except on a tight curve.

Through various experiments conducted on railway bridges, Frýba (1996) noted that the vertical load, for statistical analysis, should be taken in the range of 180–200 kN for fully loaded freight cars and locomotives, 100 kN for passenger cars and partially loaded freight cars, and 50 kN for empty freight cars. In addition to these loads coming from rail–wheel interactions, the rails undergo mechanical loading due to temperature variation, residual stress arising from the manufacturing process or welding, etc.

3.1.3 Modeling assumptions for current study

In the present study, the focus is on crack growth in the railhead. The minute details of stress profiles due to factors such as residual stresses, and near the contact zone of wheel and rail, are not of interest, primarily because the data lack the resolution to distinguish their effect. The aim of the mechanics model is to give a stress profile that is true within error bounds resolvable through the data. Thus, some major assumptions and simplifications have been introduced in the model for ease of analysis. The rail wheel is modeled as a circular disc that imparts load onto the rails through frictional contact. Ties are modeled as constraints on the rails, with the sole purpose of keeping the rail in place and introducing periodicity in the support conditions to simulate a real-life loading profile. Although this model is expected to give inaccurate results for the stress profile at and near the contact surface and is unsuitable for surface defects, it is a reasonable simplification for our study, because we are interested in the stresses away from the contact region and well inside the rail head. The effect of the contact zone and contact stress is expected to be minimal at those locations (Panunzio et al., 2018), and the presence of a crack is expected to dominate the stress intensity factors and stress profile. In addition, the rail pad stiffness is ignored because its inclusion is expected to play an insignificant role in the quasi-static simulation that we have chosen to perform, while increasing the computational overhead.

FIGURE 3 Model of rail–wheel assembly over sleepers simulated as constraints: (a) view of the whole assembly as modeled in ABAQUS; (b) an enlarged view of the wheel interacting with the rail

The deformability of the sleeper support is also neglected. However, the track itself is allowed to deform under the applied loads. The rail section used is the standard 132RE, and the wheel diameter is 940 mm with a flange width of 19 mm, both values coinciding with a standard passenger car wheel.

We have chosen to model the rail–wheel interaction using a quasi-static FE procedure in ABAQUS. The mesh is generated using 8-noded brick elements. A load of 170,000 kg was chosen to act on each wheel, which is in accordance with the recommendations made by Frýba (1996) for the mean load to be applied for statistical analysis. The mesh size in the model is governed by consideration of load transfer through the rail–wheel frictional contact and is decided based on a convergence study. A schematic picture of the model is shown in Figure 3.

3.1.4 Crack modeling

Based on the data to be analyzed and on what has been reported in the literature, the most common cracks in the railhead are planar cracks appearing in the plane perpendicular to the traffic direction. These cracks are classified as transverse cracks. To avoid generating a new mesh for every new crack to be simulated, we applied the extended finite element method (XFEM) for defining cracks in ABAQUS. In this method, we do not need to define a physical crack in the model; the crack is defined through a surface that acts as a two-dimensional cracked domain. Moreover, the cracks are modeled as stationary cracks, and their propagation rate is estimated through the Paris law. This strategy avoids additional details such as cohesive zone modeling and simplifies the model. As our intent is not to capture the crack propagation path but to capture the growth of the crack in a statistical sense, the simplification in crack modeling is reasonable. This method gives a lot of flexibility to change the size and location of the cracks quickly for different simulations without manipulating the mesh.

Submodeling

The cracks modeled in this study are very small (a few millimeters in size) compared to the global size of the system. Thus, modeling cracks in a global model requires a prohibitive number of elements in the analysis. To overcome this issue, a submodeling approach, which is a widely used technique in fracture mechanics simulations, is adopted. In this approach, a global model is first analyzed without the presence of cracks. This model has a large element size and is efficient in producing boundary conditions for a small region of interest where the crack is located. Using this boundary information, a very fine discretization is used to analyze the vicinity of the cracks. This submodel is efficient because of the small size of the system. The two-step process is computationally efficient and yet provides high-fidelity results for the state of stress around the embedded crack under consideration. The relative size of the subdomain can be seen in Figure 4. The mesh size for the global model was chosen to be 5 mm; for the submodel, however, a much finer mesh size (0.5 mm) was deployed to calculate stress intensity factors accurately. The size of the mesh also depends on the crack dimension to be modeled, since correct XFEM crack generation requires a few nodes to be present along the smallest of the edges.

3.1.5 Crack propagation rate calculation

To model crack propagation under repeated loading, the empirical crack growth rate equation known as the Paris law is used. The Paris law, proposed by Paris and Erdogan (1963) in their pioneering work, incorporates the experimental observation that crack propagation speed increases not only with a higher stress range but also with the size of the crack itself.

FIGURE 4 (a) Location of the submodeling region in the global model; (b) view-cut of the submodel region with the rectangular surface that is used to model the XFEM crack (only the rail head is used for the submodel; the web and base of the rail are shown only to represent the location of the region)

This concept has been studied extensively since its introduction; a multitude of new variations and extensions of the original equation have been proposed and used in various applications. Newman (1998) presented a review of the work that has gone further into understanding and refining the relationship between crack growth rate and stress intensity factor range. With regard to crack propagation in rails, there have been several studies aimed at characterizing the Paris law under the influence of a varying number of factors and loading conditions (Bogdański, Stupnicki, Brown, & Cannon, 1999; Josefson & Ringsberg, 2009; Orringer et al., 1988).

$$\frac{da}{dN} = C\,(\Delta S_{\mathrm{eff}})^{m}\, a^{m/2} \tag{1}$$

Here, $a$ is the size of the crack; $N$ is the number of cycles of a repeated load; $\Delta S_{\mathrm{eff}}$ is the range of the effective stress intensity factor, calculated from the stress intensity factors reported by the FEM analysis in ABAQUS (with units $\mathrm{m}^{0.5} \times \mathrm{MPa}$); and $C$ (with units $\mathrm{m}^{0.5}/(\mathrm{cycle} \times \mathrm{MPa})$) and $m$ (with no units) are constants defined for each material (m in the units stands for meters). For the current study, we have taken material parameters for rail steel UIC grade 900A, which was used in the study by Josefson and Ringsberg (2009) under the EU project ICON (“ICON final technical report,” 2000).
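The dependence of the growth rate on crack size and stress range in Equation (1) can be illustrated with a minimal Python sketch. The function name and the demonstration stress range are assumptions made for illustration only; the default values of `C` and `m` are the ones the article reports for UIC grade 900A steel, and units follow the text (crack size in meters).

```python
def paris_growth_rate(a, delta_s_eff, C=2.0e-9, m=3.33):
    """Crack growth per cycle from Equation (1): da/dN = C * dS_eff**m * a**(m/2).

    a           : crack size in meters
    delta_s_eff : effective stress intensity factor range (units as in the text)
    C, m        : material constants (defaults: values quoted in the article)
    """
    if a <= 0 or delta_s_eff <= 0:
        raise ValueError("crack size and stress range must be positive")
    return C * delta_s_eff ** m * a ** (m / 2.0)


# Hypothetical stress range, chosen only to show the trend; not an FEM result.
DELTA_S_DEMO = 100.0

for a_mm in (0.5, 1.0, 2.0, 3.5):
    rate = paris_growth_rate(a_mm * 1e-3, DELTA_S_DEMO)
    print(f"a = {a_mm} mm -> da/dN = {rate:.3e} m/cycle")
```

In this form, doubling the crack size at a fixed stress range multiplies the rate by $2^{m/2}$, which is the size dependence noted by Paris and Erdogan.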

The values of the parameters used are as follows: $C = 2.0 \times 10^{-9}$ and $m = 3.33$. With some algebraic calculations, the Paris law can be converted to the following equation giving the number of cycles for a particular growth in size:

$$N = \frac{a_c^{\,1-m/2} - a_0^{\,1-m/2}}{C \cdot (1 - m/2)} \cdot (\Delta S_{\mathrm{eff}})^{-m} \tag{2}$$

TABLE 1 Lookup table for statistical analysis: Accumulated MGT for crack growth

Initial size (mm)    Final size (mm)
                     1       1.5     2.0     2.5     3.0     3.5     4
0.5                  73.1    155.8   179.9   194.9   195.5   196.5   197.6
1                    -       82.8    106.8   121.9   122.4   123.4   124.5
1.5                  -       -       24.0    39.1    39.7    40.6    41.7
2                    -       -       -       15.09   15.63   16.60   17.70
2.5                  -       -       -       -       0.54    1.51    2.61
3                    -       -       -       -       -       0.96    2.06
3.5                  -       -       -       -       -       -       1.10

In Equation (2), $a_c$ is the final crack size and $a_0$ is the initial crack size (in meters). The number of cycles for crack growth can then be converted to the equivalent accumulated traffic load (MGT) by multiplying it by the load from each wheel. This methodology was used to perform crack propagation studies with several initial crack sizes, and a lookup table was generated. We chose seven different crack sizes, from 0.5 to 3.5 mm in increments of 0.5 mm. The data obtained from one particular crack size are used to propagate the crack from its initial value to the next larger size for which a simulation has been performed, for example, from 2.5 to 3.0 mm. The stress intensity factors are updated as soon as the crack grows to a size for which new data are available. Following this procedure, any initial crack can be propagated to any larger crack size by using the relevant stress intensity factors as it grows. These data are consolidated in a lookup table that acts as input for the subsequent statistical analysis and is presented in Table 1.
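The stepwise procedure above can be sketched in Python as follows. This is only an illustration under stated assumptions: the stress ranges in `DEMO_DELTA_S` and the wheel load `DEMO_WHEEL_LOAD_TONNES` are hypothetical placeholders (not the article's FEM outputs or loading), the function names are invented here, and one load cycle is assumed to correspond to one wheel pass. The resulting numbers therefore do not reproduce Table 1; the sketch only shows how Equation (2) is chained from one simulated size to the next.

```python
def cycles_to_grow(a0, ac, delta_s_eff, C=2.0e-9, m=3.33):
    """Cycles needed to grow a crack from a0 to ac (meters), per Equation (2)."""
    if not 0 < a0 < ac:
        raise ValueError("need 0 < a0 < ac")
    e = 1.0 - m / 2.0
    return (ac ** e - a0 ** e) / (C * e) * delta_s_eff ** (-m)


def cycles_to_mgt(cycles, wheel_load_tonnes):
    """Accumulated traffic in million gross tons (assumes one wheel pass per cycle)."""
    return cycles * wheel_load_tonnes / 1.0e6


def build_lookup(sizes_mm, delta_s_by_size, wheel_load_tonnes):
    """Accumulated MGT for every (initial, final) size pair, one step at a time.

    The stress range is switched at each simulated size, mirroring the
    update of stress intensity factors as the crack grows.
    """
    table = {}
    for i, start in enumerate(sizes_mm[:-1]):
        total = 0.0
        for lo, hi in zip(sizes_mm[i:-1], sizes_mm[i + 1:]):
            n = cycles_to_grow(lo * 1e-3, hi * 1e-3, delta_s_by_size[lo])
            total += cycles_to_mgt(n, wheel_load_tonnes)
            table[(start, hi)] = total
    return table


SIZES_MM = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
# Hypothetical stress ranges for the seven simulated initial sizes.
DEMO_DELTA_S = {0.5: 60.0, 1.0: 65.0, 1.5: 70.0, 2.0: 75.0,
                2.5: 80.0, 3.0: 85.0, 3.5: 90.0}
DEMO_WHEEL_LOAD_TONNES = 17.0  # placeholder value, not taken from the article

lookup = build_lookup(SIZES_MM, DEMO_DELTA_S, DEMO_WHEEL_LOAD_TONNES)
print(f"0.5 mm -> 4.0 mm: {lookup[(0.5, 4.0)]:.4f} MGT (demo inputs)")
```

Because each entry is a running sum of single-step increments, the table is additive along a row, which is also the pattern visible in Table 1 (for example, 73.1 + 82.8 is approximately 155.8).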

According to existing railroad practice, we assume that the minimum crack size detectable by an ultrasonic device is 3.5 mm (Lanza di Scalea et al., 2005), around 5% of the railhead cross-section. Once detected, a crack is labeled as a rail defect.
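As a small illustration of how the lookup table and the detection threshold work together, the sketch below uses the "final size = 3.5 mm" column of Table 1 to decide whether a crack of a given initial size would have grown into a detectable defect after a given accumulated traffic load. The function names and the example list of cracks are hypothetical, and treating the table column as a simple threshold is an assumption made here for illustration.

```python
# "Final size = 3.5 mm" column of Table 1: accumulated MGT needed for a crack
# of the given initial size (mm) to reach the assumed detectable size.
MGT_TO_DETECTABLE = {0.5: 196.5, 1.0: 123.4, 1.5: 40.6,
                     2.0: 16.60, 2.5: 1.51, 3.0: 0.96}
DETECTABLE_SIZE_MM = 3.5


def becomes_defect(initial_size_mm, accumulated_mgt):
    """True if the crack reaches the detectable size under the given traffic."""
    if initial_size_mm >= DETECTABLE_SIZE_MM:
        return True
    return accumulated_mgt >= MGT_TO_DETECTABLE[initial_size_mm]


def count_defects(initial_sizes_mm, accumulated_mgt):
    """Number of cracks in a segment that would be labeled as rail defects."""
    return sum(becomes_defect(a, accumulated_mgt) for a in initial_sizes_mm)


# Hypothetical segment with five cracks and 30 MGT of accumulated traffic.
example_cracks = [0.5, 1.5, 2.0, 2.5, 3.0]
print(count_defects(example_cracks, 30.0))  # prints 3
```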

3.2 Bayesian inference

To calculate the posterior distribution of rail defects based on the recorded data, Bayes' theorem, which describes the conditional probability of a parameter $\theta$ given the data $D$, can be used as follows:

$$p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)} \tag{3}$$

In Equation (3), $p(\theta \mid D)$ refers to the posterior, $p(D \mid \theta)$ denotes the likelihood, $p(\theta)$ the prior, and $p(D)$ the evidence (which is also referred to as the prior predictive probability of the data).

Any characteristics associated with rail defects that we wish

to model (defect size, its occurrence, etc.) could be treated as

𝜃 and the relevant data recorded (defect size, occurrence time,

respectively) as D. In most of the real-life problems, 𝑝(𝐷) acts

only as a normalizing factor and is usually ignored in the analysis (Sunnåker et al., 2013). If we take the analytically tractable route, some information about the prior distribution 𝑝(𝜃) is needed, as well as the calculation of the likelihood 𝑝(𝐷|𝜃). However, barring some very simple problems, the likelihood is generally computationally expensive to evaluate. ABC is a method that gives the posterior distribution of any parameter without having to calculate the likelihood. Relying on the computational efficiency of modern simulations, the ABC framework circumvents the likelihood calculation by comparing simulated data with recorded data. This method has been successfully used in biology, specifically population genetics, where large data sets and many parameters make analytically tractable models infeasible (Beaumont, Zhang, & Balding, 2002;

Csilléry, Blum, Gaggiotti, & François, 2010; Sunnåker et al.,

2013; Tavaré, Balding, Griffiths, & Donnelly, 1997).

3.2.1 ABC rejection algorithm

The root of the ABC framework is the rejection algorithm. In its most basic form, the rejection algorithm works as follows. We start with a sample of parameter points from the prior distribution 𝑝(𝜃). Each sampled parameter point 𝜃 is passed through an evolution model to generate a simulated data set Ď. If the generated data set Ď varies significantly from the observed data set D, the parameter point 𝜃 is rejected; otherwise, it is accepted. To quantify the difference between the measured and simulated data sets, the Euclidean distance between them, denoted by 𝜌(Ď, D), is calculated, and a tolerance 𝜀 is defined for the rejection decision, so that 𝜃 is accepted when Equation (4) holds:

$$\rho(\check{D}, D) \le \varepsilon \tag{4}$$

The outcome of this process is a posterior distribution of

parameter points without having to calculate the likelihood

(Sunnåker et al., 2013).

In many cases, the data sets are very large and high-dimensional. In those cases, the Euclidean distance between Ď and D is very difficult to keep small, so an undesirably large number of sample points is rejected and the algorithm becomes less robust. To circumvent this problem, a common approach is to calculate summary statistics for each data set and compare the Euclidean distance between those instead. This adjustment results in a much more computationally efficient algorithm. It is important to note that the summary statistics should be defined such that they capture all the information about 𝜃 that is available in data set D, to reduce the error introduced in the process (Didelot, Everitt, Johansen, & Lawson, 2011).
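The rejection step with summary statistics can be sketched in a few lines of Python. This is a generic toy example under stated assumptions, not the implementation used in this study: the function `abc_rejection`, the Poisson simulator, the sample mean as summary statistic, and the tolerance value are all hypothetical choices made only to show the mechanics of Equation (4).

```python
import numpy as np


def abc_rejection(observed, simulate, prior_sample, summary, eps, n_draws, rng):
    """Keep prior draws whose simulated summary lies within eps of the observed summary."""
    s_obs = np.atleast_1d(summary(observed))
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)                       # draw a parameter point from the prior
        s_sim = np.atleast_1d(summary(simulate(theta, rng)))
        if np.linalg.norm(s_sim - s_obs) <= eps:        # acceptance rule of Equation (4)
            accepted.append(theta)
    return np.array(accepted)


rng = np.random.default_rng(0)
observed_counts = rng.poisson(4.0, size=50)             # synthetic "observed" data
posterior_draws = abc_rejection(
    observed=observed_counts,
    simulate=lambda lam, r: r.poisson(lam, size=50),    # toy simulator
    prior_sample=lambda r: r.uniform(0.0, 10.0),        # uniform prior on [0, 10]
    summary=np.mean,                                    # one-dimensional summary statistic
    eps=0.2,
    n_draws=5000,
    rng=rng,
)
```

The accepted draws in `posterior_draws` approximate the posterior of the rate; a smaller `eps` gives a closer approximation at the cost of rejecting more draws.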

This framework is adopted to find the distribution of rail

defects in a particular segment of rail by using the size of

the defect and rate of occurrence as parameter 𝜃 and known

defects identified during regular inspections as the observed

data set D.

In the provided case study, we describe the integration of

mechanics model with ABC in greater detail.

4 CASE STUDY

4.1 Data preparation

To conduct the integrated modeling, we need a set of data with different variables. Several sources of data, including track properties, rail defects, geometry defects, grinding, and inspection history, collected between 2011 and 2016, were provided by a US Class I Railroad. We processed the available data according to Figure 5 to bring them into the format required for modeling. It should be noted that our study is mainly focused on the internal rail defects that occurred on the major lanes of the network. Therefore, we first clean the data set to acquire homogeneous rail segments on the major lanes of the network. We then exclude the defects on the surface of the rail and examine the behavior of the remaining defects on the selected segments. The data processing procedure is detailed in the following steps:

• We used the “Track” properties data set to integrate the

track segments based on the spatial coordinates. The spa-

tial characteristics of the segments include segment “Pre-

fix,” “Track Type,” and “Mile Post (MP) Range.”

• To keep a viable number of rail defects for analysis, we selected track segments with a total length greater than 50 miles. This yielded 92 segments and a total of 15,000 miles of mainline track, which is sufficient for our analysis.

• Since the “Tonnage” data set includes the natural segmentation of track based on tonnage, we divided the segments selected in the previous step according to the “Tonnage” data set, so that the rail segments are homogeneous in terms of traffic load. This is necessary to obtain segments that are homogeneous in terms of both spatial coordinates and traffic for our further analysis.

• Since two different sources are used for track segmentation, a few overlapping segments might appear due to natural inconsistencies between the two data sources. All segments are checked, and any overlaps between segments are removed by assigning the overlapped portion to the segment in which it accounts for more than 50% of the length. Otherwise, when the overlapped portion is shorter than 50% of each segment, it is divided equally between the two segments. This helps in obtaining a reliable data set for our further analysis.

FIGURE 5 Data preparation for modeling purpose. [Flowchart. Input data sets: Track; Tonnage; Rail and Geometry Defects; Inspections and Grinding. Processing: selecting segments, dividing segments, removing segment overlaps, assigning a unique ID for each segment, assigning rail and geometry defects to segments, allocating inspections and grinding to segments. Output data set: an integrated data set with the required variables for modeling purpose.]

• Because the mechanistic model cannot handle very long

segments, the length of segments is checked. All the long

segments are divided into segments of 2 miles or smaller.

A unique ID is then given to each segment.

• For each segment with a unique ID, the average monthly

tonnage is calculated using “Tonnage” data set.

• For each segment with a unique ID, the history of the rail

and geometry defects, grinding and inspection through all

the years of study are matched using the detailed spatial and

temporal coordinates (“Prefix,” “Milepost Range,” “Track

Type,” and “Year”) in each data set.
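The length check in the steps above can be illustrated with a short Python sketch that splits a milepost range into pieces of at most 2 miles. The function name `split_segment` and the choice of equal-length pieces are assumptions; the text only states that long segments are divided into segments of 2 miles or smaller.

```python
import math


def split_segment(mp_start, mp_end, max_len=2.0):
    """Split a milepost range into equal pieces no longer than max_len miles."""
    length = mp_end - mp_start
    if length <= 0:
        raise ValueError("mp_end must be greater than mp_start")
    n_pieces = math.ceil(length / max_len)      # fewest pieces that respect max_len
    step = length / n_pieces
    return [(mp_start + i * step, mp_start + (i + 1) * step) for i in range(n_pieces)]


# A hypothetical 5-mile range is split into three pieces of about 1.67 miles each.
pieces = split_segment(10.0, 15.0)
```

Each resulting piece would then receive its own unique ID before tonnage, defects, inspections, and grinding records are matched to it.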

The final processed data set includes the unique ID, year,

average tonnage, frequency of rail and geometry defects,

inspection frequency, and grinding presence (output in

Figure 5) for 9,780 rail segments with more than 52,000

defects during 6 years. These are all the input for the modeling

procedure.

4.2 Integration of mechanistic and statistical model for a US Class I Railroad

The ABC framework is used to integrate the mechanistic and statistical models. The aim is to find the parameter values that minimize the error in predicting the number of defects occurring in each segment, given a 6-year period of relevant data, including the observed number of defects.

It is worth mentioning that we do not have any informa-

tion about the cracks appearing inside the rail. Therefore, we

take advantage of simulation for crack emergence inside the

rail and then we move forward in our simulation to see which

of those cracks are supposed to become defects according to

FEM output.

Before describing the steps of the main algorithm, we define three functions named FE, G, and POSTERIOR. Each function is used inside the next one (FE inside G, and G inside POSTERIOR), and all are used in the main algorithm.

Function FE: This function gets the initial crack size as

input and returns the required MGT for that crack to grow

as large as 3.5 mm (to be detected as a defect) according

to Table 1 (output of FEM). As an example, according to

Table 1, FE (0.5) and FE (1) are equal to 196.5 and 123.4

MGT, respectively.
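Function FE amounts to a table lookup, which could be sketched in Python as follows. Only the two entries quoted above (0.5 mm and 1.0 mm) are included; the remaining rows of Table 1 would be added in the same way. The names `FE_TABLE` and `fe`, and the convention of returning 0 MGT for cracks that are already 3.5 mm or larger, are assumptions made for this sketch.

```python
# Required MGT for a crack of a given initial size (mm) to reach 3.5 mm.
# Only the two values quoted in the text are listed here.
FE_TABLE = {0.5: 196.5, 1.0: 123.4}
DETECTABLE_MM = 3.5


def fe(crack_mm, table=FE_TABLE):
    """Return the MGT needed for a crack to grow to the detectable size."""
    if crack_mm >= DETECTABLE_MM:
        return 0.0                      # assumed: already detectable, no further traffic needed
    try:
        return table[crack_mm]
    except KeyError:
        raise KeyError(f"no lookup entry for initial size {crack_mm} mm") from None
```

For example, `fe(0.5)` returns 196.5 and `fe(1.0)` returns 123.4, matching the values quoted from Table 1.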

Function G: This function is used to simulate the number of defects for each segment (indexed by p), given the information of that segment. The general steps of Function G are provided in the flowchart in Figure 6. According to this figure, Function G takes the information of a segment as input. The number of cracks in the segment, which is unknown, is drawn from a Poisson distribution with parameter 𝜆p; the Poisson distribution is a very common choice for arrival rates in the literature (Tonge & Ramesh, 2016).

As mentioned before, the minimum crack size required to be detected as a defect by ultrasound is assumed to be 3.5 mm. Accordingly, the size of each crack is drawn from a uniform distribution with lower bound 0.5 mm and upper bound 5 mm, first to make sure that the size range is adequate and second to cover 3.5 mm within the range. Having the MGT of each segment in each year and using the FE function, Function G checks whether each of the drawn cracks would turn into a defect (reach 3.5 mm or more in size) after a certain time (in years). The function ultimately returns the simulated number of defects for each segment. More details on the implementation of Function G are given in the pseudocode in Algorithm 1. In this pseudocode, T is the time in years, n_cracks is the number of cracks, n_defects is the number of defects, and MGTp is the annual MGT for segment p.

FIGURE 6 General steps in Function G for calculating the simulated number of defects for each segment. [Flowchart: the function gets segment information as input; it draws the number of cracks for each segment from a Poisson distribution with parameter λp; it draws the size of each crack from a discrete uniform distribution; it checks whether the drawn cracks would turn into a defect (using MGTp and the FE function); it calculates the number of cracks that turn into defects; the function returns the simulated number of defects for each segment.]

Algorithm 1. Simulating number of defects

Function G(p)  # given the data related to segment p, simulate the number of defects
    define cracks as a list of size T
    define n_defects as a list of size T initialized with 0
    For t in 1 to T do
        if t > 1
            for crack in cracks[t-1] do
                if t*MGTp > FE(crack)
                    n_defects[t] = n_defects[t] + 1
                    remove crack from cracks[t-1]
                end
            end
        end
        n_cracks ~ Poisson(λp)
        For i in 1 to n_cracks do
            cracks[t][i] ~ DiscreteUniform(0.5, 5)
            if MGTp > FE(cracks[t][i]) or cracks[t][i] >= 3.5
                n_defects[t] = n_defects[t] + 1
            end
        end
    end
    return n_defects
end
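One possible Python reading of Algorithm 1 is sketched below. It is not the authors' code and departs from the pseudocode in two labeled ways: each undetected crack is kept until it becomes a defect, and its traffic is accumulated from its own year of emergence. The function name `simulate_defects`, the 0.5 mm grid for the discrete uniform sizes, and the stand-in `demo_fe` (which replaces the Table 1 lookup with a made-up linear rule) are all assumptions.

```python
import numpy as np

CRACK_SIZES_MM = np.arange(0.5, 5.01, 0.5)     # assumed grid: 0.5, 1.0, ..., 5.0 mm


def simulate_defects(lam, mgt_annual, n_years, fe, rng):
    """Simulate yearly defect counts for one segment (a variant of Algorithm 1)."""
    n_defects = np.zeros(n_years, dtype=int)
    open_cracks = []                            # (size_mm, year_of_emergence)
    for t in range(n_years):
        still_open = []
        for size, born in open_cracks:
            # traffic accumulated since the crack emerged, including year t
            if (t - born + 1) * mgt_annual > fe(size):
                n_defects[t] += 1               # crack has grown into a defect
            else:
                still_open.append((size, born))
        open_cracks = still_open
        for size in rng.choice(CRACK_SIZES_MM, size=rng.poisson(lam)):
            if size >= 3.5 or mgt_annual > fe(size):
                n_defects[t] += 1               # detectable within its first year
            else:
                open_cracks.append((size, t))
    return n_defects


rng = np.random.default_rng(1)
demo_fe = lambda size_mm: max(0.0, 3.5 - size_mm) * 60.0   # hypothetical stand-in for Table 1
counts = simulate_defects(lam=2.0, mgt_annual=40.0, n_years=6, fe=demo_fe, rng=rng)
```

Passing `fe` as an argument keeps the simulator independent of how the lookup table is stored, so the same function can be reused once the full Table 1 is available.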

Function POSTERIOR: This function applies the logic of the ABC framework to estimate the posterior distribution of the crack arrival rate (𝜆) for each segment, given the observed number of defects and the cumulative MGT. 𝜆p is the Poisson distribution parameter for segment p. A schematic of the POSTERIOR function is shown in Figure 7.

Since we started with no data on past experience of internal rail cracks, we assume a noninformative (uniform) prior distribution for the model parameter 𝜆 (Chatterjee & Modarres, 2012). In this function, 𝜆 is drawn from a uniform prior distribution with lower and upper bounds of 0 and 10, respectively. Considering that the average rate of defects for each segment in our study is almost one defect per mile per year, a discrete uniform distribution of the crack rate between 0 and 10 is conservative enough for the purpose of our simulation.

We run a series of M simulations (M = 1,000 in our study) for each segment by drawing parameter values from the prior distribution. For each simulation, we use Function G to calculate the simulated number of defects, and then the distance between the simulated and observed number of defects is computed. According to the ABC framework, the simulation runs with distances over a threshold (𝜀) are rejected, where 𝜀 is set as the 10th percentile of all distances. In other words, only the 10% of simulation runs with the lowest distances are kept for each segment. The distribution of 𝜆 over the kept simulations is taken as the approximate posterior distribution of the model parameter, and its mean is used as the estimate of 𝜆 for the segment. The selected 𝜆 distribution based on the rejection algorithm for two sample segments of our study is provided in Figure 8. More algorithmic details for the POSTERIOR function are provided in Algorithm 2.

Algorithm 2. Calculating posterior distribution of 𝜆
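The POSTERIOR steps described above (uniform prior on 𝜆, M = 1,000 simulations, keeping the 10% of runs with the lowest distances) can be sketched in Python as follows. This is a minimal sketch and not the authors' Algorithm 2: the function name `posterior_lambda`, the toy simulator standing in for Function G, and the made-up observed counts are all assumptions.

```python
import numpy as np


def posterior_lambda(observed, simulate, rng, n_sims=1000, lam_max=10.0, keep=0.10):
    """ABC rejection for the crack arrival rate of one segment."""
    lams = rng.uniform(0.0, lam_max, size=n_sims)             # draws from the uniform prior
    dists = np.array(
        [np.linalg.norm(simulate(lam, rng) - observed) for lam in lams]
    )                                                         # Euclidean distance per run
    eps = np.quantile(dists, keep)                            # tolerance from the distances
    accepted = lams[dists <= eps]                             # the closest runs are kept
    return accepted, accepted.mean()


rng = np.random.default_rng(2)
toy_simulate = lambda lam, r: r.poisson(lam, size=6)   # hypothetical stand-in for Function G
observed_defects = np.array([1, 0, 2, 1, 1, 0])        # made-up yearly defect counts
accepted, lam_hat = posterior_lambda(observed_defects, toy_simulate, rng)
```

Here `accepted` plays the role of the approximate posterior sample of 𝜆 for the segment and `lam_hat` is its mean; ties in the distances can make the kept share slightly larger than 10%.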

Main algorithm: After defining each of the mentioned

functions, we can now explain the details on the steps of the

main designed algorithm. To do so, we first introduce notation

of the parameters and variables of the model as follows:

M: Number of trials
T: Total time in years
K: Number of folds
MGTp: Annual MGT for segment p
Weightp: Rail weight for segment p
Speedp: Speed limit for segment p
Geo_Defp: Number of geometry defects per year for segment p
Inspectionp: Frequency of inspection in each year for segment p
Grindingp: Frequency of grinding in each year for segment p
n_defectsp: Array of length T storing the number of defects per year for segment p
DISTANCE(n_defects, n̂_defects): Euclidean distance between the observed number of defects (n_defects) and the simulated number of defects (n̂_defects)

FIGURE 7 General steps in Function POSTERIOR for calculating the simulated number of defects for each segment. [Schematic: (1) the summary statistic (n_defects) is taken from the observational data; (2) n simulations are performed for each segment by drawing parameter values λ1, λ2, λ3, … from the prior distribution of the model parameter, assumed to be a discrete uniform distribution; (3) the summary statistic (n̂_defects_i) is computed for each simulation using Function G; (4) based on the distance and a tolerance, ρ(n̂_defects_i, n_defects) ≤ ε, each simulation is kept or rejected according to the closeness of its prediction to the observed data; (5) the posterior distribution of λ is approximated by the distribution of the parameter values λ of the accepted simulations.]

FIGURE 8 Posterior distribution of 𝜆 for two sample segments

The general steps of the main algorithm are given in

Figure 9 and more details are provided in Algorithm 3.

Algorithm 3. ABC framework

To conduct the main algorithm, we first divide the data set into three folds for threefold cross-validation, so that each fold serves as training data twice and as test data once. Given the observed number of defects per mile and the MGT of each segment in the training data set, the number of cracks per mile (𝜆p) for each segment is estimated using the POSTERIOR function. The vector of variables Xp includes six variables: (a) the average annual MGT of the segment (MGTp), (b) the weight of the rail in the segment (Weightp), (c) the freight speed limit in the segment (Speedp), (d) the number of geometry defects per year for the segment (Geo_Defp), (e) the frequency of inspection in each year for the segment (Inspectionp), and (f) the presence of grinding in each year for the segment (Grindingp).

These variables are used to fit a log-linear regression model on the training data set, which yields a coefficient estimate for each variable. With the fitted coefficients, 𝜆p can be predicted for the test data set. The predicted 𝜆p values are used in Function G to predict the number of defects in each segment of the test data set. The functions and simulation runs were all implemented in Python 3. We then compare the predicted values with the observed data to validate the model.
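The regression and cross-validation steps can be sketched as follows, assuming that the log-linear model takes the form log(𝜆p) = b0 + Xp·b and is fitted by ordinary least squares; the exact specification is not given in the text. The helper names and the synthetic, noise-free data are hypothetical and serve only to show the data flow between folds.

```python
import numpy as np


def fit_log_linear(X, lam):
    """Least-squares fit of log(lam) = b0 + X @ b (requires lam > 0)."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, np.log(lam), rcond=None)
    return coef


def predict_lambda(coef, X):
    """Predict lam from fitted coefficients by inverting the log transform."""
    design = np.column_stack([np.ones(len(X)), X])
    return np.exp(design @ coef)


def k_fold_indices(n, k, rng):
    """Yield (train_idx, test_idx) pairs so that each fold is the test set once."""
    folds = np.array_split(rng.permutation(n), k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]


rng = np.random.default_rng(3)
X_demo = rng.normal(size=(300, 6))                          # six synthetic covariates
true_coef = np.array([0.2, 0.3, -0.1, 0.05, 0.1, 0.0, -0.2])
lam_demo = np.exp(true_coef[0] + X_demo @ true_coef[1:])    # noise-free synthetic rates
for train_idx, test_idx in k_fold_indices(len(X_demo), 3, rng):
    coef = fit_log_linear(X_demo[train_idx], lam_demo[train_idx])
    lam_pred = predict_lambda(coef, X_demo[test_idx])
```

In the study's workflow, the training 𝜆p values would come from the POSTERIOR function and the predicted values for each test fold would then be passed to Function G.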

5 RESULTS AND FINDINGS

As mentioned before, we have conducted our proposed

approach for data collected from a Class I US Railroad

between 2011 and 2016.

The prediction accuracy of the model is evaluated by comparing the predicted (n̂_defects) and actual (n_defects) number of defects in the test data sets (by threefold cross-validation, as explained before), using two measures: the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE), which are formulated as:

$$\mathrm{MAE} = \frac{\sum_{i=1}^{N} \left| \hat{n}_{\mathrm{defects},i} - n_{\mathrm{defects},i} \right|}{N} \tag{5}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( \hat{n}_{\mathrm{defects},i} - n_{\mathrm{defects},i} \right)^{2}}{N}} \tag{6}$$

where N is the number of segments in the test data set, and n̂_defects,i and n_defects,i refer to the predicted and observed number of defects for the ith segment in the test data set, respectively. These two measures are among the most common validation measures when predicting numeric variables (Willmott, 1982).
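Equations (5) and (6) translate directly into a few lines of Python. The function names and the small example arrays below are illustrative only and are not data from the study.

```python
import numpy as np


def mae(pred, obs):
    """Mean Absolute Error, Equation (5)."""
    pred, obs = np.asarray(pred, dtype=float), np.asarray(obs, dtype=float)
    return np.mean(np.abs(pred - obs))


def rmse(pred, obs):
    """Root Mean Square Error, Equation (6)."""
    pred, obs = np.asarray(pred, dtype=float), np.asarray(obs, dtype=float)
    return np.sqrt(np.mean((pred - obs) ** 2))


pred_demo = [1.0, 0.0, 2.0, 4.0]     # made-up predicted defect counts
obs_demo = [1.0, 1.0, 2.0, 0.0]      # made-up observed defect counts
mae_demo = mae(pred_demo, obs_demo)      # 1.25
rmse_demo = rmse(pred_demo, obs_demo)    # about 2.06
```

In this toy example the single large error (4 versus 0) raises RMSE well above MAE, which illustrates why the two measures can rank models differently.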

A discussion on the superiority of MAE over RMSE is presented in Willmott and Matsuura (2005). Since RMSE is based on the sum of squared errors, it does not describe average error alone, it tends to become increasingly larger than MAE, and its interpretation tends to be more difficult. On this basis, MAE is usually considered a better measure of error (MOE) than RMSE and other MOEs.

We have compared the results of our model with Negative

Binomial (NB) model which is a well-known traditional

statistical model for frequency prediction (Washington,

Karlaftis, & Mannering, 2010). The results of both models

are provided in Table 2.

The average predicted and observed numbers of defects are shown in Table 2. In terms of MAE and RMSE, the proposed method improves on the NB regression model by 20% and 16%, respectively, in predicting the expected number of defects on each segment.
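The quoted improvements can be checked directly from the MAE and RMSE values reported in Table 2 (0.68 versus 0.85, and 1.11 versus 1.32), assuming the improvement is measured relative to the NB model:

$$\frac{0.85 - 0.68}{0.85} = 0.20, \qquad \frac{1.32 - 1.11}{1.32} \approx 0.16$$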

FIGURE 9 General steps in the main algorithm for predicting number of defects. [Flowchart: partition the data set into train and test sets; for the segments in the train set, calculate the posterior distribution of λ using the POSTERIOR function; fit a log-linear regression model on the train segments to determine the coefficient of each variable; use the computed coefficients to predict λ for the test data set; compute the simulated number of defects using Function G; calculate the average difference between simulated and observed defects over all segments as the error metric.]

TABLE 2 Results of the proposed model compared to the results of the negative binomial model

Item                                                 Proposed model   Negative binomial model
Average predicted no. of defects (annual per mile)   0.85             0.88
Average real no. of defects (annual per mile)        0.89             0.89
MAE                                                  0.68             0.85
RMSE                                                 1.11             1.32
Number of segments in test data set                  3,260

FIGURE 10 Distribution of the absolute error by the proposed model as well as NB model on segments of test data set. [Plot: absolute error (vertical axis) versus segment number (horizontal axis) for two series, Proposed_Model and NB_Model.]

The distribution of the prediction error over the segments of the test data set for both the NB model and our proposed model is presented in Figure 10.

As seen in Figure 10, for almost all of the samples in the test data set, the absolute error of the proposed model is less than that of the NB model. This is justifiable, as the proposed methodology incorporates both the data-driven and mechanics-based behavior of the rail cracks.

TABLE 3 Results of negative binomial model

Variable                    Estimate   Z_value   Pr(>|z|)
(Intercept)                 2.978      16.489    0.000
Annual MGT                  0.002      4.336     0.000
Speed                       −0.003     −3.18     0.000
Weight                      −0.013     −9.169    0.000
Count of Geometry Defects   0.008      8.289     0.000
Frequency of Inspection     0.012      3.415     0.000
Presence of Grinding        −0.046     −17.384   0.000

We also provide the results of the NB model in Table 3. According to these results, all the variables are statistically significant at the 95% level. The sign of the coefficient for MGT is positive, which implies that an increase in MGT is associated with an increase in the total number of defects on a segment.

This is understandable because, as the traffic load on the rail increases, more defects/cracks are prone to appear on the rail track segments. The train speed limit has a negative coefficient, implying that the higher the speed limit, the lower the total number of defects occurring in a segment. This could be justified, as a higher speed is usually associated with a higher quality of the rail segment due to the tighter tolerances required for the associated track classification. The same reasoning applies to the negative sign of the rail weight: the heavier the rail, the better the quality and the fewer defects are prone to appear. The positive sign for geometry defects indicates that the presence of geometry defects on the track segments contributes to the occurrence of rail defects.

The same conclusion has been reached by other studies in

the literature such as the one provided by Zarembski et al.

(2016).

Although the sign for the frequency of inspection implies that this variable is associated with more defect occurrence, in the real world segments with more frequent defects are usually inspected more frequently by railway agencies.

That is the main reason why the frequency of inspection appears to be associated with more defect occurrences. As the last variable in this study, the presence of grinding can significantly reduce the occurrence of rail defects. Planning for grinding is therefore of utmost importance to maintain an acceptable level of safety.
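To give a sense of the magnitudes behind these signs, the coefficients in Table 3 can be converted to multiplicative effects. The sketch below assumes that the NB model uses the usual log link, so that exp(coefficient) is the factor by which the expected defect count changes per one-unit increase in a variable; the link function and the units of each variable are not stated in this section, so the numbers are indicative only.

```python
import math

# Coefficient estimates copied from Table 3 (intercept omitted).
nb_coefficients = {
    "Annual MGT": 0.002,
    "Speed": -0.003,
    "Weight": -0.013,
    "Count of Geometry Defects": 0.008,
    "Frequency of Inspection": 0.012,
    "Presence of Grinding": -0.046,
}

# Assumed log link: percent change in expected defects per one-unit increase.
percent_change = {
    name: (math.exp(b) - 1.0) * 100.0 for name, b in nb_coefficients.items()
}

for name, pct in percent_change.items():
    print(f"{name}: {pct:+.2f}%")
```

Under this assumption, the presence of grinding corresponds to roughly 4.5% fewer expected defects, while each additional unit of annual MGT corresponds to about 0.2% more.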

6 CONCLUDING REMARKS AND DISCUSSION

In this article, a physics-based data-driven ensemble modeling approach for predicting the frequency of rail defects is proposed. The proposed model is obtained by conducting a series of simulations based on the ABC framework, using the FEM results on crack initiation and propagation.

The proposed model is validated by applying the trained model to a test data set. The final results of the proposed model are also compared with the NB model for estimating the frequency of defects on each rail segment. It is found that the proposed model decreases MAE by 20% compared to the NB model. This indicates that incorporating the physics-based behavior of the railway track on a segment leads to a better estimation of the probable occurrence of defects. Regarding railroad applications, rail defect frequency is part of the scoring system that railroads use to calculate rail quality and determine rail renewal for the next year. It can help identify the black spots in the rail track network and prioritize their correction. Therefore, the outcome of this article can be used to guide capital planning decisions for railroads. At the same time, the current study serves as a proof of concept. Several improvements are desired in both modeling and data collection to make the proposed framework a truly potent tool in the future.

The FEM deployed in the current study is a very simplified quasi-static model. This choice of model is governed by the quality of the data that were available for analysis and

validation. Also, among several different types of defects

that can occur in a railway track assembly, only symmetric

transverse crack inside the rail head and away from the

contact surfaces are modeled in the current study. As more

detailed data with better resolution about the crack location

in the track and inside the rail cross-section is recorded and

made available, a more refined FEM can be deployed. The effects of surface irregularity, such as dynamic impact loads and contact stresses; of track geometry, such as grade and curvature; and of subgrade conditions can be included in the FE model to broaden the domain of its application. Dynamic effects are neglected in the present study, which simplifies the FE model to a great extent. Surface stresses at

the contact of wheel and rail are not resolved in our simplified

model of uniform frictional contact and thus it is unsuitable

for study of defects such as corrugation and surface burns. A

detailed wheel–rail contact model can be deployed to resolve

those stresses. Lateral loading of tracks at curves is also

not considered in the present model. It can be incorporated

through detailed geometric model of wheel and its interaction

with rail track under dynamic lateral loads. Thermal boundary conditions such as low/high temperature can affect crack propagation to a great extent. A more sophisticated model for statistical thermal loading is desired to capture its effect on stresses in the FE model and on crack propagation rates.

On the data-driven side, the model performance is expected to improve if more data become available over a longer period of time. If available, variables such as heterogeneity of traffic, number of emergency braking stops, and number of vehicles per day per track segment could also be considered in the analysis of crack propagation behavior. Moreover, more advanced models could be used inside the main algorithm (rather than log-linear regression) to account for the effect of historical data of the variables.

ACKNOWLEDGMENTS

This study was funded by FRA under contract No. DTFR5317C00003. The data were provided by CSX. The authors would like to express their sincere thanks for the support from FRA and CSX.

REFERENCES

Ai, C., Zhang, A., & Wang, K. C. P. (2018). A nonballasted rail track

slab crack identification method using a level-set-based active con-

tour model. Computer-Aided Civil and Infrastructure Engineering,

33, 571–584.

Beaumont, M. A., Zhang, W., & Balding, D. J. (2002). Approxi-

mate Bayesian computation in population genetics. Genetics, 162(4),

2025–2035.

Benedetto, A., Ciampoli, L. B., Brancadoro, M. G., Alani, A. M., & Tosti, F. (2018). A computer-aided model for the simulation of railway ballast by random sequential adsorption process. Computer-Aided Civil and Infrastructure Engineering, 33, 243–257.

Bogdański, S., Stupnicki, J., Brown, M. W., & Cannon, D. F. (1999). A

two dimensional analysis of mixed-mode rolling contact fatigue crack

growth in rails. In European Structural Integrity Society (Vol. 25,

pp. 235–248). Elsevier. https://www.sciencedirect.com/science/

article/pii/S1566136999800181

Cannon, D. F., Edel, K. O., Grassie, S. L., & Sawley, K. (2003). Rail defects: An overview. Fatigue and Fracture of Engineering Materials and Structures, 26(10), 865–886.

Castillo, E., Grande, Z., Mora, E., Lo, H. K., & Xu, X. (2017). Complexity reduction and sensitivity analysis in road probabilistic safety assessment Bayesian network models. Computer-Aided Civil and Infrastructure Engineering, 32, 546–561.


Castillo, E., Grande, Z., Mora, E., Xu, X., & Lo, H. K. (2017). Proactive, backward analysis and learning in road probabilistic Bayesian network models. Computer-Aided Civil and Infrastructure Engineering, 32, 820–835.

Cha, Y., & Choi, W. (2017). Deep learning-based crack damage detection using convolutional neural networks. Computer-Aided Civil and Infrastructure Engineering, 32, 361–378.

Chatterjee, K., & Modarres, M. (2012). A probabilistic physics-of-failure

approach to prediction of steam generator tube rupture frequency.

Nuclear Science and Engineering, 170(2), 136–150.

Chen, F., & Joffe, C. (2017). A texture-based video processing methodology using Bayesian data fusion for autonomous crack detection on metallic surfaces. Computer-Aided Civil and Infrastructure Engineering, 32, 271–287.

Coccia, S., Bartoli, I., Salamone, S., Phillips, R., Lanza, F., Fateh, M., &

Carr, G. (2008). Noncontact ultrasonic guided wave detection of rail

defects. Transportation Research Record, 2117(1), 77–84.

Csilléry, K., Blum, M. G. B., Gaggiotti, O. E., & François, O. (2010). Approximate Bayesian computation (ABC) in practice. Trends in Ecology & Evolution, 25(7), 410–418.

Didelot, X., Everitt, R. G., Johansen, A. M., & Lawson, D. J. (2011).

Likelihood-free estimation of model evidence. Bayesian Analysis,

6(1), 49–76.

Frýba, L. (1996). Dynamics of railway bridges. London: Thomas Telford

Publishing.

Ghofrani, F., He, Q., Goverde, R. M. P., & Liu, X. (2018). Recent applications of big data analytics in railway transportation systems: A survey. Transportation Research Part C, 90(January), 226–246.

Huang, Y., & Beck, J. L. (2018). Full Gibbs sampling procedure for Bayesian system identification incorporating sparse Bayesian learning with automatic relevance determination. Computer-Aided Civil and Infrastructure Engineering, 33, 712–730.

Jamshidi, A., Faghih-Roohi, S., Hajizadeh, S., Núñez, A., Babuska, R.,

Dollevoet, R.,…De Schutter, B. (2017). A big data analysis approach

for rail failure risk assessment. Risk Analysis, 37(8), 1495–1507.

Josefson, B. L., & Ringsberg, J. W. (2009). Assessment of uncertainties

in life prediction of fatigue crack initiation and propagation in welded

rails. International Journal of Fatigue, 31(8–9), 1413–1421.

Kosgodagan-Dalla Torre, A. K., Yeung, T. G., Morales-Nápoles,

O., Castanier, B., Maljaars, J. & Courage, W. (2017). A two-

dimension dynamic Bayesian network for large-scale degra-

dation modeling with an application to a bridges network.

Computer-Aided Civil and Infrastructure Engineering, 32(8), 641–

656.

Lanza di Scalea, F., Rizzo, P., Coccia, S., Bartoli, I., Fateh, M., Viola, E.,

& Pascale, G. (2005). Non-contact ultrasonic inspection of rails and

signal processing for automatic defect detection and classification.

Insight-Non-Destructive Testing and Condition Monitoring, 47(6),

346–353.

Loveday, P., Ramatlo, D., & Burger, F. (2016). Monitoring of rail

track using guided wave ultrasound. 19th World Conference on Non-Destructive Testing 2016 Monitoring, 1–8.

Meymand, S. Z., Keylin, A., & Ahmadian, M. (2016). A survey of wheel–rail contact models for rail vehicles. Vehicle System Dynamics, 54(3), 386–428.

Mohammadi, R., He, Q., Ghofrani, F., Pathak, A., & Aref, A. (2019).

Exploring the impact of foot-by-foot track geometry on the occur-

rence of rail defects. Transportation Research Part C, 102(March),

153–172.

Newman, J. C., Jr. (1998). The merging of fatigue and fracture mechanics

concepts: A historical perspective. Progress in Aerospace Sciences,

34(5–6), 347–390.

Orringer, O., Tang, Y. H., Gordon, J. E., Jeong, D. Y., Morris, J. M., & Perlman, A. B. (1988). Crack propagation life of detail fractures in rails. United States: Federal Railroad Administration.

Panunzio, A. M., Puel, G., Cottereau, R., Simon, S., & Quost, X. (2018).

Sensitivity of the wheel–rail contact interactions and Dang Van

Fatigue Index in the rail with respect to irregularities of the track

geometry. Vehicle System Dynamics, 56(11), 1768–1795.

Paris, P., & Erdogan, F. (1963). A critical analysis of crack propagation

laws. Journal of Basic Engineering, 85(4), 528–533.

Remennikov, A. M., & Kaewunruen, S. (2008). A review of loading

conditions for railway track structures due to train and track vertical

interaction. Structural Control and Health Monitoring, 15(2), 207–

234.

Sadeghi, J. M., & Askarinejad, H. (2011). Development of track condition assessment model based on visual inspection. Structure and Infrastructure Engineering, 7(12), 895–905.

Schafer, D. H., & Barkan, C. P. L. (2008). A prediction model for broken rails and an analysis of their economic impact. Proceedings of the American Railway Engineering and Maintenance-of-Way Association Annual Conference (AREMA), Salt Lake City, UT.

Sobczyk, K. (1986). Modelling of random fatigue crack growth. Engineering Fracture Mechanics, 24(4), 609–623.

Sunnåker, M., Busetto, A. G., Numminen, E., Corander, J., Foll, M., & Dessimoz, C. (2013). Approximate Bayesian computation. PLoS Computational Biology, 9(1), e1002803.

Tanaka, H., & Tsurui, A. (1987). Reliability degradation of structural components in the process of fatigue crack propagation under stationary random loading. Engineering Fracture Mechanics, 27(5), 501–516.

Tavaré, S., Balding, D. J., Griffiths, R. C., & Donnelly, P. (1997). Inferring coalescence times from DNA sequence data. Genetics, 145(2), 505–518.

Tonge, A. L., & Ramesh, K. T. (2016). Multi-scale defect interactions in high-rate brittle material failure. Part I: Model formulation and application to ALON. Journal of the Mechanics and Physics of Solids, 86, 117–149.

Wang, J., Liu, X., & Ni, Y. (2018). A Bayesian probabilistic approach for acoustic emission-based rail condition assessment. Computer-Aided Civil and Infrastructure Engineering, 33, 21–34.

Washington, S. P., Karlaftis, M. G., & Mannering, F. (2010). Statistical and econometric methods for transportation data analysis. Boca Raton, FL: CRC Press.

Willmott, C. J. (1982). Some comments on the evaluation of model performance. Bulletin of the American Meteorological Society, 63(11), 1309–1313.

Willmott, C. J., & Matsuura, K. (2005). Advantages of the mean absolute

error (MAE) over the root mean square error (RMSE) in assessing

average model performance. Climate Research, 30(1), 79–82.

Xie, S., Lei, C., & Ouyang, Y. (2018). A customized hybrid approach to infrastructure maintenance scheduling in railroad networks under variable productivities. Computer-Aided Civil and Infrastructure Engineering, 33, 815–832.

Xu, P., Sun, Q., Liu, R., Souleyrette, R. R., & Wang, F. (2015).

Optimizing the alignment of inspection data from track geometry

cars. Computer-Aided Civil and Infrastructure Engineering, 30(1),

19–35.

Yuen, K., & Huang, K. (2018). Identifiability-enhanced Bayesian frequency-domain substructure identification. Computer-Aided Civil and Infrastructure Engineering, 33, 800–812.

Zarembski, A. M., Einbinder, D., & Attoh-Okine, N. (2016). Using multiple adaptive regression to address the impact of track geometry on development of rail defects. Construction and Building Materials, 127, 546–555.

Zhu, L., & Jia, M. (2017). Estimation study of structure crack propagation under random load based on multiple factors correction. Journal of the Brazilian Society of Mechanical Sciences and Engineering, 39(3), 681–693.

Zhu, S.-P., Huang, H.-Z., Li, Y., Liu, Y., & Yang, Y. (2015). Probabilistic modeling of damage accumulation for time-dependent fatigue reliability analysis of railway axle steels. Proceedings of the Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit, 229(1), 23–33.

How to cite this article: Ghofrani F, Pathak A, Mohammadi R, Aref A, He Q. Predicting rail defect frequency: An integrated approach using fatigue modeling and data analytics. Comput Aided Civ Inf. 2019;1–15. https://doi.org/10.1111/mice.12453
